datalab-to/ocr:909c96fc
Input schema
The fields you can use to run this model with an API. If you don’t give a value for a field, its default value will be used.
| Field | Type | Default value | Description |
|---|---|---|---|
| file | string | | Input file. Must be one of: .pdf, .doc, .docx, .ppt, .pptx, .png, .jpg, .jpeg, .webp |
| max_pages | integer (min: 1) | | Maximum number of pages to process. Cannot be specified if page_range is set; these parameters are mutually exclusive |
| visualize | boolean | False | Draw red polygons on the input image(s) to visualize detected text regions and return the annotated images |
| page_range | string | | Page range to parse, comma separated like 0,5-10,20. Example: '0,2-4' will process pages 0, 2, 3, and 4. Cannot be specified if max_pages is set; these parameters are mutually exclusive |
| skip_cache | boolean | False | Bypass the server-side cache and force re-processing. By default, identical requests are cached to save time and cost. Enable this to get fresh results |
| return_text | boolean | False | Return extracted text as a single string with all text lines concatenated. Each line is separated by a newline character |
| return_pages | boolean | True | Return detailed page information including text lines, bounding boxes, polygons, and character-level data. When disabled, only text and page_count will be returned |
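The constraints in the table (the page_range syntax, and max_pages and page_range being mutually exclusive) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the model's API: the helper names `expand_page_range` and `build_ocr_input` and the `KNOWN_FLAGS` tuple are hypothetical, and the rejection of descending ranges such as '4-2' is an assumption about how the server would treat them.

```python
from typing import Any, Dict, List, Optional

# Boolean fields listed in the input schema table.
KNOWN_FLAGS = ("visualize", "skip_cache", "return_text", "return_pages")


def expand_page_range(page_range: str) -> List[int]:
    """Expand a string like '0,2-4' into the page list [0, 2, 3, 4]."""
    pages: List[int] = []
    for part in page_range.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_s, end_s = part.split("-", 1)
            start, end = int(start_s), int(end_s)
            if end < start:
                # Assumption: descending ranges are treated as invalid.
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


def build_ocr_input(
    file: str,
    max_pages: Optional[int] = None,
    page_range: Optional[str] = None,
    **flags: bool,
) -> Dict[str, Any]:
    """Build an input dict, enforcing the constraints stated in the table."""
    if max_pages is not None and page_range is not None:
        raise ValueError("max_pages and page_range are mutually exclusive")
    if max_pages is not None and max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    unknown = set(flags) - set(KNOWN_FLAGS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload: Dict[str, Any] = {"file": file}
    if max_pages is not None:
        payload["max_pages"] = max_pages
    if page_range is not None:
        expand_page_range(page_range)  # raises ValueError if malformed
        payload["page_range"] = page_range
    payload.update(flags)  # omitted fields fall back to server defaults
    return payload


if __name__ == "__main__":
    print(expand_page_range("0,2-4"))
    print(build_ocr_input("report.pdf", page_range="0,2-4", return_text=True))
```

Fields that are left out of the dict are simply not sent, so the defaults from the table apply. The resulting dict is what would be supplied as the model's input.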
Output schema
The shape of the response you’ll get when you run this model with an API.
Schema
{
    'description': 'OCR output with optional text, pages, and visualizations',
    'properties': {
        'page_count': {
            'description': 'Total number of pages processed. Only returned if return_pages=true'
        },
        'pages': {
            'description': 'List of pages with detailed OCR information. Only returned if return_pages=true. Each page contains: page (number), image_bbox (coordinates), and text_lines (list of detected text). Each text_line has: text (string), confidence (0-1), bbox ([x1,y1,x2,y2]), polygon ([[x1,y1],[x2,y2],[x3,y3],[x4,y4]]), and chars (character-level data with text, bbox, polygon, confidence, bbox_valid)'
        },
        'text': {
            'description': 'Extracted text as a single string with all text lines concatenated, separated by newlines. Only returned if return_text=true'
        },
        'visualizations': {
            'description': 'List of images with red polygons drawn around detected text regions. Only returned if visualize=true'
        }
    },
    'title': 'OCROutput',
    'type': 'object'
}
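Because every property in the output is optional, code that consumes a response should tolerate missing keys. The sketch below walks the pages and text_lines structure described in the schema and keeps only lines above a confidence threshold. The function name `confident_lines` is hypothetical, and `sample_output` is a hand-written illustration shaped like the schema, not real model output; the four-number form of image_bbox and the zero-based page number are assumptions.

```python
from typing import Any, Dict, List, Tuple

# Hand-written sample shaped like the schema above; NOT real model output.
sample_output: Dict[str, Any] = {
    "page_count": 1,
    "text": "Invoice 42\nTotal due",
    "pages": [
        {
            "page": 0,  # assumption: zero-based page number
            "image_bbox": [0, 0, 800, 600],  # assumption: [x1, y1, x2, y2]
            "text_lines": [
                {
                    "text": "Invoice 42",
                    "confidence": 0.98,
                    "bbox": [10, 10, 200, 40],
                    "polygon": [[10, 10], [200, 10], [200, 40], [10, 40]],
                    "chars": [],
                },
                {
                    "text": "Total due",
                    "confidence": 0.41,
                    "bbox": [10, 60, 180, 90],
                    "polygon": [[10, 60], [180, 60], [180, 90], [10, 90]],
                    "chars": [],
                },
            ],
        }
    ],
}


def confident_lines(
    output: Dict[str, Any], min_confidence: float = 0.5
) -> List[Tuple[int, str]]:
    """Return (page, text) pairs for lines at or above min_confidence."""
    results: List[Tuple[int, str]] = []
    # 'pages' is absent when return_pages is disabled, so default to empty.
    for page in output.get("pages") or []:
        for line in page.get("text_lines", []):
            if line.get("confidence", 0.0) >= min_confidence:
                results.append((page.get("page"), line["text"]))
    return results


if __name__ == "__main__":
    for page_number, text in confident_lines(sample_output):
        print(page_number, text)
```

The same loop can be extended to read bbox or polygon for each kept line when the positions of the text regions are needed.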