lucataco / olmocr-7b

A preview release of the olmOCR model from Ai2, fine-tuned from Qwen2-VL-7B-Instruct using the olmOCR-mix-0225 dataset.

  • Public
  • 2.7K runs
  • L40S

Input

  • file (required): Input PDF file
  • integer: Page number to process. Default: 1
  • number: Sampling temperature. Default: 0.8
  • integer: Maximum number of tokens to generate. Default: 100

Output

['{"primary_language":"en","is_rotation_valid":true,"rotation_correction":0,"is_table":false,"is_diagram":false,"natural_text":"Christians behaving themselves like Mahomedans. \\n\\n4. The natives soon had reason to suspect the viceroy\'s sincerity in his expressions of regret at the proceedings of which they complained. For about this time the Dominican friars, under pretence of building a convent, erected a fortress on the island of Solor, which, as soon as finished, the viceroy garrisoned with a strong force. The natives very naturally felt indignant at this additional encroachment, and took every opportunity to attack the garrison. The monks, forgetful of their peaceable profession, took an active part in these skirmishes, and many of them fell sword in hand.\\n\\nThe Mahomedan faith has been appropriately entitled, *The religion of the sword*; and with equal propriety may we so designate the religion of these belligerent friars. The Portuguese writers give an account of one of their missionaries, Fernando Vinagre, who was as prompt in the field of battle as at the baptismal font. This man, though a secular priest, undertook the command of a squadron that was sent to the assistance of the rajah of Tidore, on which occasion he is said to have acted in the twofold capacity of a great commander, and a great apostle, at one time appearing in armour, at another in a surplice; and even occasionally, baptizing the converts of his sword without putting off his armour, but covering it with his ecclesiastical vest. In this crusade he had two\\n\\n---\\n\\n3 Geddes History, &c., pp. 24—27.\\n\\nPudet hsec opprobria nobis\\nVel dici potuisse.\\n\\n4 Called *Tadura* or *Daco*, an island in the Indian Ocean, one of the Moluccas\\n\\n5 *These a la Dragoon conversions.* Geddes\' History, p. 27."}']
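The output above is a list holding one JSON-encoded string. As a minimal sketch of decoding it, assuming the field names seen in that sample (`primary_language`, `natural_text`, and so on): `raw_output` below is a shortened, made-up stand-in for a real response, and `extract_text` is a hypothetical helper name, not part of the olmOCR toolkit or the Replicate client.

```python
import json

# Shortened, made-up stand-in for the list returned by the model
# (one JSON string per processed page).
raw_output = [
    '{"primary_language":"en","is_rotation_valid":true,"rotation_correction":0,'
    '"is_table":false,"is_diagram":false,'
    '"natural_text":"First paragraph.\\n\\nSecond paragraph."}'
]


def extract_text(output):
    """Decode each JSON string and collect its natural_text value."""
    pages = []
    for item in output:
        record = json.loads(item)
        # Assumption: natural_text can be absent or null, so fall back to "".
        pages.append(record.get("natural_text") or "")
    return pages


texts = extract_text(raw_output)
print(texts[0])
```

If generation stops at the token limit, the string may be cut off mid-object, in which case `json.loads` raises `json.JSONDecodeError`; raising the maximum number of tokens is the likely remedy.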
Run time and cost

This model costs approximately $0.076 to run on Replicate, or 13 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on Nvidia L40S GPU hardware. Predictions typically complete within 78 seconds. The predict time for this model varies significantly based on the inputs.
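For a rough budget, the figures above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes one run processes one page (the input takes a single page number), that runs execute one after another, and that the quoted averages hold, which the page itself says varies with the input. The `estimate` function is a hypothetical helper.

```python
COST_PER_RUN_USD = 0.076  # approximate cost per run quoted above
SECONDS_PER_RUN = 78      # typical completion time quoted above


def estimate(pages):
    """Return (cost in USD, sequential run time in minutes) for `pages` runs."""
    cost = pages * COST_PER_RUN_USD
    minutes = pages * SECONDS_PER_RUN / 60
    return round(cost, 2), round(minutes, 1)


# Whole runs that fit into one dollar at the quoted price.
runs_per_dollar = int(1 / COST_PER_RUN_USD)
print(runs_per_dollar)

# Estimated cost and time for a 100-page document.
print(estimate(100))
```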

Readme

olmOCR-7B-0225-preview

This is a preview release of the olmOCR model, fine-tuned from Qwen2-VL-7B-Instruct using the olmOCR-mix-0225 dataset.

Quick links:

  • 📃 Paper
  • 🤗 Dataset
  • 🛠️ Code
  • 🎮 Demo

The best way to use this model is via the olmOCR toolkit. The toolkit comes with an efficient inference setup via sglang that can handle millions of documents at scale.

Usage

This model expects as input a single document image, rendered such that the longest dimension is 1024 pixels.
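The sizing rule above is simple arithmetic: scale both sides by the same factor so that the longer one lands on 1024. The sketch below shows only that calculation; `fit_longest_side` is a hypothetical helper, not a function from the olmOCR toolkit, and the toolkit's own rendering remains the recommended route.

```python
def fit_longest_side(width, height, target=1024):
    """Scale (width, height) so the longer side equals `target`, keeping aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target / max(width, height)
    # Round to whole pixels; never let the shorter side collapse to 0.
    return max(1, round(width * scale)), max(1, round(height * scale))


# A US Letter page rendered at 300 DPI is 2550 x 3300 pixels.
print(fit_longest_side(2550, 3300))
```

The resulting pair can be passed to whatever image library does the actual resize. Note that the same rule also enlarges pages whose longest side is under 1024 pixels.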

The prompt must then contain the additional metadata from the document, and the easiest way to generate this is to use the methods provided by the olmOCR toolkit.

Manual Prompting

If you want to prompt this model manually instead of using the olmOCR toolkit, you need to reproduce the prompt that the toolkit builds.

In normal usage, the olmOCR toolkit builds the prompt by rendering the PDF page and extracting relevant text blocks and image metadata.

License and use

olmOCR is licensed under the Apache 2.0 license. olmOCR is intended for research and educational use. For more information, please see our Responsible Use Guidelines.