Extract text from images
These models perform optical character recognition, extracting text from images. They can help digitize text from scanned documents, photos, and other visual media.
Best for image-to-text extraction: abiruyt/text-extract-ocr
For most OCR tasks, we recommend the abiruyt/text-extract-ocr model. This versatile tool makes it simple to extract plain text from a wide variety of images.
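As a rough illustration, a call to this model through the Replicate Python client might look like the sketch below. The `replicate.run` function is part of that client, but the input key `image`, the shape of the output, and the helper name `extract_text` are assumptions made for this example; check the model's own API page for its actual schema. A version suffix after the model name may also be required.

```python
from typing import Any, Callable, Optional

# Model name taken from the text above. A ":<version>" suffix may be needed.
MODEL = "abiruyt/text-extract-ocr"


def extract_text(image_url: str, run: Optional[Callable[..., Any]] = None) -> str:
    """Send an image URL to the OCR model and return the recognized text.

    `run` defaults to `replicate.run`; passing another callable makes the
    function usable offline (for example in tests).
    """
    if run is None:
        # Imported lazily so the module loads without the client installed.
        import replicate  # requires REPLICATE_API_TOKEN in the environment

        run = replicate.run

    # ASSUMPTION: the model accepts its image under the key "image".
    output = run(MODEL, input={"image": image_url})

    # ASSUMPTION: output is a string or an iterable of string chunks.
    if isinstance(output, str):
        return output
    return "".join(str(chunk) for chunk in output)
```

With an API token configured, `extract_text("https://example.com/scan.png")` would submit that (placeholder) URL to the model.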
Best for document extraction: cuuupid/marker
To get clean Markdown or JSON from PDF, EPUB, or other document formats, use Marker. It’s a pipeline of models that supports all languages, removes headers and footers, formats equations and code blocks, and more. It can also run OCR on PDFs whose pages are stored as images.
Other utilities
Some other useful models for your text extraction pipeline:
- mickeybeurskens/latex-ocr specializes in recognizing LaTeX equations from images and converting them into usable LaTeX code
- cjwbw/docentr cleans up degraded images, removing bleed-through, artifacts and smudging
- willywongi/donut extracts structured JSON data from receipts
- pbevan1/llama-3.1-8b-ocr-correction fixes OCR errors in digitized text by reconstructing the original content using LLaMA 3.1
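The utilities above can be chained into one pipeline: clean the image, extract the text, then correct the OCR output. The sketch below shows only the chaining pattern. The three stage functions are hypothetical stand-ins that return canned values; in practice each would wrap a call to the corresponding model.

```python
from typing import Callable, Iterable

Stage = Callable[[str], str]


def build_pipeline(stages: Iterable[Stage]) -> Stage:
    """Chain stages so each one receives the previous stage's output."""
    stage_list = list(stages)

    def pipeline(value: str) -> str:
        for stage in stage_list:
            value = stage(value)
        return value

    return pipeline


# Hypothetical stand-ins for real model calls (not actual model behavior).
def clean_image(image_url: str) -> str:
    # Would call an enhancement model and return the cleaned image's URL.
    return image_url + "?cleaned=1"


def run_ocr(image_url: str) -> str:
    # Would call an OCR model; returns canned text with a typical misread.
    return "Tbe quick brown fox"


def correct_text(text: str) -> str:
    # Would call an OCR-correction model; here it fixes one known error.
    return text.replace("Tbe", "The")


digitize = build_pipeline([clean_image, run_ocr, correct_text])
```

Keeping each stage as a plain function from string to string makes it easy to drop a stage (for example, skipping cleanup on already-sharp scans) or to swap one model for another.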
Recommended models
pbevan1 / llama-3.1-8b-ocr-correction
LLaMA 3.1-8B, fine-tuned on a synthetic OCR dataset for OCR correction.
cuuupid / glm-4v-9b
GLM-4V is a multimodal model released by Tsinghua University that is competitive with GPT-4o and establishes a new SOTA on several benchmarks, including OCR.
cudanexus / ocr-surya
Surya is a document OCR toolkit that does:
mickeybeurskens / latex-ocr
Optical character recognition that turns images of LaTeX equations into LaTeX code.
awilliamson10 / meta-nougat
Nougat: Neural Optical Understanding for Academic Documents
kshitijagrwl / pii-extractor-llm
PII Data Extraction from Text
willywongi / donut
Extract structured data from receipt images using Donut 🍩 (Document Understanding Transformer)
cjwbw / docentr
End-to-End Document Image Enhancement Transformer