GitHub | Model Download | Paper Link | arXiv Paper Link
DeepSeek-OCR: Contexts Optical Compression
Explore the boundaries of visual-text compression.
Usage
Inference using Hugging Face transformers on NVIDIA GPUs. Requirements were tested on Python 3.12.9 + CUDA 11.8.
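As a rough illustration of the transformers route, the sketch below loads the model with `trust_remote_code=True` and runs OCR on one image. Several things here are assumptions rather than facts taken from this page: the Hub ID `deepseek-ai/DeepSeek-OCR`, the prompt string, and the `infer` method with its keyword arguments, which would come from the repository's remote code and not from transformers itself. The helper names `require_image`, `load_model`, and `run_ocr` are hypothetical. Check the GitHub repository for the authoritative call.

```python
import os

# Assumed Hugging Face Hub ID; verify against the model page.
MODEL_NAME = "deepseek-ai/DeepSeek-OCR"
# Assumed prompt format; the exact string is defined by the model's remote code.
DEFAULT_PROMPT = "<image>\nFree OCR. "


def require_image(path):
    """Return the path unchanged, or raise if the image file is missing."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"image not found: {path}")
    return path


def load_model(model_name=MODEL_NAME):
    """Load tokenizer and model onto the GPU in bfloat16."""
    # Heavy imports are deferred so the helpers above work without a GPU stack.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
    model = model.eval().cuda().to(torch.bfloat16)
    return tokenizer, model


def run_ocr(tokenizer, model, image_file, output_path, prompt=DEFAULT_PROMPT):
    """Run OCR on a single image.

    `model.infer` and its keyword arguments are assumptions about the
    repository's remote code, not part of the transformers API.
    """
    return model.infer(
        tokenizer,
        prompt=prompt,
        image_file=require_image(image_file),
        output_path=output_path,
    )


# Example usage (requires an NVIDIA GPU and network access to the Hub):
# tokenizer, model = load_model()
# run_ocr(tokenizer, model, "your_image.jpg", "your/output/dir")
```

Deferring the `torch` and `transformers` imports into `load_model` keeps the path check usable on machines without CUDA, which makes it easier to validate inputs before committing to a model download.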
vLLM
Refer to GitHub for guidance on model inference acceleration, PDF processing, etc.
Visualizations
Acknowledgement
We would like to thank Vary, GOT-OCR2.0, MinerU, PaddleOCR, OneChart, Slow Perception for their valuable models and ideas.
We also appreciate the benchmarks Fox and OmniDocBench.