datalab-to/ocr

Detect and transcribe text in images with accurate bounding boxes, layout analysis, reading order, and table recognition, in 90 languages

Readme

Datalab OCR is a document OCR toolkit that provides:

  • OCR in 90+ languages that benchmarks favorably against cloud OCR services
  • Line-level text detection in any language
  • Layout analysis (detection of tables, images, headers, etc.)
  • Reading order detection
  • Table recognition (detecting rows/columns)
  • LaTeX OCR
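
To make the idea of line-level bounding boxes and reading order concrete, here is a minimal Python sketch. It is not Datalab OCR's method or output schema: the `TextLine` class, the `naive_reading_order` function, and the `row_tolerance` parameter are all hypothetical names introduced for illustration, and the geometric heuristic below stands in for whatever the toolkit actually does. It only shows how detected lines with boxes could be arranged top-to-bottom and left-to-right.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    """Hypothetical container for one detected line (not the toolkit's real schema)."""
    text: str
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), origin at top-left


def naive_reading_order(lines: list[TextLine], row_tolerance: float = 0.5) -> list[TextLine]:
    """Sort lines top-to-bottom, then left-to-right within a visual row.

    A line joins the current row when its vertical centre lies within
    `row_tolerance` line-heights of the centre of that row's first line.
    """
    rows: list[list[TextLine]] = []
    for line in sorted(lines, key=lambda l: (l.bbox[1] + l.bbox[3]) / 2):
        centre = (line.bbox[1] + line.bbox[3]) / 2
        if rows:
            anchor = rows[-1][0]
            anchor_centre = (anchor.bbox[1] + anchor.bbox[3]) / 2
            anchor_height = anchor.bbox[3] - anchor.bbox[1]
            if abs(centre - anchor_centre) <= row_tolerance * anchor_height:
                rows[-1].append(line)
                continue
        rows.append([line])
    # Within each row, order by the left edge of the box.
    return [l for row in rows for l in sorted(row, key=lambda l: l.bbox[0])]


# Made-up sample boxes, listed out of order on purpose.
sample = [
    TextLine("world", (120, 10, 200, 30)),
    TextLine("Second line", (10, 50, 200, 70)),
    TextLine("Hello", (10, 12, 100, 32)),
]
ordered = [l.text for l in naive_reading_order(sample)]
print(ordered)
```

A heuristic like this breaks down on multi-column pages, tables, and rotated text, which is why a dedicated reading order feature is useful.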

For more information, visit https://datalab.to