OCR-PDF Project

Overview

This project extracts text from PDF files using the Tesseract optical character recognition (OCR) engine. It downloads a PDF from a given URL, converts each page into an image, and runs Tesseract on each image to extract the text. The project is hosted on GitHub.
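
The three steps described above (download, convert pages to images, run OCR) could be sketched roughly as follows. This is an illustrative outline, not the project's actual code: the function names are hypothetical, and it assumes the third-party packages pdf2image (which needs Poppler installed) and pytesseract (which needs the Tesseract binary installed) for the conversion and OCR steps.

```python
import urllib.request
from typing import Any, Callable, List


def download_pdf(url: str, timeout: float = 30.0) -> bytes:
    """Fetch the raw bytes of the PDF at `url` (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def render_pages(pdf_bytes: bytes, dpi: int = 300) -> List[Any]:
    """Convert each PDF page to a PIL image.

    Assumption: pdf2image is used for this step; the project may use
    a different library. Imported lazily so the module loads without it.
    """
    from pdf2image import convert_from_bytes

    return convert_from_bytes(pdf_bytes, dpi=dpi)


def ocr_page(image: Any) -> str:
    """Run Tesseract on one page image.

    Assumption: pytesseract is the Tesseract wrapper in use.
    """
    import pytesseract

    return pytesseract.image_to_string(image)


def extract_text(
    url: str,
    download: Callable[[str], bytes] = download_pdf,
    render: Callable[[bytes], List[Any]] = render_pages,
    ocr: Callable[[Any], str] = ocr_page,
) -> str:
    """Download a PDF, render its pages, and OCR each page in order."""
    pdf_bytes = download(url)
    pages = render(pdf_bytes)
    # Join the per-page text with a blank line between pages.
    return "\n\n".join(ocr(page).strip() for page in pages)
```

Each step is passed in as a parameter so that a single stage can be swapped out (for example, a different renderer or OCR language setting) without touching the others. A call would look like extract_text("https://example.com/sample.pdf"), where the URL is a placeholder.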

Usage

Provide the URL of a PDF, and the project returns the text extracted from it.

License

This project is licensed under the MIT License.