Llama-Scan: Convert PDFs to Text with Local LLMs

firesteelrain

Ironically, Ollama is likely using Tesseract under the hood. The Python library ocrmypdf uses Tesseract too: https://github.com/ocrmypdf/OCRmyPDF

david_draco

Looking at the code, this converts PDF pages to images, then transcribes each image. I might have expected a pdftotext post-processor. The complexity of PDF I guess ...
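
The page-by-page flow described above can be sketched roughly as follows. This is only an illustration of the general shape, not the project's actual code: `transcribe_pages` and `fake_transcribe` are hypothetical names, and the renderer and the model call are assumed to be supplied by the caller (here a stand-in that just decodes bytes).

```python
from typing import Callable, Iterable, List


def transcribe_pages(
    page_images: Iterable[bytes],
    transcribe: Callable[[bytes], str],
) -> str:
    """Transcribe each rendered page image and join the results in page order."""
    texts: List[str] = []
    for number, image in enumerate(page_images, start=1):
        # One transcription call per page image (assumed to be a local model call).
        text = transcribe(image).strip()
        texts.append(f"--- page {number} ---\n{text}")
    return "\n\n".join(texts)


def fake_transcribe(image: bytes) -> str:
    """Hypothetical stand-in for a vision model: decodes the bytes and upper-cases them."""
    return image.decode("utf-8").upper()
```

In a real run, `page_images` would come from whatever renders the PDF to images and `transcribe` would wrap the local model; both are left open here on purpose.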

firesteelrain

There is a very popular Python module called ocrmypdf. I used it to help my HOA with OCR'ing old PDFs.

https://github.com/ocrmypdf/OCRmyPDF

No LLMs required.
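
For anyone who wants to script it, here is a minimal sketch that drives the ocrmypdf command line from Python. The helper names (`build_ocrmypdf_command`, `run_ocr`) and the file names are made up for illustration; the flags used (`--language`, `--skip-text`, `--sidecar`) are, as far as I know, standard ocrmypdf options, but check `ocrmypdf --help` for your installed version.

```python
import shutil
import subprocess
from typing import List, Optional


def build_ocrmypdf_command(
    src: str,
    dst: str,
    language: str = "eng",
    sidecar: Optional[str] = None,
) -> List[str]:
    """Build an ocrmypdf invocation as an argument list (hypothetical helper)."""
    # --skip-text leaves pages that already have a text layer untouched.
    cmd = ["ocrmypdf", "--language", language, "--skip-text"]
    if sidecar:
        # --sidecar also writes the recognized text to a plain-text file.
        cmd += ["--sidecar", sidecar]
    cmd += [src, dst]
    return cmd


def run_ocr(src: str, dst: str, **options) -> bool:
    """Run ocrmypdf if it is on PATH; return False when it is not installed."""
    if shutil.which("ocrmypdf") is None:
        return False
    subprocess.run(build_ocrmypdf_command(src, dst, **options), check=True)
    return True
```

Something like `run_ocr("scan.pdf", "scan_ocr.pdf", sidecar="scan.txt")` would then produce a searchable PDF plus a text file, assuming ocrmypdf and Tesseract are installed.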

roscas

Almost perfect. On the PDF I tested, it missed only a few symbols.

But that is something I will use for sure. Thank you.