How we made our optical character recognition (OCR) code more accurate
8 comments
· May 21, 2025 · abc-1
booder1
Tesseract 5.5.0 was released in November last year. It's still a very active project as far as I can tell, and it runs on CPU. Even compared to the best open-source GPU option, it is still pretty good. VLMs work very differently and don't work as well for everything. Why is it out of date?
amelius
Well, at least I can apt-get install tesseract.
That doesn't hold for any of the GPU-based solutions, last time I checked.
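For anyone who wants to try the CPU-only route, here is a minimal sketch that shells out to the `tesseract` command-line tool from Python. It assumes the binary is already on your PATH (on Debian/Ubuntu the package is named `tesseract-ocr`). The helper names `build_tesseract_cmd` and `ocr_image` are made up for illustration, and page segmentation mode 6 (a single uniform block of text) is only a guess at a reasonable default for code screenshots, not something from the article.

```python
import shutil
import subprocess


def build_tesseract_cmd(image_path, lang="eng", psm=6):
    """Build a tesseract CLI invocation that prints recognized text to stdout.

    Hypothetical helper for illustration only.
    """
    # Passing "stdout" as the output base makes the CLI print the text
    # instead of writing it to <outputbase>.txt.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def ocr_image(image_path, lang="eng", psm=6):
    """Run tesseract on the CPU and return the recognized text."""
    if shutil.which("tesseract") is None:
        # Assumption: a Debian/Ubuntu system, where the package is tesseract-ocr.
        raise RuntimeError(
            "tesseract not found; try: apt-get install tesseract-ocr"
        )
    result = subprocess.run(
        build_tesseract_cmd(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract exits non-zero
    )
    return result.stdout
```

Calling `ocr_image("screenshot.png")` would return whatever text tesseract recognizes in that (hypothetical) file. Results on code screenshots will likely depend a lot on resolution and color theme.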
camtarn
Neat article, but I feel like I have no idea why they're doing this! Is transcribing code from images really such a big use case?
dewey
> To best support software engineers when they want to transcribe code from images, we fine-tuned our pre-processing pipeline to screenshots of code in IDEs, terminals, and online resources like YouTube videos and blog posts.
Even with these examples that seems like a very narrow use case.
FloatArtifact
From an accessibility standpoint, yes. It lets you pattern-match where you are in an IDE without using an accessibility API.
lelag
Maybe they want to compile the Apollo Guidance Computer source code...
https://www.softwareheritage.org/wp-content/uploads/2019/07/...
Anything that mentions tesseract is about 10 years out of date at this point.