How we made our optical character recognition (OCR) code more accurate

abc-1

Anything that mentions Tesseract is about 10 years out of date at this point.

booder1

5.5.0 was released in November last year. It is still a very active project as far as I can tell, and it runs on CPU. Even compared to the best open source GPU option it is still pretty good. VLMs work very differently and don't work as well for everything. Why is it out of date?

amelius

Well, at least I can apt-get install tesseract.

That doesn't hold for any of the GPU-based solutions, last time I checked.

camtarn

Neat article, but I feel like I have no idea why they're doing this! Is transcribing code from images really such a big use case?

dewey

> To best support software engineers when they want to transcribe code from images, we fine-tuned our pre-processing pipeline to screenshots of code in IDEs, terminals, and online resources like YouTube videos and blog posts.

Even with these examples that seems like a very narrow use case.

FloatArtifact

From an accessibility standpoint, yes. It lets you pattern match where you are in an IDE without using an accessibility API.

lelag

Maybe they want to compile the Apollo Guidance Computer source code...

https://www.softwareheritage.org/wp-content/uploads/2019/07/...
