Joaquim Rocha

OCRFeeder 0.7.8


That’s right, one more release of OCRFeeder. If you’re wondering why it took so long for apparently so few changes, it has to do with some super cool things I’ve been working on at Igalia, but you’ll hear about that really soon.

This new release brings a few bug fixes such as:

  • Fix recognition after using the Unpaper tool;
  • Fix an Unpaper issue due to a nonexistent variable;
  • Prevent the version of the Tesseract OCR engine from appearing in the recognized text.

This last issue appeared after an update in Tesseract which made it print “Tesseract Open Source OCR Engine v3.02 with Leptonica” to the standard output. Since the default configuration of the Tesseract engine in OCRFeeder did not discard text printed to the standard output, that banner would show up as part of the recognized text.
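To illustrate the problem, here is a minimal Python sketch. It does not call Tesseract and is not OCRFeeder's actual code: `FAKE_ENGINE` and `recognize` are hypothetical names, and the fake engine simply prints a banner to the standard output and writes its result to a file. It shows how the banner leaks into the result when the standard output is captured, and how discarding it avoids that.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for an OCR engine (NOT Tesseract): it prints a
# banner to stdout and writes the "recognized" text to the given file.
FAKE_ENGINE = (
    "import sys; "
    "print('Fake OCR Engine v1.0'); "
    "open(sys.argv[1], 'w').write('recognized text')"
)


def recognize(discard_stdout: bool) -> str:
    """Run the fake engine and return what a caller would treat as text."""
    with tempfile.TemporaryDirectory() as tmp:
        out_file = Path(tmp) / "result.txt"
        proc = subprocess.run(
            [sys.executable, "-c", FAKE_ENGINE, str(out_file)],
            # DEVNULL throws the banner away; PIPE captures it.
            stdout=subprocess.DEVNULL if discard_stdout else subprocess.PIPE,
            text=True,
            check=True,
        )
        banner = proc.stdout or ""  # proc.stdout is None when discarded
        return banner + out_file.read_text()
```

With `recognize(False)` the banner is prepended to the text, which mirrors the bug; with `recognize(True)` only the text written by the engine is returned.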

After a bit of discussion in the bug report, the conclusion was that OCRFeeder needed a way to detect changes in the OCR engines’ configuration. So this new version checks whether the engines’ configuration needs updating and warns the user about it once, on start-up. If it can update an engine’s configuration automatically, it says so and asks for confirmation; otherwise it asks the user to change it manually and offers a way to open the OCR Engines Manager dialog. The pictures below show what I just described:

OCRFeeder warnings

(Note that since this feature wasn’t extensively tested, the first time you use this new version it might warn you even about engines that do not need a change; still, if that happens, it will only be once.)
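For the curious, the decision described above could be sketched roughly like this in Python. This is only an illustration under my own assumptions, not OCRFeeder's real implementation: the `version` field, the `EngineConfig` class, the `check_engine` function and the rule "unchanged default arguments can be updated automatically" are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"                    # configuration is up to date
    AUTO_UPDATE = "auto_update"      # ask for confirmation, then update
    MANUAL_CHANGE = "manual_change"  # point the user to the engines manager


@dataclass
class EngineConfig:
    name: str
    arguments: str
    version: int  # hypothetical field, not OCRFeeder's real format


def check_engine(config: EngineConfig, old_default_args: str,
                 current_version: int) -> Action:
    """Decide what to tell the user about one engine's configuration."""
    if config.version >= current_version:
        return Action.NONE
    # Assumption: if the user never touched the old default arguments,
    # it is safe to offer replacing them automatically.
    if config.arguments == old_default_args:
        return Action.AUTO_UPDATE
    # Customized arguments: leave them alone and ask for a manual change.
    return Action.MANUAL_CHANGE
```

For example, `check_engine(EngineConfig("Engine A", "old defaults", 1), "old defaults", 2)` yields `Action.AUTO_UPDATE`, while the same call with customized arguments yields `Action.MANUAL_CHANGE`.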

To see the entire list of changes and the amazing work of the GNOME i18n team, check out the NEWS file.

Source Tarball Git Bugzilla