OCRFeeder 0.8.1

Taking advantage of the holidays, I have been dedicating some time to my side projects so today I am giving you OCRFeeder version 0.8.1!

The last OCRFeeder version had a very important change which was the port to GObject introspection and I was already expecting a few bugs to pop up here and there. That proved to be true and so this version is mainly about bug fixing.
Specifically, there was an issue related to GDK’s threads which caused the application to abort. Besides that, exporting a document or saving/loading a project was not working correctly due to Unicode issues (because Python is very nice but working with Unicode is sometimes more annoying than it should be, at least in versions prior to Python 3).
Anyway, all that should be working correctly now!

Besides squashing bugs, I also made some long overdue changes: I made the Preferences dialog smaller (by adding its contents to a scrolled window) and migrated the application’s and engines’ settings to the XDG user configuration folder instead of .ocrfeeder.
Yes, I know that I should be using GSettings for the application’s settings by now but there were more critical changes to be done.
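For anyone curious about what the XDG user configuration folder means in practice, here is a minimal sketch of how that location is usually resolved. It is only an illustration, not OCRFeeder's actual code: the function name, the "ocrfeeder" folder name and the `environ` parameter are assumptions made for the example.

```python
import os


def user_config_dir(app_name="ocrfeeder", environ=None):
    """Return a per-user configuration folder following the XDG convention.

    `app_name` and `environ` are hypothetical parameters for this sketch;
    `environ` only exists so the function can be tried with a fake environment.
    """
    environ = os.environ if environ is None else environ
    base = environ.get("XDG_CONFIG_HOME")
    # The XDG Base Directory spec falls back to ~/.config when the variable
    # is unset or empty, and treats relative paths as invalid.
    if not base or not os.path.isabs(base):
        base = os.path.join(os.path.expanduser("~"), ".config")
    return os.path.join(base, app_name)


if __name__ == "__main__":
    print(user_config_dir())
```

With `XDG_CONFIG_HOME` unset, this resolves to a folder under `~/.config` rather than a dot-folder directly in the home directory.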
Besides a small change in the widgets that set a box’s type (from a radio button style to a non-indicator, grouped pair of buttons), there are no other UI changes but I really like how much more polished OCRFeeder seems with the nice recent GTK+ styles.

ocrfeeder-0.8.1-screenshot

Future

I have a number of ideas to make the application better not only in terms of UI/UX but also in terms of features. The detection algorithm hasn’t been touched for years and I am sure it can be improved not only in terms of performance but also in terms of accuracy.
One cool feature I’d love to see implemented is to have a quick way of translating a document’s contents. This would be helpful e.g. to users living abroad who might need to translate letters to a language they speak.
Nonetheless, as mentioned in my previous post about OCRFeeder, it is indeed not easy to find the time and motivation to dedicate to the project these days with all the work, life and other side projects so I don’t know when I will have time for it again. In that regard, if you want to give me a hand, you’d make me very happy as there is a lot of work to be done.

Happy holidays everyone!

Source tarball
Git
Bugzilla

OCRFeeder 0.7.11 released

Here is 2013’s first version of OCRFeeder, version 0.7.11.

For this version, a number of bugs were fixed, especially some that were affecting saving and loading projects.
Some small improvements were also made such as being able to load multiple images at once and being able to choose the OCR engine from the command line interface version of OCRFeeder (using the -e option).

Now for the main feature, I developed something that had been requested by a good number of users: being able to easily choose the language for the OCR engine.
When I developed OCRFeeder, I wanted to make it easy for users to use system-wide OCR engines from the layout analysis that OCRFeeder performs but I also wanted it to remain powerful and that’s why the engines are configured in a general, abstract way, as if from the command line.
Some OCR engines support setting the language in order to get better recognition and, while users could already set the language of an engine manually using the OCR editor dialog, they wanted to have a nice drop-down list with the languages instead.
This represented a real challenge: to keep the old and flexible configuration and, at the same time, offer a high-level way of choosing the language.

OCRFeeder's new configuration
So here is how it works. There is a new special argument keyword, $LANG, that is replaced by the value of the new “language argument” field and the currently set language. Since engines support different languages (or none) and call them by different names (e.g. Tesseract expects “por” for Portuguese, others may expect “pt”), there is another new field called “languages”, which should be a map between the ISO 639-1 language code and the name the engine expects for that language, as shown in the screenshot.
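To make the substitution described above more concrete, here is a rough sketch of how $LANG could be expanded. It is not the code that ships with OCRFeeder: the function name, the use of a dictionary for the “languages” field, the `$IMAGE` and `$FILE` placeholders in the sample string, and the choice to drop the argument when the engine does not know the language are all assumptions for illustration.

```python
def expand_lang(arguments, language_argument, languages, current_language):
    """Replace the $LANG keyword in an engine's argument string.

    arguments         -- the engine's argument string, containing $LANG
    language_argument -- the engine's language flag, e.g. "-l" for Tesseract
    languages         -- map of ISO 639-1 codes to the engine's own names
    current_language  -- ISO 639-1 code of the currently set language
    """
    engine_name = languages.get(current_language)
    if engine_name and language_argument:
        replacement = "%s %s" % (language_argument, engine_name)
    else:
        # Assumption: an unsupported language simply removes the argument.
        replacement = ""
    return arguments.replace("$LANG", replacement).strip()


if __name__ == "__main__":
    tesseract_languages = {"pt": "por", "en": "eng"}
    print(expand_lang("$IMAGE $FILE $LANG", "-l", tesseract_languages, "pt"))
```

The point of keeping the map per engine is that the same “pt” selection in the UI can become “por” for one engine and “pt” for another.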

Languages combo
To show the languages, there is a new tab in the areas’ editor called Misc (for lack of a better name for a tab that will hold more stuff in the future) with the languages combo. This combo shows a check mark next to the languages that the currently selected engine recognizes, as seen in the screenshot.

There is also a new setting in the preferences dialog for the default language; the first time the application runs, it is set from the user’s locale.
One thing must be taken into account: even though Tesseract supports an extensive list of languages, users must have the corresponding language packages installed in their distros; otherwise, recognition will of course fail.
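The locale-based default mentioned above could be derived along these lines. This is a sketch under assumptions, not what OCRFeeder actually does: both function names and the "en" fallback are invented for the example.

```python
import locale


def language_from_locale(locale_name):
    """Extract a two-letter language code from a name such as 'pt_PT.UTF-8'."""
    if not locale_name:
        return None
    # Drop the encoding (".UTF-8"), the modifier ("@euro") and the territory ("_PT").
    code = locale_name.split(".")[0].split("@")[0].split("_")[0].lower()
    # Names like "C" or "POSIX" carry no usable language.
    return code if len(code) == 2 and code.isalpha() else None


def default_language(fallback="en"):
    """Guess the default OCR language from the current locale.

    The fallback value is an assumption made for this sketch.
    """
    name = locale.getlocale()[0]  # e.g. 'pt_PT', or None when nothing is set
    return language_from_locale(name) or fallback


if __name__ == "__main__":
    print(default_language())
```

The resulting two-letter code is what would then be looked up in each engine's “languages” map.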

To finish, related to my recent job search, I have spent this week in San Francisco getting to know some people from an exciting start-up and, despite the jet lag, I managed to finish this release, so I can now say that at least part of OCRFeeder was designed and developed in California 😛

Source tarball
Git
Bugzilla

OCRFeeder version 0.7.10

The previous version of OCRFeeder was released in April. I have been busy with Skeltrack and other projects but, between my personal time and Igalia’s precious hackfest time, here we have a new version of the best Free Software OCR application.

For this 0.7.10 version I have improved the way that the document generators (the classes that generate the desired exportation formats) are used inside OCRFeeder. I have abstracted their use making it easy to add new document generators in the future.
The command line version, which had been limited to generating only the original exportation formats (ODT and HTML), also benefits from these changes; from this version on, it is possible to generate documents in any of the existing exportation formats from the command line. For example, to generate a plain text file:

$ ocrfeeder-cli -i scan1.ppm -i scan2.jpeg -f TXT -o text_doc.txt
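To give an idea of what abstracting the document generators can buy, here is one possible shape for it: a small registry that maps a format name (like the TXT passed to -f above) to a generator class. This is purely a hypothetical sketch; none of the class or function names below come from OCRFeeder's code base.

```python
class DocumentGenerator:
    """Hypothetical base class: turns recognized page texts into a document."""

    def generate(self, pages):
        raise NotImplementedError


# Registry of available generators, keyed by upper-case format name.
GENERATORS = {}


def register(format_name):
    """Class decorator that makes a generator selectable by format name."""
    def decorator(cls):
        GENERATORS[format_name.upper()] = cls
        return cls
    return decorator


@register("TXT")
class PlainTextGenerator(DocumentGenerator):
    def generate(self, pages):
        # Pages are joined with a blank line between them.
        return "\n\n".join(pages)


def export(format_name, pages):
    """Look up the generator for a format and run it."""
    try:
        generator_class = GENERATORS[format_name.upper()]
    except KeyError:
        available = ", ".join(sorted(GENERATORS))
        raise ValueError("Unknown format %r (available: %s)" % (format_name, available))
    return generator_class().generate(pages)


if __name__ == "__main__":
    print(export("txt", ["First page", "Second page"]))
```

With a layout like this, adding a new exportation format only means writing one more registered class; the GUI and the command line version can both pick it up from the same registry.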

The current PDF exportation still has flaws that will take some time to fix but for now I have fixed a big issue: line wrap. The text lines would not wrap when written in the PDF document and so, long lines would go beyond the pages’ limits. This should be improved with this new version and I hope I have the time in the future to fix the other issues.
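As a simplified illustration of the line wrap problem, the sketch below breaks long lines so that they fit a given page width. It is an approximation and not the fix that went into OCRFeeder: it estimates the fit from an average character width, whereas a real PDF generator would measure text with the actual font metrics. The function and parameter names are made up for the example.

```python
import textwrap


def wrap_for_page(text, page_width, average_char_width):
    """Split text into lines that should fit within page_width.

    Both widths are in the same unit (e.g. points). Using an average
    character width is an assumption that only roughly holds for
    proportional fonts.
    """
    max_chars = max(1, int(page_width // average_char_width))
    wrapped = []
    for line in text.splitlines():
        # textwrap.wrap returns [] for an empty line, so keep blank lines.
        wrapped.extend(textwrap.wrap(line, width=max_chars) or [""])
    return wrapped


if __name__ == "__main__":
    for line in wrap_for_page("a long recognized line " * 8, 300, 6):
        print(line)
```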

Moving (or swapping) pages by dragging them seems to have stopped working. This looks like a PyGTK bug but, anyway, it was the excuse I needed to implement actions for selecting and moving the pages using the menu or shortcuts. This makes the mentioned bug less important and also gives visually impaired users an easy way to move pages.

Screenshot of the select or move pages menus

Future

I want to fix some issues in OCRFeeder’s architecture, especially when it comes to the UI part. This should probably be done together with a port to the amazing GObject Introspection.
Jan Losinski, from TU Dresden, was kind enough to send me some patches that make OCRFeeder’s recognition run in parallel. This feature needs to be polished but it will likely land in the next version of OCRFeeder.
Last but not least, I need to check how to make it easy to use the user’s language in the recognition. I exchanged some emails with the people from the AltLinux distro, who seem to have already implemented this in their repositories, but I need time to try and review their patches.

Contribute

If you want to contribute and make this project better, fear not! The code is all Python and I’m available to help you get started so email me if you’re interested.

Enjoy OCRFeeder 0.7.10!

Source tarball
Git
Bugzilla