Dmitry Nikitin, a Russian OCRFeeder user, has sent me RPM packages (for Mandriva 2010) for OCRFeeder, together with the spec file and everything.
I haven’t tried them yet, but since I have also received emails asking for RPM packages, I have uploaded them to Igalia’s server so you can download them here.
I will soon create a branch with the RPM package generation so it’s easy for anyone to build them.
Big thanks to Dmitry Nikitin.
Update: I had the idea Dmitry was Russian judging from his offering to translate OCRFeeder to Russian (apart from Ukrainian) and his email being from a .ru domain. Dmitry has clarified he is, in fact, Ukrainian and so I’m sorry for the confusion and would like to rectify it with this update.
Joaquim, do you know why ocrfeeder is not packaged by more distributions? I just installed tesseract 3.0 from Fedora’s rpm repository, and I would love to install ocrfeeder also from rpm.
Hi Oscar,
It’s not packaged because nobody has done it yet.
I use Fedora but I try to use my time for coding instead of packaging. Still, I’ll try to push for this as soon as I can.
Cheers,
Hello Joaquim,
I’ve been using tesseract+fedora for a long time, and now that tesseract 3.0 has been pushed to stable repos, I would like to try this GUI. I usually work with scientific papers and the way I OCR them is as follows:
I crop all text areas with Gimp and save them as Suffix-XXX.tif, where XXX is a serial number. Then I use a very simple bash script to OCR all the tifs (this batch processing is actually very fast) and finally concatenate all the resulting files (cat *.txt > file.txt).
As you can see, the slowest step in the process is cropping the text boxes. I can do it quite fast in Gimp by using a few hot keys, but it is still slow and tedious for big projects. I think it would be superb to have some kind of batch cropping based on a template, provided that all pages have the same layout, or to use a different template for first, odd and even pages. Do you think this could be coded? I know nothing about coding, so I can only help you with this idea.
Anyway, I haven’t tried your software, but it looks awesome in the videocast. Other proprietary OCRs should be thrilling with this piece of work (did you read that Abbyy, lol). Thanks for releasing it to the free world (GPL).
Greetings from Mexico.
Hey Christian,
The feature you’re talking about seems like a specialized use case but I think it is interesting, so, I’ll think about how we could implement that.
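Just to make the idea concrete, here is a rough Python sketch of how a template could be picked per page and turned into a list of crops. This is not how OCRFeeder works, only an illustration: the function names, the TEMPLATES dictionary and the box coordinates are all made up, and the output names simply follow the Suffix-XXX.tif pattern from the comment above.

```python
from typing import Dict, Iterator, List, Tuple

# A crop box in pixels: (left, top, right, bottom).
Box = Tuple[int, int, int, int]

# Hypothetical templates: one list of text boxes per page kind.
TEMPLATES: Dict[str, List[Box]] = {
    "first": [(100, 400, 1100, 1500)],                        # title page, one block
    "odd":   [(120, 100, 600, 1500), (620, 100, 1100, 1500)],  # two columns
    "even":  [(100, 100, 580, 1500), (600, 100, 1080, 1500)],  # two columns, shifted
}


def pick_template(page_number: int) -> str:
    """Return the template name for a 1-based page number."""
    if page_number == 1:
        return "first"
    return "odd" if page_number % 2 == 1 else "even"


def crop_plan(
    page_count: int,
    templates: Dict[str, List[Box]],
    prefix: str = "Suffix",
) -> Iterator[Tuple[int, str, Box]]:
    """Yield (page number, output file name, box) for every crop to make."""
    serial = 1
    for page in range(1, page_count + 1):
        for box in templates[pick_template(page)]:
            # Serial numbers run across the whole document, not per page.
            yield page, f"{prefix}-{serial:03d}.tif", box
            serial += 1


if __name__ == "__main__":
    for page, name, box in crop_plan(3, TEMPLATES):
        print(page, name, box)
```

Each box in the plan could then be handed to an image library to do the actual cropping, for example Pillow's Image.crop, which takes a (left, upper, right, lower) tuple.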
My problem is lack of time, so let’s see if I can find the time to do it.
Thanks for the kind words about OCRFeeder.