
Compiling Tesseract OCR

Category: Linux   — Published by tengo on August 30, 2011 at 2:41 pm

At least on Ubuntu, you shouldn't need to! Tesseract is already packaged, and more language-specific packages are available via apt:
sudo apt-get install tesseract-ocr
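If you want several languages at once, the package names can be built from the language codes. This is only a small sketch: the helper name tesseract_packages is made up for illustration, and it assumes the language packs follow the tesseract-ocr-<code> naming (for example tesseract-ocr-eng), so check with apt-cache search tesseract-ocr what your release really offers.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract or apt): build a package list
# from language codes, assuming the tesseract-ocr-<code> naming pattern.
tesseract_packages() {
    pkgs="tesseract-ocr"
    for lang in "$@"; do
        pkgs="$pkgs tesseract-ocr-$lang"
    done
    printf '%s\n' "$pkgs"
}
```

Then something like sudo apt-get install $(tesseract_packages eng deu) should pull in the engine plus both language packs, and tesseract scan.tif out -l eng should write the recognized text to out.txt (scan.tif is just a placeholder file name).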

These are the basic steps to compile Tesseract, the open-source OCR software that rivals commercial packages (on Ubuntu 11.04 and similar):

sudo apt-get install subversion automake libtool leptonica-progs libleptonica libleptonica-dev

svn checkout http://tesseract-ocr.googlecode.com/svn/trunk/ tesseract-ocr
cd tesseract-ocr
./runautoconf

Patch the generated configure file so that it includes the path to /usr/include/leptonica (but even with that, it still fails for me!)

mkdir build-directory
cd build-directory
../configure
make
sudo make install
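Instead of patching configure by hand, it may be enough to pass the Leptonica include path through CPPFLAGS, which autoconf-generated configure scripts normally honor. The helper below is a rough sketch, not something from the Tesseract sources: find_leptonica_cppflags is a made-up name, and it assumes the Leptonica headers can be recognized by the file allheaders.h. I can't promise it fixes the failure mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print a -I flag for the first candidate directory
# that contains allheaders.h (assumed to mark the Leptonica headers).
find_leptonica_cppflags() {
    for dir in "$@"; do
        if [ -f "$dir/allheaders.h" ]; then
            printf '%s\n' "-I$dir"
            return 0
        fi
    done
    # No candidate directory contained the header.
    return 1
}
```

From inside build-directory this could be used as CPPFLAGS="$(find_leptonica_cppflags /usr/include/leptonica /usr/local/include/leptonica)" ../configure, followed by make as before. If the function prints nothing, libleptonica-dev is probably not installed where expected.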

More reading: the CPAN module Image::OCR::Tesseract, Compiling on OSX.