I recommend looking at pdfbeads. It's written in Ruby and the documentation is mostly in Russian (http://rubyforge.org/docman/view.php/9752/10692/pdfbeads.ru.html), but it provides both a library and an easy-to-use executable for building PDFs out of hOCR files and images. You literally just point it at a directory with page images and hOCR files and it spits out a PDF. Very handy.

Also, the DIY Book Scanner forum (diybookscanner.org) is a great resource if you're into these sorts of things...

Eric Lease Morgan wrote:
> On Mar 13, 2013, at 8:07 AM, Ben Brumfield wrote:
>
>> https://github.com/idigbio-aocr/RESTAPI/tree/master/doc
>
> Interesting. Printed for future reference. Thank you.
>
> BTW, I did finally get Image::OCR::Tesseract to make, make test, and
> make install correctly. I did not have the correct/proper libraries
> installed for Tesseract's supporting Leptonica library. Now I need to
> find a PDF library similar to libtiff and libpng.
>
> --
> Eric Morgan