I'm working on scanning documents in a collection and then performing OCR on them. Thus far, I've used Adobe Acrobat Pro's OCR function with some success, but the machines I'm working on are fairly old Pentium 4 Dell boxes. This makes opening 600 DPI scans painful and performing OCR an entirely valid excuse for a long coffee break.

As you might expect, I'm looking for a way to speed up this process at the OCR end of things, since the scanning can only move so quickly. I'm wondering if any of you have experience with open-source OCR solutions such as Tesseract-OCR <http://code.google.com/p/tesseract-ocr/> or OCRopus <http://code.google.com/p/ocropus/>. At a glance, Tesseract seems to be further along in development. Any other suggestions on how best to approach this sort of task would be appreciated if you've done similar work.

I've got my own Ubuntu Server that I'm planning to evaluate one or both of these on, as much for my own interest as for the project's or the organization's. Since I'm an unpaid part-time intern and the only one working on this project, I'm willing to learn to do things the hard way so they're easier in the long run.

Thanks for any suggestions or advice you may be able to offer.

-- 
~Andrew M. Kelly
MLIS Degree Candidate, Simmons GSLIS 2011
Archives & Librarianship Intern, Boston University: African Presidential Archive & Research Center
Evening Library Assistant, Bay State College
twitter: @a_m_kelly