Thanks for all of these responses. I'm looking forward to investigating further over the weekend. I'll let you know how it goes.

~Andrew M. Kelly
MLIS Degree Candidate, Simmons GSLIS 2011
Archives & Librarianship Intern, Boston University: African Presidential Archive & Research Center
Evening Library Assistant, Bay State College
twitter: @a_m_kelly

On Wed, Jul 28, 2010 at 8:57 PM, Jason Ronallo wrote:
> I don't know if this would help, but you may want to look at this
> script, which wraps cuneiform and other utilities to OCR a PDF. Since
> you're not starting with a PDF, you could modify it or write something
> similar in your scripting language of choice.
>
> http://github.com/gkovacs/pdfocr
> https://launchpad.net/~gezakovacs/+archive/pdfocr
>
> Jason
>
> On Wed, Jul 28, 2010 at 11:46 AM, Andy Kelly wrote:
> > I'm working on scanning some documents in a collection and then
> > performing OCR on the documents. Thus far, I've used Adobe Acrobat
> > Pro's OCR function with some success, but the machines I'm working on
> > are fairly old Pentium 4 Dell boxes, which makes opening 600 DPI scans
> > painful and performing OCR an entirely valid excuse for a long coffee
> > break.
> >
> > As you might expect, I'm looking for a way to speed up this process at
> > the OCR end of things, since the scanning can only move so quickly. I'm
> > wondering if any of you have experience with any open OCR solutions
> > such as Tesseract-OCR <http://code.google.com/p/tesseract-ocr/> or
> > ocropus <http://code.google.com/p/ocropus/>.
> > At a glance, Tesseract seems to be further along in development. Any
> > other suggestions on how best to approach this sort of task would be
> > appreciated if you've done similar work.
> >
> > I've got my own Ubuntu Server that I'm planning on evaluating one or
> > both of these on, as much for my own interest as the project's or the
> > organization's. Since I'm an unpaid part-time intern and the only one
> > who's working on this project, I'm willing to learn to do things the
> > hard way so they're easier in the long run.
> >
> > Thanks for any suggestions or advice you may be able to offer.
> >
> > --
> > ~Andrew M. Kelly
> > MLIS Degree Candidate, Simmons GSLIS 2011
> > Archives & Librarianship Intern, Boston University: African Presidential
> > Archive & Research Center
> > Evening Library Assistant, Bay State College
> > twitter: @a_m_kelly
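
For anyone following the thread, here is a minimal sketch of the "write something similar in your scripting language of choice" idea, in Python. It is not taken from pdfocr. It assumes the tesseract command-line tool is installed and on the PATH, and that the scans are saved as individual image files in one directory. The function names, the list of file extensions and the directory layout are all made up for illustration. Some older Tesseract releases read only TIFF, so check what your installed version accepts.

```python
import subprocess
from pathlib import Path

# File extensions treated as scanned pages (an assumption; adjust to your scans).
SCAN_EXTENSIONS = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def find_scans(scan_dir):
    """Return the scanned page images in scan_dir, sorted by name."""
    return sorted(
        p for p in Path(scan_dir).iterdir()
        if p.is_file() and p.suffix.lower() in SCAN_EXTENSIONS
    )


def build_command(image_path, out_dir, lang="eng"):
    """Build the call: tesseract <image> <output base> -l <lang>.

    Tesseract adds the ".txt" extension to the output base itself.
    """
    out_base = Path(out_dir) / Path(image_path).stem
    return ["tesseract", str(image_path), str(out_base), "-l", lang]


def ocr_directory(scan_dir, out_dir, lang="eng"):
    """OCR every scan in scan_dir and return the images that failed."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for image in find_scans(scan_dir):
        # One tesseract process per page; a non-zero exit code marks a failure.
        result = subprocess.run(build_command(image, out_dir, lang))
        if result.returncode != 0:
            failed.append(image)
    return failed
```

Calling ocr_directory("scans", "text") would write one text file per page into the text directory and return a list of any pages that tesseract could not process. Running it unattended on the Ubuntu server, rather than on the Pentium 4 workstations, keeps the scanning machines free while the OCR runs.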