ABBYY's engine is pretty good, though, depending on whether you've already
scanned the text, you might get higher throughput by having the OCR
performed at each scanning station.
I'm not sure whether the non-server software is multi-core/multi-processor
aware; the version used in the drivers for the ScanSnap S1500 is one
revision behind the current release.
Depending on your budget, you might also want to take a look at Kofax. Some
high-end bulk scanners come with low-end versions of Kofax, but it can be
very nice, especially if you are acquiring documents that have stereotyped
layouts, since it can be trained to pick out metadata and to distinguish
between document types.
Simon
Also, Fujitsu document scanners ftw.
On Nov 4, 2011 11:38 AM, "Michael Della Bitta" wrote:
> Hello, everyone,
>
> NYPL is currently investigating OCR solutions and I was wondering if
> anybody had any opinions. Currently toward the top of the pile are
> Tesseract and ABBYY Recognition Server, each for different reasons, so
> I'd appreciate hearing about anybody's experiences with those two, but
> any information you might be able to provide would be most helpful.
>
> Thanks,
>
> Michael Della Bitta
>
> Senior Applications Developer
> Information Technology Group
> The New York Public Library
> 40 West 20th Street, 5th Floor
> New York, NY 10011-4211
> (212) 621-0609
>