Definitely worth checking out
http://docmorph1.nlm.nih.gov/docmorph/mymorph.htm as it's free and
government sponsored, which is another way of saying you've already paid
for it once... :) I'm also a fan of Adobe Acrobat's OCR and optimizer.
On 11/4/11 2:25 PM, "Simon Spero" wrote:
>ABBYY's engine is pretty good, though depending on whether you've already
>scanned the text you might end up with higher throughput by having the OCR
>performed at each scanning station.
>I'm not sure if the non-server software is multi-core/multi-processor
>aware; the version that is used in the drivers for the ScanSnap S1500 is a
>rev down from current.
>Depending on your budget you might also want to take a look at Kofax;
>high end bulk scanners come with low end versions of Kofax, but it can be
>very nice, especially if you are acquiring documents that have stereotyped
>layouts, since it can be trained to pick out metadata, and to distinguish
>between document types.
>Also, Fujitsu document scanners ftw.
>On Nov 4, 2011 11:38 AM, "Michael Della Bitta" wrote:
>> Hello, everyone,
>> NYPL is currently investigating OCR solutions and I was wondering if
>> anybody had any opinions. Currently toward the top of the pile are
>> Tesseract and ABBYY Recognition Server, each for different reasons, so
>> I'd appreciate hearing about anybody's experiences with those two, but
>> any information you might be able to provide would be most helpful.
>> Michael Della Bitta
>> Senior Applications Developer
>> Information Technology Group
>> The New York Public Library
>> 40 West 20th Street, 5th Floor
>> New York, NY 10011-4211
>> (212) 621-0609