I have used Microsoft Office Document Imaging, which works really well with TIFF files. Most scanners, if not all, will scan to TIFF, which you can then easily convert to text, RTF or Word files.
The other one I used was Pro Millennium, which is compatible with MS Word, Excel, etc.
I would highly recommend both of them.


-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Emmanuel Di Pretoro
Sent: Tuesday, 3 February 2009 7:54 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] "best" OCR package?


This isn't a recommendation, since I have never tried it, but I've heard a lot of good things about Tesseract. It is currently developed by Google, but I don't know whether they use it.

Some link :

Hope this helps,

Emmanuel Di Pretoro

2009/2/3 Alberto Accomazzi <[log in to unmask]>

> Sorry if this is a bit off-topic, but I was wondering if any of you
> clever fellows have a recommendation for an OCR package, possibly with
> a native Linux port.  I know about OCRopus but I have a feeling that
> commercial products still have a significant edge over public domain
> packages.  So what are you using and/or do you know what the big guys
> (Google, IA, Microsoft) are using?
> Thanks,
> -- Alberto
> --
> Dr. Alberto Accomazzi                  aaccomazzi(at)cfa harvard edu
> Project Manager
> NASA Astrophysics Data System              
> Harvard-Smithsonian Center for Astrophysics
> 60 Garden St, MS 67, Cambridge, MA 02138, USA
