On Oct 14, 2013, at 4:49 PM, Robert Haschart <[log in to unmask]> wrote:
>> For a limited period of time I am making publicly available a Web-based program called PDF2TXT -- http://bit.ly/1bJRyh8
>
> Although, based on some subsequent messages where you mention Tesseract,
> maybe I misunderstood and your tool only handles PDFs that have already
> been OCR'ed, which would explain why the second document (which only
> contains page images) fails.
Robert, that's correct. As of right now the document needs to have been previously OCRed. --Eric