On Oct 11, 2013, at 1:49 PM, Matthew Sherman wrote:
>> For a limited period of time I am making publicly available a Web-based
>> program called PDF2TXT -- http://bit.ly/1bJRyh8
>
> Very slick, good work. I can see where this tool can be very helpful. It
> does have some issues with some characters, but this is rather common with
> most systems.
Again, thank you for the support. Yes, there are some escaping issues to be resolved. "Release early. Release often." I need help with the graphic design in general.
Here's an enhancement I thought of:
1. allow readers to authenticate
2. allow readers to upload documents
3. documents get saved in readers' cache
4. allow interface to list documents in the cache
5. provide text mining services against reader-selected documents
6. go to Step #1
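
The loop above could be sketched roughly as follows. This is only an illustration, not PDF2TXT code: the ReaderCache class and its method names are hypothetical, the cache lives in memory, authentication (Step #1) is assumed to have happened elsewhere so that `reader` is an already-verified identifier, and simple word counting stands in for the text mining services.

```python
import re
from collections import Counter


class ReaderCache:
    """Hypothetical per-reader document cache (in memory only)."""

    def __init__(self):
        # reader identifier -> {document name -> extracted plain text}
        self._docs = {}

    def upload(self, reader, name, text):
        """Steps 2-3: save an uploaded document in the reader's cache."""
        self._docs.setdefault(reader, {})[name] = text

    def list_documents(self, reader):
        """Step 4: list the names of the documents in the reader's cache."""
        return sorted(self._docs.get(reader, {}))

    def word_frequencies(self, reader, names, top=5):
        """Step 5: a stand-in text mining service, counting the most
        common words across the reader-selected documents."""
        counts = Counter()
        for name in names:
            text = self._docs[reader][name]
            counts.update(re.findall(r"[a-z]+", text.lower()))
        return counts.most_common(top)
```

A real service would persist the cache to disk or a database and tie it to whatever authentication scheme gets chosen.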
It would also be cool if I could figure out how to finish the installation of Tesseract to enable OCRing. [1]
[1] OCRing - http://serials.infomotions.com/code4lib/archive/2013/201303/1554.html
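
Once Tesseract is installed, calling it from a script might look something like the sketch below. It assumes the `tesseract` command-line program is on the PATH and that each page has already been converted to an image, since Tesseract reads images rather than PDF files. The function names are hypothetical.

```python
import shutil
import subprocess


def tesseract_command(image_path, language="eng"):
    """Build the command line; the output base "stdout" tells
    tesseract to print the recognized text instead of writing a file."""
    return ["tesseract", image_path, "stdout", "-l", language]


def ocr_image(image_path, language="eng"):
    """Run tesseract on one page image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on the PATH")
    result = subprocess.run(
        tesseract_command(image_path, language),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```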
--
Eric Morgan