Erica:
I've used Paperwork (https://openpaper.work/en/) in the past with good
results. It's open source and runs on Linux and Windows. If you'd be
interested in running a web application you might give some of these
options a look (https://github.com/kba/awesome-ocr#ocr-gui) or maybe
even look into a document management web application
(https://github.com/awesome-selfhosted/awesome-selfhosted#document-management)
though the latter might be overkill for your use case.
Finally, if you're running a Mac somewhere and have money to spend, I
cannot overstate how much I love DevonThink
(https://www.devontechnologies.com/apps/devonthink) which has a server
version and uses ABBYY on the backend. My quick test this morning
suggests it doesn't have the issue you're describing.
best,
ak
--
ander kierig
Web Application Developer
University of Minnesota Libraries
[lib.umn.edu](https://www.lib.umn.edu)
they/them
On 2022-08-05 at 18:12 (-0500) Erica FINDLEY wrote:
> All,
>
> ABBYY has been a favorite program of mine for transforming batches of TIFF
> files into a PDF and extracting the text.
>
> However, I have recently run into this known issue
> <https://support.abbyy.com/hc/en-us/articles/360013874239-Each-page-is-duplicated-with-the-thumbnail-image-while-converting-TIFF-to-PDF-in-FineReader> even
> though each TIFF file is the same resolution.
>
> I opened a support ticket with ABBYY and their proposed resolution is for
> me to convert to another format (jpg) then to pdf. I do not like this for
> two reasons: 1) it is time and resource consuming to do two transformations
> and 2) there is some image quality loss when doing this.
>
>
> This leaves me with two questions:
>
> 1. Has anyone been able to find a better workaround for this issue?
>
> 2. Does anyone have recommendations for another GUI-based OCR program? My
> quick research is pointing to Tesseract, but since I work with volunteers
> I'd prefer a GUI-based solution.
>
> Thanks!
>
> *Erica Findley (she/her)*
> *Systems & Metadata Librarian*
> *x80591*
> Multnomah County Library
> Isom Operations Center: Thu 8 am - 5 pm, Fri 1:30 pm - 5:30 pm
> Teleworking: Mon - Wed 8 am - 5 pm, Fri 8 am - 12 pm
> multcolib.org <http://www.multcolib.org/>
> My pronouns are she/her/hers