Kevin,

This may or may not be what you are looking for, and it is definitely a solution only an engineer would think of.

In JavaScript/jQuery, an image can be loaded into an HTML canvas element (maybe a .pdf also, not sure) and displayed inside the browser (it can be saved if needed). Using Google's Cloud Vision API, the text can then be pulled from the image that is being displayed in the browser. My experience has been that the OCR results are very, very good.

This may or may not be a solution.

Brent Fergsuon, MLS
Librarian, Elkhart Public Library
https://myepl.org

________________________________
From: Code for Libraries on behalf of Kevin Schlottmann
Sent: Tuesday, June 16, 2020 12:48 PM
Subject: [CODE4LIB] Zonal OCR for catalog cards

Hi all,

As we get deeper into our work-from-home projects, we are getting to collections that were richly described using catalog cards, long before computerized systems for discovery were adopted. Our card scanner allows us to quickly convert these cards to PDF, but rather than copying and pasting the text, I'm hoping to go a step further and get structured data off of them.

I'm wondering if anyone here has ever leveraged zonal OCR, such as the kind used for business cards or invoices, to break out OCRed data in catalog cards. I did a quick Google search and a search of the archives here, but didn't see anything right off the bat. I think the basic tools for throwing something together are all there, but I'm hoping someone has already explored this and stitched something together.

Kevin

---
Kevin Schlottmann
Head of Archives Processing
Rare Book & Manuscript Library
Butler Library, Room 801
Columbia University
535 W. 114th St., New York, NY 10027
(212) 854-8483
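
For anyone who wants to try the canvas-plus-Vision approach from the reply at the top of this thread, here is a rough JavaScript sketch, not a tested workflow. It assumes the card image is already drawn on a canvas, that you have your own Google Cloud API key, and that the cards share a consistent layout. The zone names and pixel coordinates (`callNumber`, `mainEntry`, `body`) are made up for illustration and would need to be measured from real scans. The "zonal" part is simply sorting each OCRed word into a zone by the center of its bounding box.

```javascript
// Get the base64 payload of whatever is drawn on a canvas (browser only).
function canvasToBase64(canvas) {
  // toDataURL returns "data:image/png;base64,<payload>"; keep the payload.
  return canvas.toDataURL("image/png").split(",")[1];
}

// Build the JSON body for the Vision REST method images:annotate.
function buildVisionRequest(base64Image) {
  return {
    requests: [
      {
        image: { content: base64Image },
        features: [{ type: "DOCUMENT_TEXT_DETECTION" }],
      },
    ],
  };
}

// Send the image for OCR. apiKey is your own Google Cloud API key.
async function annotate(base64Image, apiKey) {
  const url =
    "https://vision.googleapis.com/v1/images:annotate?key=" +
    encodeURIComponent(apiKey);
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildVisionRequest(base64Image)),
  });
  if (!res.ok) throw new Error("Vision request failed: " + res.status);
  const data = await res.json();
  // textAnnotations[0] is the full text; the rest are individual words.
  return (data.responses[0].textAnnotations || []).slice(1);
}

// Hypothetical zones, in pixels, for one card layout (assumption).
const ZONES = {
  callNumber: { x0: 0, y0: 0, x1: 200, y1: 150 },
  mainEntry: { x0: 200, y0: 0, x1: 1000, y1: 150 },
  body: { x0: 0, y0: 150, x1: 1000, y1: 600 },
};

// Center point of a word's bounding polygon.
function center(annotation) {
  const vs = annotation.boundingPoly.vertices;
  // Treat a missing coordinate as 0.
  const sumX = vs.reduce((s, v) => s + (v.x || 0), 0);
  const sumY = vs.reduce((s, v) => s + (v.y || 0), 0);
  return { x: sumX / vs.length, y: sumY / vs.length };
}

// Put each word into the first zone containing its center, keeping the
// order in which the words were returned. Words outside every zone are dropped.
function assignToZones(words, zones) {
  const buckets = {};
  for (const name of Object.keys(zones)) buckets[name] = [];
  for (const w of words) {
    const c = center(w);
    for (const [name, z] of Object.entries(zones)) {
      if (c.x >= z.x0 && c.x < z.x1 && c.y >= z.y0 && c.y < z.y1) {
        buckets[name].push(w.description);
        break;
      }
    }
  }
  const result = {};
  for (const name of Object.keys(buckets)) result[name] = buckets[name].join(" ");
  return result;
}
```

In a browser this would be wired together roughly as `assignToZones(await annotate(canvasToBase64(canvas), apiKey), ZONES)`. Cards that are skewed or that use a different layout would need their own zone definitions, so this is a starting point rather than a finished tool.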