I mistakenly purchased ABBYY FineReader 11 Corporate, thinking that it would be capable of outputting ALTO XML. I was wrong. ABBYY FineReader Engine does. :/
Ultimately, I want to OCR some newspaper images and export the results to ALTO XML, and until the proof of concept is done, I want to do it on the cheap. My plan this morning was to write some scripts to OCR the images using Microsoft Office Document Imaging (MODI) and then export the results to ALTO XML, which could be a big project. Has anyone done this before, or does anyone know of a quick and dirty way to get some OCR data?
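
To give a sense of the export half of that plan, here is a rough Python sketch. It assumes the OCR step (MODI or anything else) can hand back each word's text and its bounding box in pixels. The function name `words_to_alto`, the tuple layout, and the choice of the ALTO 2.x namespace are assumptions made for illustration, and the output is not guaranteed to validate against the ALTO schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for ALTO 2.x; adjust if targeting another ALTO version.
ALTO_NS = "http://www.loc.gov/standards/alto/ns-v2#"


def words_to_alto(lines, page_width, page_height):
    """Build a minimal ALTO-style document for one page.

    lines: list of text lines, each a list of
           (text, left, top, width, height) tuples in pixels.
    Returns the XML as a string.
    """
    ET.register_namespace("", ALTO_NS)

    def q(tag):
        # Qualify a tag name with the ALTO namespace.
        return "{%s}%s" % (ALTO_NS, tag)

    alto = ET.Element(q("alto"))
    desc = ET.SubElement(alto, q("Description"))
    ET.SubElement(desc, q("MeasurementUnit")).text = "pixel"

    layout = ET.SubElement(alto, q("Layout"))
    page = ET.SubElement(layout, q("Page"), ID="P1",
                         WIDTH=str(page_width), HEIGHT=str(page_height))
    space = ET.SubElement(page, q("PrintSpace"))
    # Hypothetical simplification: every line goes into one text block.
    block = ET.SubElement(space, q("TextBlock"), ID="TB1")

    word_id = 0
    for words in lines:
        line = ET.SubElement(block, q("TextLine"))
        for text, left, top, width, height in words:
            word_id += 1
            ET.SubElement(line, q("String"), ID="S%d" % word_id,
                          CONTENT=text, HPOS=str(left), VPOS=str(top),
                          WIDTH=str(width), HEIGHT=str(height))

    return ET.tostring(alto, encoding="unicode")


if __name__ == "__main__":
    sample = [[("Hello", 10, 20, 50, 12), ("world", 70, 20, 48, 12)]]
    print(words_to_alto(sample, 800, 1200))
```

A real newspaper page would also need column and block segmentation, which is probably where most of the work in this project lies.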
Paul Smith's College