On Mon, Jul 21, 2025 at 12:20 PM Hammer, Erich F <[log in to unmask]> wrote:
> Without going into details, we inherited a sizeable collection of physical
> materials from another library, and were only able to capture the unique
> MARC records in image (PDF) form.
The details would determine the easiest/best methods (and it's
hard to imagine there's not a good story behind getting stuck with images
of records without actually having records). I assume there's a reason you
don't just do the conversion in Acrobat or use one of the many utilities or
services.
A true OCR process is likely to be error prone. Even if the rest of the
text comes through correctly, I'd be concerned about positional data (fixed
fields, indicators) and encoding issues. Parsing the images for identifiers
and downloading the actual MARC records might prove faster and more
reliable, provided these records aren't local-only.
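
As a rough illustration of the identifier approach, here is a minimal
Python sketch. It assumes the OCR text comes out one field per line, with a
three-digit tag and "$a" as the subfield delimiter; your PDFs may well look
different. The function names are made up for this example. It only pulls
out LCCNs (010), ISBNs (020), and OCLC numbers (035), and uses the ISBN
check digit as a cheap guard against OCR misreads. Fetching the records
themselves is left out, since that depends on what services you have
access to.

```python
import re

# Assumed OCR layout: "<3-digit tag> [indicators] $a value ..." on each line.
FIELD_RE = re.compile(r"^\s*(\d{3})\s+(.*)$")
SUBFIELD_A_RE = re.compile(r"\$a\s*([^$]+)")
OCLC_RE = re.compile(r"\(OCoLC\)\s*(?:oc[mn]|on)?0*(\d+)")


def isbn_is_valid(raw):
    """Verify the ISBN-10 or ISBN-13 check digit."""
    s = raw.replace("-", "").upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", s):
        total = sum((10 - i) * (10 if c == "X" else int(c))
                    for i, c in enumerate(s))
        return total % 11 == 0
    if re.fullmatch(r"[0-9]{13}", s):
        total = sum(int(c) * (1 if i % 2 == 0 else 3)
                    for i, c in enumerate(s))
        return total % 10 == 0
    return False


def extract_identifiers(ocr_text):
    """Collect identifiers from OCR'd MARC text; bad ISBNs go to 'suspect'."""
    found = {"lccn": [], "isbn": [], "oclc": [], "suspect": []}
    for line in ocr_text.splitlines():
        field = FIELD_RE.match(line)
        if not field:
            continue
        tag, body = field.groups()
        sub = SUBFIELD_A_RE.search(body)
        value = sub.group(1).strip() if sub else ""
        if not value:
            continue
        if tag == "010":
            found["lccn"].append("".join(value.split()))
        elif tag == "020":
            # The ISBN may be followed by a qualifier such as "(pbk.)".
            candidate = value.split()[0]
            key = "isbn" if isbn_is_valid(candidate) else "suspect"
            found[key].append(candidate)
        elif tag == "035":
            oclc = OCLC_RE.search(value)
            if oclc:
                found["oclc"].append(oclc.group(1))
    return found
```

Anything that lands in "suspect" is probably an OCR misread and worth a
human look before you search on it.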
kyle