Sounds like it is moot now, but I would have tried to determine whether
the catalog supported Z39.50 and, if so, pulled the MARC records that
way. Ex Libris theoretically supports the protocol, but I suppose they
might charge to open up the port.
https://developers.exlibrisgroup.com/alma/integrations/z39-50/
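If the port had been open, the pull itself is not much code. Here is a
minimal sketch of what I would have tried, assuming yaz-client (from the
YAZ toolkit) and pymarc are installed; the host, port, database, and
record number below are all made up:

  # Rough sketch: drive yaz-client from Python and read the raw dump
  # with pymarc. TARGET is a placeholder; the real endpoint and
  # database name would have to come from Ex Libris.
  import subprocess
  from pymarc import MARCReader

  TARGET = "tcp:alma.example.edu:9991/MY_INST"  # placeholder endpoint

  def fetch(query, out="records.mrc", hits=100):
      # yaz-client -m appends every raw record it retrieves to `out`
      commands = f"find {query}\nshow 1+{hits}\nquit\n"
      subprocess.run(["yaz-client", "-m", out, TARGET],
                     input=commands, text=True, check=True)

  # Bib-1 attribute 1=12 is the local record number on many servers
  fetch('@attr 1=12 "990012345670101"')

  with open("records.mrc", "rb") as fh:
      for record in MARCReader(fh):
          print(record["245"])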
In a pinch, you might be able to use another library's Z39.50 endpoint for
the source data. Even records for the educational kits could probably be
found at another library, especially if you knew of one supporting an
education program with a hands-on focus. The Library of Congress still
maintains a web page of Z39.50 resources, including hosts available for
testing:
https://www.loc.gov/z3950/agency/resources/
https://www.loc.gov/z3950/agency/resources/testport.html
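As a smoke test, the same two commands can be pointed at LC first (the
host and database here are from memory, so verify them against the
testport page above):

  # Same pattern against LC's public test server; check the testport
  # page for the current host/port/database before relying on it.
  import subprocess

  cmds = 'find @attr 1=4 "library automation"\nshow 1+3\nquit\n'  # 1=4: title
  subprocess.run(["yaz-client", "-m", "lc_test.mrc",
                  "tcp:z3950.loc.gov:7090/voyager"],
                 input=cmds, text=True, check=True)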
Some years ago, I was interviewing for a systems librarian position
(coming from a position as a science librarian), and the library I was
interviewing with suggested a topic covering mobile catalogs, which were
becoming a hot-button item for library administrators. For my
presentation to the library staff and faculty, I wrote an extremely ad
hoc web interface in PHP that simulated a mobile catalog, pulling their
own data via Z39.50. As I recall, Z39.50, at least with that catalog
backend, retrieved the MARC records but not the holdings information
linked to them, so the displayed call numbers sometimes did not reflect
local cataloging, or were missing entirely when no call number was
present in the MARC record. Still, it was good enough that it didn't
disgrace me.
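For anyone fighting that same limitation today: some servers will return
holdings if you ask for the OPAC record syntax instead of plain MARC,
though support varies by backend, and the endpoint below is a stand-in:

  # Request the OPAC record syntax, which bundles holdings with the bib
  # on servers that support it (many don't, matching my experience).
  import subprocess

  cmds = 'format opac\nfind @attr 1=4 "education"\nshow 1+3\nquit\n'
  subprocess.run(["yaz-client", "tcp:opac.example.edu:210/DB"],
                 input=cmds, text=True, check=True)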
Tom
On Tue, Jul 22, 2025 at 10:55 AM Hammer, Erich F <[log in to unmask]> wrote:
> It's a not-very-interesting story of disorganization, poor communication,
> too few employees and a touch of corporate greed:
>
> A nearby small college shuttered. Our University decided to try to scoop
> up the well-regarded early-education program and snag the former library's
> unique collection of educational "kits". The former site was scheduled for
> deletion in short order, and Ex Libris essentially tried to extort us for
> an astronomical amount to give us the records. Nobody thought to ask our
> sole developer (who may have been able to scrape the records in a usable
> format) until they had just left for a 3-month parental leave, so someone
> assigned a student to manually bring up all the records and capture the
> information. Their solution was to generate PDFs of every page. The site
> and data are no more at this point, so we have what we have.
>
> The PDFs were generated with a text layer, not OCR'd (as I originally
> suggested), so the text is accurate. However, the strings are broken up,
> and of course, PDF readers don't know how the text "fits" together. Thus,
> selected text comes out in columns, but the columns don't line up due to
> wrapping. It's a mess.
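Since the text layer is intact, pulling each word with its coordinates
and regrouping by row usually beats copying and pasting. A rough sketch
with PyMuPDF; the filename and the two-point row bucket are guesses to
adjust:

  # Rough sketch: regroup column-fragmented PDF text into visual rows
  # using word coordinates from PyMuPDF.
  import fitz  # PyMuPDF

  doc = fitz.open("records.pdf")  # placeholder filename
  for page in doc:
      rows = {}
      # each word is (x0, y0, x1, y1, text, block, line, word_no)
      for x0, y0, x1, y1, word, *_ in page.get_text("words"):
          rows.setdefault(round(y0 / 2), []).append((x0, word))
      for y in sorted(rows):
          print(" ".join(w for _, w in sorted(rows[y])))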
>
> Erich
>
>
> On Monday, July 21, 2025 at 21:48, Kyle Banerjee eloquently inscribed:
>
> > On Mon, Jul 21, 2025 at 12:20 PM Hammer, Erich F <[log in to unmask]>
> > wrote:
> >
> >> Without going into details, we inherited a sizeable collection of
> >> physical
> >> materials from another library, and were only able to capture the unique
> >> MARC records in image (PDF) form.
> >
> > The details provide the parameters for the easiest/best methods (and
> > it's hard to imagine there's not a good story behind getting stuck with
> > images of records without actually having records). I assume there's a
> > reason you don't just do the conversion in Acrobat or use one of the
> > many utilities or services.
> >
> > A true OCR process is likely to be error-prone; I'd be concerned about
> > positional data and encoding issues even if the other stuff is right.
> > Parsing for identifiers and downloading actual MARC records might prove
> > faster and more reliable if these aren't local only.
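That identifier route pairs naturally with the Z39.50 harvest above:
scrape whatever identifiers survive in the extracted text, then pull
clean records for each. A rough sketch; the patterns are guesses about
what actually appears in these PDFs:

  # Rough sketch: harvest candidate ISBNs and OCLC numbers from the
  # extracted text to feed the Z39.50 fetches sketched earlier.
  import re

  ISBN = re.compile(r"\b(?:97[89][ -]?)?(?:\d[ -]?){9}[\dXx]\b")
  OCLC = re.compile(r"\(OCoLC\)\s*0*(\d+)")

  def harvest_ids(text):
      isbns = {re.sub(r"[ -]", "", m) for m in ISBN.findall(text)}
      return isbns, set(OCLC.findall(text))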
> >
> > kyle
>
>
>