Thanks for the Internet Archive pointer. I hadn't thought of it (probably
because of a few past unsuccessful attempts to find archived pages).
I tried BadgerFish (
http://libx.lib.vt.edu/services/code4lib/lccnrelay3/2004022563 which proxies
lccn.loc.gov's MARCXML), and it meets the requirement of faithfully
reproducing the XML, albeit in a very verbose way that makes no attempt at
minimization.
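
To make the verbosity concrete, here is a minimal sketch of a BadgerFish-style
conversion. It is not the code behind the relay above: the function names and
the sample fragment are made up for illustration, and namespace handling
(BadgerFish's "@xmlns" property) is left out for brevity.

```python
import json
import xml.etree.ElementTree as ET


def badgerfish(elem):
    """Convert an Element to a BadgerFish-style dict (namespaces ignored)."""
    # Attributes become "@name" properties.
    node = {"@" + name: value for name, value in elem.attrib.items()}
    # Text content goes into the "$" property.
    text = (elem.text or "").strip()
    if text:
        node["$"] = text
    for child in elem:
        converted = badgerfish(child)
        if child.tag in node:
            # Second or later sibling with the same name: promote to a list.
            existing = node[child.tag]
            if isinstance(existing, list):
                existing.append(converted)
            else:
                node[child.tag] = [existing, converted]
        else:
            node[child.tag] = converted
    return node


def to_badgerfish(xml_text):
    """Parse an XML string and wrap the result under the root element name."""
    root = ET.fromstring(xml_text)
    return {root.tag: badgerfish(root)}


# Hypothetical MARCXML-like fragment, namespaces omitted.
SAMPLE = (
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">A title</subfield>'
    '</datafield>'
)

print(json.dumps(to_badgerfish(SAMPLE), indent=2))
```

Every attribute and every bit of text gets its own "@..." or "$" property,
which is where the redundancy comes from.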
That indeed leaves two independent problems:
a) A free converter to GData's JSON format, or to another convention less
redundant than BadgerFish. At first glance, I'm wondering whether this is
even possible to implement without knowing the schema of the XML document.
The GData convention says, for instance, to use arrays [] for elements that
may occur more than once, which a single instance document doesn't reveal.
b) something MARC-specific to express MARC records in JSON. I talked to
Nathan Trail from LOC at code4lib, and they're revamping their lccn server
this year to scale up and also serve more formats. Presumably, this effort
could lead to a de facto standard for serving MARC in JSON.
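
The array question in a) can be sketched as follows. This is only an
illustration under my own assumptions: the convert() function, its
"repeatable" parameter, and the two sample fragments are hypothetical, and the
output is only loosely modeled on a GData-like convention (attributes as plain
properties, text under "$t"), not a faithful implementation of it.

```python
import xml.etree.ElementTree as ET


def convert(elem, repeatable=frozenset()):
    """Convert an Element to a dict; names in `repeatable` always map to lists.

    Simplification: assumes attribute names and child element names
    do not clash within one element.
    """
    node = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        node["$t"] = text
    for child in elem:
        value = convert(child, repeatable)
        if child.tag in repeatable:
            # Schema knowledge: always emit an array, even for one occurrence.
            node.setdefault(child.tag, []).append(value)
        elif child.tag in node:
            # No schema knowledge: switch to an array only once a repeat shows up.
            existing = node[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                node[child.tag] = [existing, value]
        else:
            node[child.tag] = value
    return node


ONE = '<datafield tag="650"><subfield code="a">Libraries</subfield></datafield>'
TWO = (
    '<datafield tag="650"><subfield code="a">Libraries</subfield>'
    '<subfield code="x">Automation</subfield></datafield>'
)

naive_one = convert(ET.fromstring(ONE))
naive_two = convert(ET.fromstring(TWO))
hinted_one = convert(ET.fromstring(ONE), repeatable={"subfield"})

print(type(naive_one["subfield"]).__name__)   # shape depends on the instance
print(type(naive_two["subfield"]).__name__)
print(type(hinted_one["subfield"]).__name__)  # shape fixed by the hint
```

Without the hint, a client sees an object for one record and an array for the
next; with it, the shape is stable, but the hint is exactly the schema
knowledge a generic converter doesn't have.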
Thinking out loud about this for a minute, I'm wondering whether part a) is
really a worthwhile goal. Aside from impromptu prototyping of an
XML-to-JSON gateway, I don't see any production use for a schema-agnostic
XML-to-JSON converter, for performance reasons alone.
- Godmar