LISTSERV 16.5 - CODE4LIB Archives

MARC records break parsing far too frequently. Apart from requiring no
truly specialized tools, MARCXML should—should!—eliminate many of
those problems. That's not to mention that MARC character sets vary a
lot (DanMARC anyone?), and more even in practice than in theory.

From my perspective the problem is simply that MARCXML isn't as
ubiquitous as MARC. For what we do, at least, there's no point. We'd
need to parse non-XML MARC data anyway. So if we're going to do it, we
might as well do it for everything.

Best,
Tim

On Mon, Oct 25, 2010 at 2:38 PM, Nate Vack <[log in to unmask]> wrote:
> Hi all,
>
> I've just spent the last couple of weeks delving into and decoding a
> binary file format. This, in turn, got me thinking about MARCXML.
>
> In a nutshell, it looks like it's supposed to contain the exact same
> data as a normal MARC record, except in XML form. As in, it should be
> round-trippable.
>
> What's the advantage to this? I can see using a human-readable format
> for poorly-documented file formats -- they're relatively easy to read
> and understand. But MARC is well, well-documented, with more than one
> free implementation in cursory searching. And once you know a binary
> file's format, it's no harder to parse than XML, and the data's
> smaller and processing faster.
>
> So... why the XML?
>
> Curious,
> -Nate
>



-- 
Check out my library at http://www.librarything.com/profile/timspalding