Alex,

I think the problem is data like this:

http://lccn.loc.gov/96516389/marcxml

And while we can probably figure out a pattern to get the semantics
out of this record, there is no telling how many other variations
exist within our collections.

So we've got lots of this data that is both hard to parse and,
frankly, hard to find (since it has practically zero machine-readable
data in the fields we actually use), and it needs to coexist with some
newer, semantically richer format.
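
A minimal sketch of how one might put a number on "practically zero
machine-readable data", assuming that 5XX note fields stand in for the
free-text data and everything else for the fields we actually use.
The sample record is invented for illustration (it is not the content
of the LCCN record above), and chars_by_tag_group is a hypothetical
helper name. Only the MARCXML namespace and element names come from
the format itself.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by MARCXML ("MARC21 slim") documents.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, NOT the actual record linked above.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000cgm a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="245" ind1="0" ind2="4">
    <subfield code="a">The example film</subfield>
  </datafield>
  <datafield tag="508" ind1=" " ind2=" ">
    <subfield code="a">Director, A. Person ; producer, B. Person.</subfield>
  </datafield>
  <datafield tag="511" ind1="1" ind2=" ">
    <subfield code="a">C. Person, D. Person.</subfield>
  </datafield>
</record>"""


def chars_by_tag_group(marcxml: str) -> Counter:
    """Count subfield characters in 5XX notes ('note') vs. all other
    data fields ('other'). The grouping is an assumption, not a rule."""
    root = ET.fromstring(marcxml)
    totals = Counter()
    for field in root.iter("{http://www.loc.gov/MARC21/slim}datafield"):
        tag = field.get("tag", "")
        group = "note" if tag.startswith("5") else "other"
        for sub in field.findall("marc:subfield", MARC_NS):
            totals[group] += len(sub.text or "")
    return totals


if __name__ == "__main__":
    totals = chars_by_tag_group(SAMPLE)
    print(dict(totals))
```

Run over a whole collection, a tally like this would only give a crude
first estimate of how much data is locked up in free-text notes; it
says nothing about how many different patterns those notes follow.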

What I'm saying is that the library's legacy data problem is almost to
the point of being existential.  This is certainly a detriment to
forward progress.

Analogously (although at a much smaller scale), my wife and I have
been trying for about 2 years to move our checking account from our
out-of-state bank to something local.  The problem is that we have
built up a lot of infrastructure around our old bank (direct deposit,
lots of automatic bill pay, etc.).  Migration would not only be time
consuming; any mistakes could be quite expensive, and we have a lot
of uncertainty about how long it would actually take (and how that
might affect the flow of payments, etc.).  To date, it's been easier
for us just to drive across the state line (despite the fact that
it's out of our way to anywhere) than to actually deal with it.  In
the meantime, more automatic bill payments and whatnot have been set
up, making our eventual migration that much more difficult.

I do think it would be useful to figure out what exactly in our legacy
data is found only in libraries (that is, we could ditch this shoddy
"The Last Waltz" record and pull the data from LinkedMDB or Freebase
or somewhere) and determine the scale of the problem that only we can
address, but even just this environmental scan is a fairly large
undertaking.

-Ross.

On Mon, Oct 25, 2010 at 10:10 PM, Alexander Johannesen
<[log in to unmask]> wrote:
> On Tue, Oct 26, 2010 at 12:48 PM, Bill Dueber <[log in to unmask]> wrote:
>> Here, I think you're guilty of radically underestimating "lots of people
>> around the library world." No one thinks MARC is a good solution to
>> our modern problems, and no one who actually knows what MARC
>> is has trouble understanding MARC-XML as an XML serialization of
>> the same old data -- certainly not anyone capable of meaningful
>> contribution to work on an alternative.
>
> Slow down, Tex. "Lots of people in the library world" is not the same
> as developers, or even good developers, or even good XML developers,
> or even good XML developers who know what the document model imposes
> on a data-centric approach.
>
>> The problem we're dealing with is *hard*. Mind-numbingly hard.
>
> This is no justification for not doing things better. (And I'd love to
> know what the hard bits are; always interesting to hear from various
> people as to what they think are the *real* problems of library
> problems, as opposed to any other problem they have)
>
>> The library world has several generations of infrastructure built
>> around MARC (by which I mean AACR2), and devising data
>> structures and standards that are a big enough improvement over
>>  MARC to warrant replacing all that infrastructure is an engineering
>>  and political nightmare.
>
> Political? For sure. Engineering? Not so much. This is just that whole
> "blinded by MARC" issue that keeps cropping up from time to time, and
> rightly so; it is truly a beast - at least the way we have come to
> know it through AACR2 and all its friends and its death-defying focus
> on all things bibliographic - that has paralyzed library innovation,
> probably to the point of making libraries almost irrelevant to the
> world.
>
>> I'm happy to take potshots at the RDA stuff from the sidelines, but I never
>> forget that I'm on the sidelines, and that the people active in the game are
>> among the best and brightest we have to offer, working on a problem that
>>  invariably seems more intractable the deeper in you go.
>
> Well, that's a pretty scary sentence, for all sorts of reasons, but I
> think I shall not go there.
>
>> If you think MARC-XML is some sort of an actual problem
>
> What, because you don't agree with me the problem doesn't exist? :)
>
>> and that people
>> just need to be shouted at to realize that and do something about it, then,
>> well, I think you're just plain wrong.
>
> Fair enough, although you seem to be under the assumption that all of
> the stuff I'm saying is a figment of my imagination (I've been
> involved in several projects lambasted because managers think MARCXML
> is solving some imaginary problem; this is not bullshit, but pain and
> suffering from the battlefields of library development), that I'm not
> one of those developers (or one of you, although judging from this
> discussion it's clear that I am not), that the things I say somehow
> don't apply because you don't agree with, umm, what I'm assuming is
> my somewhat direct approach to stating my heretic opinions.
>
>
> Alex
> --
>  Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
> --- http://shelter.nu/blog/ ----------------------------------------------
> ------------------ http://www.google.com/profiles/alexander.johannesen ---
>