Bill,

You have hit the nail on the head!!!!!  This is EXACTLY what I am trying to do!  It's the underlying stuff that I am trying to get at.  Looking at RDF may yield some good ideas, but I am not thinking in terms of RDF, XML, triples, MARC, standards, or any of that stuff that gets thrown around here.  Even the Internet is not terribly necessary.  I am thinking in terms of data structures, pointers, sparse matrices, relationships between objects and yes, set theory too -- things like that.  The former is pretty much cruft that lies upon the latter, and it mostly just gets in the way.  Noise, as you put it, Bill!
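
To make that a little more concrete, here is a rough sketch of the kind of thing I mean.  It is purely illustrative and not the design I will end up with: the class name, the method names, the integer ids and the "wrote" relationship are all made up for this example.  Objects are just ids, each kind of relationship is a sparse matrix stored as a row id mapped to a set of column ids, and a question like "what do these two have in common?" is plain set intersection.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: objects are plain integer ids, and each relationship
// type is a sparse boolean matrix stored as "row id -> set of column ids".
class SparseRelations {
    // relation name -> (subject id -> set of related object ids)
    private final Map<String, Map<Integer, Set<Integer>>> rows = new HashMap<>();

    // Record that `from` stands in relation `name` to `to`.
    public void relate(String name, int from, int to) {
        rows.computeIfAbsent(name, k -> new HashMap<>())
            .computeIfAbsent(from, k -> new TreeSet<>())
            .add(to);
    }

    // All ids that `from` is related to under `name` (empty set if none).
    public Set<Integer> targets(String name, int from) {
        return rows.getOrDefault(name, Collections.emptyMap())
                   .getOrDefault(from, Collections.emptySet());
    }

    // Set intersection: ids related to both `a` and `b` under `name`.
    public Set<Integer> common(String name, int a, int b) {
        Set<Integer> result = new TreeSet<>(targets(name, a));
        result.retainAll(targets(name, b));
        return result;
    }

    public static void main(String[] args) {
        SparseRelations r = new SparseRelations();
        // Hypothetical ids: 1 and 2 are people, 10 through 12 are works.
        r.relate("wrote", 1, 10);
        r.relate("wrote", 1, 11);
        r.relate("wrote", 2, 11);
        r.relate("wrote", 2, 12);
        System.out.println(r.common("wrote", 1, 2)); // prints [11]
    }
}
```

Only the cells that are actually filled in take up any space, which is the whole point of a sparse matrix, and nothing in it knows or cares about XML, MARC or any other format.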

A big problem here is that Libraryland has a bad habit of getting itself lost in the details and going off on all kinds of tangents.  As I said before, the biggest prison is between the ears!!!!  Throw out all that junk in there and just start over!  When I begin programming this thing, my only tools will be a programming language (C or Java), a text editor (vi), and my head.  But before I really start on that, I am writing a paper that explains how this stuff works at a very low level.  It's mostly an effort to get my thoughts down clearly, but I will share a draft of it with y'all on here soon.

Peter Schlumpf


-----Original Message-----
>From: Bill Dueber <[log in to unmask]>
>Sent: Apr 9, 2009 10:37 PM
>To: [log in to unmask]
>Subject: Re: [CODE4LIB] Something completely different
>
>On Thu, Apr 9, 2009 at 10:26 AM, Mike Taylor <[log in to unmask]> wrote:
>
>> I'm not sure what to make of this except to say that Yet Another XML
>> Bibliographic Format is NOT the answer!
>>
>
>I recognize that you're being flippant, and yet think there's an important
>nugget in here.
>
>When you say it that way, it makes it sound as if folks are debating the
>finer points of OAI-MARC vs MARC-XML -- that it's simply syntactic sugar
>(although I'm certainly one to argue for the importance of syntactic sugar)
>over the top of what we already have.
>
>What's actually being discussed, of course, is the underlying data model.
>E-R pairs primarily analyzed by set theory, triples forming directed graphs,
>whether or not links between data elements can themselves have attributes --
>these are all possible characteristics of the fundamental underpinning of a
>data model to describe the data we're concerned with.
>
>The fact that they all have common XML representations is noise, and
>referencing the currently-most-common xml schema for these things is just
>convenient shorthand in a community that understands the exemplars. The fact
>that many in the library community don't understand that syntax is not the
>same as a data model is how we ended up with RDA.  (Mike: I don't know your
>stuff, but I seriously doubt you're among that group. I'm talkin' in
>general, here.)
>
>Bibliographic data is astoundingly complex, and I believe wholeheartedly
>that modeling it sufficiently is a very, very hard task. But no matter the
>underlying model, we should still insist on starting with the basics that
>computer science folks have been using for decades now: uids  (and, these
>days, guids) for the important attributes, separation of data and display,
>definition of sufficient data types and reuse of those types whenever
>possible, separation of identity and value, full normalization of data, zero
>ambiguity in the relationship diagram as a fundamental tenet, and a rigorous
>mathematical model to describe how it all fits together.
>
>This is hard stuff. But it's worth doing right.
>
>
>
>
>-- 
>Bill Dueber
>Library Systems Programmer
>University of Michigan Library