I completely agree with Karen regarding how FRBR falls short in not 
allowing for more relationships between Group 1-2 and Group 3 entities. 
FRBRoo fleshes out some of these things, but in a woefully unwieldy way, 
IMO. Conversely, FRBR in RDF (at http://vocab.org/frbr) consolidates 
some classes and properties (e.g. Responsible entity, a superclass of 
Person, Family and Corporate body), and to me approaches the kind of 
extensibility we need. Unfortunately, it does not include data 
properties, which I agree are problematic, as Karen illustrates.
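
To make the extensibility point concrete, here is a minimal sketch in Python, not the vocab.org vocabulary itself. It assumes a toy in-memory list of triples, one broad class ("ResponsibleEntity") with narrower subclasses, and one general "date" property with narrower subproperties. Every identifier below (the "ex:" subjects, "dateOfReproduction", the helper functions) is hypothetical and chosen only for illustration.

```python
# Toy illustration of "foundation" classes and properties that can be extended.
# All names are hypothetical; this is not the vocab.org/frbr vocabulary.

# Narrower class -> broader class
SUBCLASS_OF = {
    "Person": "ResponsibleEntity",
    "Family": "ResponsibleEntity",
    "CorporateBody": "ResponsibleEntity",
}

# Narrower property -> broader property
SUBPROPERTY_OF = {
    "dateOfPublication": "date",
    "dateOfReproduction": "date",
}

# (subject, predicate, object) statements
triples = [
    ("ex:author1", "type", "Person"),
    ("ex:press1", "type", "CorporateBody"),
    ("ex:manif1", "dateOfPublication", "1998"),
    ("ex:manif1", "dateOfReproduction", "2004"),
]


def ancestors(name, parent_of):
    """Return name followed by every broader term reachable via parent_of."""
    chain = [name]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return chain


def instances_of(cls):
    """Subjects typed as cls or as any of its subclasses."""
    return sorted(
        s for s, p, o in triples
        if p == "type" and cls in ancestors(o, SUBCLASS_OF)
    )


def values_of(subject, prop):
    """Objects attached to subject via prop or any of its subproperties."""
    return sorted(
        o for s, p, o in triples
        if s == subject and prop in ancestors(p, SUBPROPERTY_OF)
    )


print(instances_of("ResponsibleEntity"))
print(values_of("ex:manif1", "date"))
```

Adding a new kind of date or a new kind of agent then means adding one entry to a mapping, and queries written against the general term keep working unchanged.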

I do maintain that FRBR is the kind of *conceptual* model that, for the 
most part, can guide the development of effective data structures. 
However, it is far too abstract to be implemented verbatim. This is what 
I think RDA is trying to do with attributes like "Title for the work." I 
wonder: why is there not an ontology expert on the JSC? (If I'm wrong 
and there is, someone please correct me.)

Casey

Karen Coyle wrote:
> Extensibility is absolutely key. I know that some people consider XML 
> to be inherently extensible, but I'm concerned that the conceptual 
> model presented by FRBR doesn't support extensibility. For example, 
> the FRBR entity "Place" represents only the place as a subject. If you 
> want to represent places anywhere else in the record, you are SOL. 
> Ditto the "Event" entity. The attributes in FRBR have no inherent 
> structure, so you have, say, Manifestation with a whole page of 
> attributes that are each defined at the most detailed level. You have 
> "reduction ratio (microform)" but no "reproduction info" field that 
> you could extend for another physical format. You have "date of 
> publication" but no general "date" property that could be extended to 
> other dates that are needed (in fact, the various date fields have no 
> relation to each other).
>
> To have an extensible data structure we need to have some foundation 
> "classes" that we can build on, and nothing in FRBR, RDA, or MARC 
> gives us that.
>
> kc
>
> Casey A Mullin wrote:
>> (Attention: lurker emerging)
>>
>> To me what it comes down to is neither simplicity nor complexity, but 
>> extensibility. In a perfect world, our data models should be capable 
>> of representing very sophisticated and robust relationships at a high 
>> level of granularity, while still accommodating ease of metadata 
>> production and contribution (especially by non-experts and those 
>> outside the library community).
>>
>> I agree that none of our existing data structures/syntaxes are /a 
>> priori /fundamental or infallible. But what is promising to me about 
>> RDF is its intuitive mode of expression and extensibility (exactly 
>> the kind I advocate above).
>>
>> Casey
>>
>> Han, Yan wrote:
>>> Bill and Peter,
>>>
>>> Very nice posts. XML, RDF, MARC and DC are all different ways to 
>>> present information (of course, XML, RDF, and DC are easier for 
>>> machines to read and process).
>>> However, down at the fundamentals, I think it goes deeper: 
>>> basically, data structures and algorithms make things work. RDF 
>>> (with triples) is a directed graph. A graph is a powerful (the most 
>>> powerful?) data structure with which you can model everything. 
>>> However, some graph theory problems are NP-hard. Fundamentally, we 
>>> are talking about math. So a balance needs to be struck (between 
>>> how complex the model is and how easy (or possible) it is to 
>>> implement). As computing power grows, complex data modeling 
>>> and data mining are on the horizon.
>>>
>>> Yan
>>>
>>> -----Original Message-----
>>> From: Code for Libraries [mailto:[log in to unmask]] On Behalf 
>>> Of Peter Schlumpf
>>> Sent: Thursday, April 09, 2009 10:09 PM
>>> To: [log in to unmask]
>>> Subject: [CODE4LIB] You got it!!!!! Re: [CODE4LIB] Something 
>>> completely different
>>>
>>> Bill,
>>>
>>> You have hit the nail on the head!!!!!  This is EXACTLY what I am 
>>> trying to do!  It's the underlying stuff that I am trying to get 
>>> at.   Looking at RDF may yield some good ideas.  But I am not 
>>> thinking in terms of RDF or XML, triples, or MARC, standards, or any 
>>> of that stuff that gets thrown around here.  Even the Internet is 
>>> not terribly necessary.  I am thinking in terms of data structures, 
>>> pointers, sparse matrices, relationships between objects and yes, 
>>> set theory too -- things like that.  The former is pretty much cruft 
>>> that lies upon the latter, and it mostly just gets in the way.  
>>> Noise, as you put it, Bill!
>>>
>>> A big problem here is that Libraryland has a bad habit of getting 
>>> itself lost in the details and going off on all kinds of tangents.  
>>> As I said before, the biggest prison is between the ears!!!!  Throw 
>>> out all that junk in there and just start over!  When I begin 
>>> programming this thing my only tools will be a programming language 
>>> (C or Java), a text editor (vi), and my head.  But before I really 
>>> start that, right now I am writing a paper that explains how this 
>>> stuff works at a very low level.  It's mostly an effort to get my 
>>> thoughts down clearly, but I will share a draft of it with y'all on 
>>> here soon.
>>>
>>> Peter Schlumpf
>>>
>>>
>>> -----Original Message-----
>>>  
>>>> From: Bill Dueber <[log in to unmask]>
>>>> Sent: Apr 9, 2009 10:37 PM
>>>> To: [log in to unmask]
>>>> Subject: Re: [CODE4LIB] Something completely different
>>>>
>>>> On Thu, Apr 9, 2009 at 10:26 AM, Mike Taylor <[log in to unmask]> 
>>>> wrote:
>>>>
>>>>   
>>>>> I'm not sure what to make of this except to say that Yet Another XML
>>>>> Bibliographic Format is NOT the answer!
>>>>>
>>>>>       
>>>> I recognize that you're being flippant, and yet think there's an 
>>>> important
>>>> nugget in here.
>>>>
>>>> When you say it that way, it makes it sound as if folks are 
>>>> debating the
>>>> finer points of OAI-MARC vs MARC-XML -- that it's simply syntactic 
>>>> sugar
>>>> (although I'm certainly one to argue for the importance of 
>>>> syntactic sugar)
>>>> over the top of what we already have.
>>>>
>>>> What's actually being discussed, of course, is the underlying data 
>>>> model.
>>>> E-R pairs primarily analyzed by set theory, triples forming 
>>>> directed graphs,
>>>> whether or not links between data elements can themselves have 
>>>> attributes --
>>>> these are all possible characteristics of the fundamental 
>>>> underpinning of a
>>>> data model to describe the data we're concerned with.
>>>>
>>>> The fact that they all have common XML representations is noise, and
>>>> referencing the currently-most-common xml schema for these things 
>>>> is just
>>>> convenient shorthand in a community that understands the exemplars. 
>>>> The fact
>>>> that many in the library community don't understand that syntax is 
>>>> not the
>>>> same as a data model is how we ended up with RDA.  (Mike: I don't 
>>>> know your
>>>> stuff, but I seriously doubt you're among that group. I'm talkin' in
>>>> general, here.)
>>>>
>>>> Bibliographic data is astoundingly complex, and I believe 
>>>> wholeheartedly
>>>> that modeling it sufficiently is a very, very hard task. But no 
>>>> matter the
>>>> underlying model, we should still insist on starting with the 
>>>> basics that
>>>> computer science folks have been using for decades now: uids  (and, 
>>>> these
>>>> days, guids) for the important attributes, separation of data and 
>>>> display,
>>>> definition of sufficient data types and reuse of those types whenever
>>>> possible, separation of identity and value, full normalization of 
>>>> data, zero
>>>> ambiguity in the relationship diagram as a fundamental tenet, and a 
>>>> rigorous
>>>> mathematical model to describe how it all fits together.
>>>>
>>>> This is hard stuff. But it's worth doing right.
>>>>
>>>>
>>>>
>>>>
>>>> -- 
>>>> Bill Dueber
>>>> Library Systems Programmer
>>>> University of Michigan Library
>>>>     
>>
>
>

-- 
Casey A. Mullin
Discovery Metadata Librarian
Metadata Development Unit
Stanford University Libraries
650-736-0849 
[log in to unmask]
http://www.caseymullin.com