
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Karen Coyle
> Sent: Sunday, December 11, 2011 3:47 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] Namespace management, was Models of MARC in RDF
> 
> Quoting Richard Wallis <[log in to unmask]>:
> 
> 
> > You get the impression that the BL "chose a subset of their current
> > bibliographic data to expose as LD" - it was kind of the other way around.
> > Having modeled the 'things' in the British National Bibliography
> > domain (plus those in related domain vocabularies such as VIAF, LCSH,
> > Geonames, Bio, etc.), they then looked at the information held in
> > their [Marc] bib records to identify what could be extracted to populate it.
> 
> Richard, I've been thinking of something along these lines myself, especially as I see the number of
> "translating X to RDF" projects go on. I begin to wonder what there is in library data that is
> *unique*, and my conclusion is: not much. Books, people, places, topics: they all exist independently
> of libraries, and libraries cannot take the credit for creating any of them. So we should be able to
> say quite a bit about the resources in libraries using shared data points -- and by that I mean, data
> points that are also used by others. So once you decide on a model (as BL did), then it is a matter of
> looking *outward* for the data to re-use.

Trying to synthesize what Karen, Richard, and Simon have bombarded us with here leads me to conclude that linking to existing (or yet-to-be-created) external data (ontologies and representations) is a matter of two things: being sure of the system's current user's context, and being able to modify the external data brought into the user's virtual EMU (see below *** before reading further). I think Simon is right that "records" will increasingly become virtual, in that they are composed as needed, by this user, for this purpose, at this time. We already see this in practice in many places, from adding cover art to book MARC records to adding just summary information to a "management level" report.

Being able to link from a "book" record to foaf:person and bib:person records, and to extract data elements from each as they are needed right now, should not be too difficult. As well as knowledge of the current need, it requires a semantically based mapping between the different elements of those "people" representations. The neat part is that the total representation of that person may be expressed through both foaf: and bib: facets from a single EMU which contains all things known about that person. Our two requests for linked data may, and in fact should, be mining the same resource, which translates the data into the format we ask for each time; we then combine those representations back into a single collapsed data set.
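As a very rough sketch of that idea (all the element names, facet mappings, and example values here are my own illustrative assumptions, not anything from FOAF or an actual bib vocabulary), a single EMU could be projected through two facet mappings and the results collapsed back into one data set:

```python
# Hypothetical sketch: one EMU (Entity Metadata Unit) for a person,
# viewed through two facet mappings ("foaf" and "bib") and then
# collapsed back into a single combined representation.

# The EMU: the totality of what we know about this entity,
# held in one logical place.
emu = {
    "preferred_name": "Ada Lovelace",
    "birth_date": "1815-12-10",
    "works": ["Notes on the Analytical Engine"],
}

# Semantically based mappings: which EMU elements each facet exposes,
# and under which property names (all invented for illustration).
FACET_MAPPINGS = {
    "foaf": {"foaf:name": "preferred_name"},
    "bib":  {"bib:name": "preferred_name", "bib:authorOf": "works"},
}

def facet_view(emu, facet):
    """Project the EMU through one facet's mapping."""
    mapping = FACET_MAPPINGS[facet]
    return {prop: emu[element] for prop, element in mapping.items()}

def collapse(*views):
    """Recombine several facet views into one merged data set."""
    merged = {}
    for view in views:
        merged.update(view)
    return merged

# Two requests for linked data, mining the same resource:
foaf_view = facet_view(emu, "foaf")
bib_view = facet_view(emu, "bib")
combined = collapse(foaf_view, bib_view)
```

The point of the sketch is only that both requests hit the same single EMU; the translation to each requested format lives in the mappings, not in duplicated records.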

I think Simon (maybe Richard, maybe all of you) was working towards a single, unique EMU for each entity, which holds all the unique information about it across a number of different uses/scenarios/facets/formats. Of course, deciding what is unique and what is obtained from some more granular breakdown is another issue. (Some experience with this "onion skin" modeling lies deep in my past, and may need dredging up.)

It is also important, IMHO, to think about the repository form of entity data (the EMU) and the transmission form (the data sent to a requesting system when it asks for "foaf:person" data). They are different and have different requirements. If you are going to allow all these entity data elements to be viewed through a "format filter", then we have a mixed model, but basically a whole-part relationship between the EMU and the transmission form (e.g. the full data set contains the person's current address, but the transmitted response sends only the city). Argue amongst yourselves about whether an address is a separate entity that is linked to or not; it makes a simple example to consider it part of the EMU.
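A minimal sketch of that whole-part "format filter" might look like the following (the profile name, field paths, and data are all illustrative assumptions): the EMU holds the full address, but the transmission form for a given request profile sends only the city.

```python
# Hypothetical sketch: repository form (the EMU) vs. transmission form
# (the part of the whole actually sent to a requesting system).

emu = {
    "name": "A. Person",
    "address": {"street": "1 Main St", "city": "Dublin", "postcode": "D01"},
}

# Transmission profiles: which elements of the EMU each kind of
# request receives, expressed as paths into the repository form.
PROFILES = {
    "foaf:person": {"name": ["name"], "city": ["address", "city"]},
}

def transmission_form(emu, profile):
    """Apply the format filter: build the partial response for one profile."""
    response = {}
    for field, path in PROFILES[profile].items():
        value = emu
        for key in path:
            value = value[key]
        response[field] = value
    return response

response = transmission_form(emu, "foaf:person")
# The street and postcode stay in the repository form; only the
# city travels in the transmission form.
```

Whether the address is modeled as part of the EMU or as a separate linked entity, the same filtering step applies at transmission time.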

All of this requires that we think of the web of data as being composed not of static entities whose descriptions are fixed at any snapshot in time, but as dynamic: what two users see of the same entity may be different at exactly the same instant. So we need not only a descriptive model structure, but also a set of semantic mappings, a context resolution transformation, and a system to implement it each time a link to related data is followed.
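To make the context resolution idea concrete (context names, rules, and entity data below are all invented for the sake of the sketch), the same entity resolved at the same instant can yield two different virtual "records" for two differently situated users:

```python
# Hypothetical sketch of context resolution: one entity, two user
# contexts, two different descriptions composed at the same instant.

emu = {
    "title": "Some Book",
    "summary": "A long synopsis of the work...",
    "cover_art_url": "http://example.org/cover.jpg",
    "cost": 24.95,
}

# Context resolution rules: which EMU elements each user context sees.
CONTEXTS = {
    "patron_display":    ["title", "summary", "cover_art_url"],
    "management_report": ["title", "cost"],
}

def resolve(emu, context):
    """Compose the virtual record for this user, for this purpose, now."""
    return {element: emu[element] for element in CONTEXTS[context]}

patron_view = resolve(emu, "patron_display")
manager_view = resolve(emu, "management_report")
# Same entity, same instant, two different descriptions.
```

The descriptive model (the EMU) stays fixed; what varies per request is the context resolution transformation applied when the link is followed.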

> 
> I maintain, however, as per my LITA Forum talk [1] that the subject headings (without talking about
> quality thereof) and classification designations that libraries provide are an added value, and we
> should do more to make them useful for discovery.
> 
> 
> >
> > I know it is only semantics (no pun intended), but we need to stop
> > using the word 'record' when talking about the future description of 'things' or
> > entities that are then linked together.   That word has so many built in
> > assumptions, especially in the library world.
> 
> I'll let you battle that one out with Simon :-), but I am often at a loss for a better term to
> describe the unit of metadata that libraries may create in the future to describe their resources.
> Suggestions highly welcome.

*** I suggest (and use above) the term Entity Metadata Unit, or EMU. This contains the totality of unique information stored about an entity, in a single logical location.

> 
> kc
> [1] http://kcoyle.net/presentations/lita2011.html
> 
> 
> 
> 
> 
> --
> Karen Coyle
> [log in to unmask] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet