LISTSERV 16.5 - CODE4LIB Archives

Karen,

On 11 December 2011 15:18, Karen Coyle <[log in to unmask]> wrote:

> Quoting Richard Wallis <[log in to unmask]>:
>
>
>> I agree with your sentiment here but, from what you imply at
>> http://futurelib.pbworks.com/w/page/29114548/MARC%20elements ,
>> transformation in to something that would be recognisable by the
>> originators of the source Marc will be difficult - and yes ugly.
>>
>> The refreshing thing about the work done by the BL is that they stepped
>> away from the 'record', modeled the things that make up the BnB domain.
>> Then they implemented processes to extract rich data from the source Marc,
>> enrich it with external links, and load it to an RDF representation of the
>> model.
>>
>
> Richard, this is an interesting statement about the BL data. Are you
> saying that they chose a subset of their current bibliographic data to
> expose as LD? (I haven't found anything yet that describes the process
> used, so if there is a document I missed, please send link!)


There is no document I am aware of, but I can point you at the blog post by
Tim Hodson [
http://consulting.talis.com/2011/07/british-library-data-model-overview/]
who helped the BL get to grips with and start thinking Linked Data.
Another by the BL's Neil Wilson [
http://consulting.talis.com/2011/10/establishing-the-connection/] filling
in the background around his recent presentations about their work.

You get the impression that the BL "chose a subset of their current
bibliographic data to expose as LD" - it was kind of the other way around.
Having modeled the 'things' in the British National Bibliography domain
(plus those in related domain vocabularies such as VIAF, LCSH, Geonames,
Bio, etc.), they then looked at the information held in their [Marc] bib
records to identify what could be extracted to populate it.



> This almost sounds like the FRBR process, BTW - modeling the domain, which
> is also step one of the Singapore Framework/Dublin Core Application Profile
> process, then selecting data elements for the domain. [1] FRBR,
> unfortunately, has perceived problems as model (which I am attempting to
> gather up here [2] but may move to the LLD community wiki space to give it
> more visibility).
>

The BL will tell you that their model is designed to add to the
conversation around how to progress the modelling of bibliographic information
as Linked Data.  There is still a way to go.  They are currently looking at
how to model multi-part works in the current model and hope to enhance it
to bring in other concepts such as FRBR.


> The work that I'm doing is not based on the assumption that all of MARC
> will be carried forward. The reason I began my work is that I don't think
> we know what is in the MARC record -- there is similar data scattered all
> over, some data that changes meaning as indicators are applied, etc. There
> is no implication that a future record would have all of those data
> elements, ...


I know it is only semantics (no pun intended), but we need to stop using
the word 'record' when talking about the future description of 'things' or
entities that are then linked together.   That word has so many built in
assumptions, especially in the library world.


>> Concern shared.   I would however lower my sights slightly by setting the
>> current objective to be 'Publishing bibliographic information as Linked
>> Data to become a valuable and useful part of a Web of Data'.   Using the
>> Semantic Web as a goal introduces even more vagueness and baggage.  I
>> firmly believe that establishing a linked web of data will eventually
>> underpin a Semantic Web, but there are still a few steps to go before we
>> get anywhere near that.
>>
>
> My concern is the creation of LD silos. BL data uses some known namespaces
> (BIBO, FOAF, BIO), which in fact is a way to "join" the web of data that
> many others are participating in, because your "foaf:Person" can interact
> with anyone else's "foaf:Person." But there are a great number of efforts
> that are modeling current records (FRBRer, ISBD, MODS, RDA) and are
> entirely silo'd - there is nothing that would connect the data to anyone
> else's data (and the ones mentioned would not even connect to each other).
> So I don't know what you mean by "part of a Web of data" but to me using
> non-silo'd properties is enough to meet that criterion. Another possibility
> is to create links from your properties to properties outside of your silo,
> e.g. from RDA:Person to foaf:Person, for sharing and discoverability.
>

There are a couple of ways that your domain can link in to the wider web of
data.  Firstly, as you identify, by sharing vocabularies.  There is a small
example in the middle of the BL model, where a Resource is both a
dct:BibliographicResource and also (when appropriate) a bibo:Book.

In Linked Data there is nothing wrong in mixing ontologies within one
domain.  If the thing you are modelling is identified as being a
foaf:Person, there is no reason why it cannot also be defined as a
schema.org Person.
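
To make the point about mixing vocabularies concrete, here is a minimal
sketch in plain Python (no RDF library) that treats statements as
(subject, predicate, object) tuples. The example.org identifiers are
hypothetical and invented for illustration; they are not BL URIs, and this
is not the BL's actual process.

```python
# Vocabulary namespaces written out as plain strings.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"
FOAF = "http://xmlns.com/foaf/0.1/"
SCHEMA = "http://schema.org/"

# Hypothetical identifiers, made up for this illustration only.
book = "http://example.org/resource/book1"
author = "http://example.org/person/ransome"

# One thing can carry types from several vocabularies at once.
triples = {
    (book, RDF_TYPE, DCT + "BibliographicResource"),
    (book, RDF_TYPE, BIBO + "Book"),
    (author, RDF_TYPE, FOAF + "Person"),
    (author, RDF_TYPE, SCHEMA + "Person"),
}


def types_of(subject, graph):
    """Return the sorted rdf:type values asserted for a subject."""
    return sorted(o for s, p, o in graph if s == subject and p == RDF_TYPE)


def instances_of(cls, graph):
    """Return the sorted subjects typed with the given class."""
    return sorted(s for s, p, o in graph if p == RDF_TYPE and o == cls)


if __name__ == "__main__":
    print(types_of(book, triples))
    print(instances_of(FOAF + "Person", triples))
```

A consumer that only understands FOAF still finds the author through
instances_of(FOAF + "Person", triples), while one that only understands
schema.org finds the same thing through its own class.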

Secondly, by what you link to. There is no reason why your thing, which has a
foaf:name attribute of "Ransome, Arthur", cannot have a sameAs
relationship with <http://viaf.org/viaf/67261752/>, and a sameAs
relationship with <http://dbpedia.org/resource/Arthur_Ransome>.
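
As a rough sketch of that second kind of linking, the statements could be
written out as simple N-Triples-style lines. The VIAF and DBpedia URIs are
the ones given above (with the standard DBpedia /resource/ path); the local
example.org URI is hypothetical, and reading "sameAs" as owl:sameAs is my
assumption. The serialiser is deliberately naive (it does not escape
literals), so treat it as an illustration only.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # assumed meaning of "sameAs"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

local = "http://example.org/person/ransome"  # hypothetical local URI
viaf = "http://viaf.org/viaf/67261752/"
dbpedia = "http://dbpedia.org/resource/Arthur_Ransome"

triples = [
    (local, FOAF_NAME, "Ransome, Arthur"),
    (local, OWL_SAME_AS, viaf),
    (local, OWL_SAME_AS, dbpedia),
]


def to_ntriples(graph):
    """Serialise triples as N-Triples-style lines.

    Objects starting with "http://" are written as URIs, everything else
    as a plain literal. No escaping is done, so this is only a sketch.
    """
    lines = []
    for s, p, o in graph:
        obj = f"<{o}>" if o.startswith("http://") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def same_as_targets(subject, graph):
    """Return the external URIs a subject is linked to via sameAs."""
    return [o for s, p, o in graph if s == subject and p == OWL_SAME_AS]


if __name__ == "__main__":
    print(to_ntriples(triples))
```

Anyone who follows the sameAs links from the local URI lands on identifiers
that other datasets already use, which is what joins the data to the wider
web rather than leaving it in a silo.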


> I'm more concerned than you are about the issue of cataloging rules. A
> huge effort has gone into RDA and will now go into the "new bibliographic
> framework." RDA will soon have occupied a decade of scarce library
> community effort, and the new framework will be based on it, just as RDA is
> based on FRBR. We've been going in this direction for over 20 years.
> Meanwhile, look at how much has changed in the world around us. We're
> moving much more slowly than the world we need to be working within.
>
>
This concerns me too.

There is much valuable work that has gone in to RDA which, at least, should
go in to providing one of the detailed vocabularies used by the library
community to help us describe our resources.  Vocabularies that should be
mingled with the more generic vocabularies (foaf, DC, etc.) to allow our
resources to be linked and understood by data consumers on the wider web.

As to the amount of time, and keeping up issues - I have always found that
looking back and calculating how much effort was expended getting here
should not influence where we go next.  You are right to identify that we
are going too slowly and risk the emergence of a de facto way of describing
bibliographic data that we are not happy with and have little influence
upon - bypassed by an impatient web.

~Richard.


Richard Wallis
Technology Evangelist, Talis
http://consulting.talis.com
Tel: +44 (0)7767 886 005

Linkedin: http://www.linkedin.com/in/richardwallis
Skype: richard.wallis1
Twitter: @rjw
IM: [log in to unmask]