Good afternoon,

We're in the process of planning our new Digital Library website, which will provide a fair bit of Linked Open Data through the features available in Drupal 7 (RDF, RDFx, SPARQL, etc.).

Aside from standard bibliographic data, one of the large chunks of what we are going to provide is a dataset of botanists and their publications. 

I'd like to start a discussion about whether and how you are linking your data out to other sources on the web, such as WorldCat or other RDF sources.

D7 makes it almost trivial to set up the vocabularies and map predicates to the data, so that part is taken care of. We can also interlink our own data, but what's stumping me is the best way to find the matching identifiers in other datasets, and how much of a priority it is to link to them.

It seems to me that if I have 4,000 books that I know are available in WorldCat, I should provide an "owl:sameAs" triple to link my identifier with something more commonly used. Is this appropriate? Desired? Or does it matter? I don't think I can expect someone to hit my site via SPARQL and magically find my identifier for Darwin's "Origin of Species", for example.
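
As a rough illustration of what such links could look like, here is a minimal sketch that writes one "owl:sameAs" statement per book in N-Triples form. The local base URI (http://example.org/book/), the local IDs, and the OCLC numbers are made up for the example, and the WorldCat URI pattern (http://www.worldcat.org/oclc/<number>) is an assumption that should be checked against what WorldCat currently publishes.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(local_uri, external_uri):
    """Return one N-Triples line stating local_uri owl:sameAs external_uri."""
    return f"<{local_uri}> <{OWL_SAME_AS}> <{external_uri}> ."


def worldcat_links(oclc_by_local_id, local_base,
                   worldcat_base="http://www.worldcat.org/oclc/"):
    """Build owl:sameAs lines from a {local_id: oclc_number} mapping.

    worldcat_base is an assumed URI pattern, not a confirmed one.
    """
    lines = []
    for local_id, oclc in sorted(oclc_by_local_id.items()):
        lines.append(same_as_triple(local_base + local_id, worldcat_base + oclc))
    return lines


if __name__ == "__main__":
    # Hypothetical local IDs and OCLC numbers, for illustration only.
    sample = {"b1": "111", "b2": "222"}
    for line in worldcat_links(sample, "http://example.org/book/"):
        print(line)
```

This only covers the case where the external identifier is already known for each record; finding that identifier in the first place is the harder part.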

Assuming the answer to these is "yes", does anyone have any clever ways of doing this? The only thing that comes to mind for me is a brute-force "let's try to match the titles in some reliable manner" approach.
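
For the sake of discussion, the brute-force title match could be sketched roughly as below, using only the Python standard library. The normalization rules, the 0.9 threshold, and the shape of the candidate data (a dict of external identifier to title, however it was obtained) are all assumptions for illustration, not a tested matching strategy.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize_title(title):
    """Lowercase, strip accents and punctuation, drop a leading article."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return re.sub(r"^(the|a|an) ", "", text)


def best_match(title, candidates, threshold=0.9):
    """Return (external_id, score) for the closest candidate title, or None.

    candidates is a hypothetical {external_id: title} mapping; threshold is
    an arbitrary cutoff that would need tuning on real data.
    """
    target = normalize_title(title)
    best_id, best_score = None, 0.0
    for ext_id, cand_title in candidates.items():
        score = SequenceMatcher(None, target, normalize_title(cand_title)).ratio()
        if score > best_score:
            best_id, best_score = ext_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None


if __name__ == "__main__":
    # Made-up candidate records, for illustration only.
    candidates = {
        "x1": "On the origin of species.",
        "x2": "Voyage of the Beagle",
    }
    print(best_match("On the Origin of Species", candidates))
```

Titles alone will not distinguish editions or reprints, so anything like this would probably need author or date as a second check, with low-scoring records set aside for manual review.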

--Joel

Joel Richard
IT Specialist, Web Services Department
Smithsonian Institution Libraries | http://www.sil.si.edu/
(202) 633-1706 | [log in to unmask]