Hello all,
A little context: the MODS and RDF Descriptive Metadata Subgroup
(https://wiki.duraspace.org/display/hydra/MODS+and+RDF+Descriptive+Metadata+Subgroup)
is a group of cultural institutions working together to model MODS XML
as RDF.
Our project diverges from previous efforts in this domain in that we're
trying to come up with a model that makes greater use of widely used
vocabularies and namespaces, and that avoids blank nodes at all costs.
As we work through the list of MODS elements, we've run into a few
thorny issues. Since our goal is to make our data as shareable as
possible, we agreed that it would be helpful to get input from folks
who have more experience harvesting and parsing RDF from the many data
providers that exist in the real world (see https://datahub.io/dataset
for a great list).
Specifically, when consuming RDF from a new data source, how much of a
problem is each of the following:
#1. Triples where the object may be a string literal or a URI
For example, the predicate 'dc:subject' from the Dublin Core Elements
vocabulary has no defined range, which means it can be used with both
literal and non-literal values
(http://wiki.dublincore.org/index.php/User_Guide/Publishing_Metadata#dc:subject).
So one could have both in a data store:
ex:myObject1 dc:subject "aircraft" .
ex:myObject2 dc:subject
<http://id.loc.gov/authorities/subjects/sh85002782> .
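To make the consumer-side cost of this concrete, here is a rough Python
sketch of what a harvester might have to do. It is only an illustration
under stated assumptions: the URIRef and Literal classes are hypothetical
stand-ins for whatever node types an RDF library provides, and the "ex:"
prefix is assumed to expand to http://example.org/.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    """Hypothetical stand-in for an RDF library's URI node type."""
    value: str


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF library's literal node type."""
    value: str


DC_SUBJECT = URIRef("http://purl.org/dc/elements/1.1/subject")

# The two example triples; "ex:" is assumed to be http://example.org/
triples = [
    (URIRef("http://example.org/myObject1"), DC_SUBJECT, Literal("aircraft")),
    (URIRef("http://example.org/myObject2"), DC_SUBJECT,
     URIRef("http://id.loc.gov/authorities/subjects/sh85002782")),
]


def split_subjects(triples):
    """Separate dc:subject objects into plain labels and URIs."""
    labels, uris = [], []
    for _subject, predicate, obj in triples:
        if predicate != DC_SUBJECT:
            continue
        # One predicate, two possible object types: the consumer has to
        # inspect every value before deciding how to treat it.
        if isinstance(obj, URIRef):
            uris.append(obj.value)
        else:
            labels.append(obj.value)
    return labels, uris


labels, uris = split_subjects(triples)
```

The point is that the type check happens per value, so every consumer of
dc:subject needs this branch.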
... versus ...
#2. Using multiple predicates with similar/overlapping definitions,
depending on the value of the object
For example, when expressing the subject of a work, using different
predicates depending on whether there is an existing URI for a topic or not:
ex:myObject1 dc:subject "aircraft" .
ex:myObject2 dcterms:subject
<http://id.loc.gov/authorities/subjects/sh85002782> .
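Again as a rough Python sketch only: with this approach the predicate
itself tells the consumer what kind of value to expect, but the consumer
has to know that both predicates mean "subject". The lookup table and
function names below are hypothetical, objects are kept as plain strings
for brevity, and "ex:" is assumed to expand to http://example.org/.

```python
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"

# Hypothetical lookup table: each predicate implies the kind of object.
# It has to grow with every extra namespace a provider uses.
SUBJECT_PREDICATES = {
    DC_SUBJECT: "label",
    DCTERMS_SUBJECT: "uri",
}

# The two example triples; "ex:" is assumed to be http://example.org/
triples = [
    ("http://example.org/myObject1", DC_SUBJECT, "aircraft"),
    ("http://example.org/myObject2", DCTERMS_SUBJECT,
     "http://id.loc.gov/authorities/subjects/sh85002782"),
]


def collect_subjects(triples):
    """Group subject values per resource, tagged by the kind the predicate implies."""
    result = {}
    for subject, predicate, obj in triples:
        kind = SUBJECT_PREDICATES.get(predicate)
        if kind is None:
            continue  # not a subject predicate we know about
        result.setdefault(subject, []).append((kind, obj))
    return result


subjects = collect_subjects(triples)
```

No per-value type check is needed here, but a harvester that only knows
about one of the two predicates would silently miss half of the subjects.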
We're wondering which approach is less problematic from a Linked
Data-harvesting standpoint. Approach #1 requires that the parser be
prepared to handle different types of values for the same predicate,
while approach #2 requires the consumer to recognize an additional
namespace and predicate and to know that both express the same thing.
Any thoughts, suggestions, or comments would be greatly appreciated.
Thanks,
Eben
--
Eben English | Boston Public Library
Web Services Developer
617-859-2238