LISTSERV 16.5 - CODE4LIB Archives

I'll take a stab at this, also.  I was very intrigued by Chris' work
modelling MODS as an ontology.  AACR2 and MODS emerged from a business
model of standardizing descriptive practice in a way that could be readily
applied and thus shared across disparate organizations.  There are some
native relationships, particularly in topical areas, but not that many.

The real value of modelling MODS into an ontology, IMO, is the ability to
build better relationships with other metadata schemas and data models. 
Metadata mapping is still in its infancy and still relies to a large
extent on a "hard mapping" of one data element to another, without mapping
the inherent data model behind each schema, with the elements and their
relationships to one another.  MODS, for example, has fairly complex subject
elements with hierarchical relationships among temporal and topical terms
that don't map well to less hierarchical schemas.  Converting
MODS to an ontology allows the exposure of relationships and thus comes
closer at least to a model-to-model mapping.  ontologyX uses an ontology
behind the <indecs> schema to map the metadata of DOI participants.  I
don't know how well this works, since much of the work is proprietary and
closed.  Diane Hillmann has done some interesting work with the NSDL
Metadata Registries project in this area, also.
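
To make the difference between a "hard mapping" and something closer to a
model-to-model mapping concrete, here is a minimal sketch.  It is only an
illustration, not anyone's actual conversion: the sample record, the
"ex:" property names, the record identifier and both function names are
made up for the example.  Only the MODS namespace and the subject/topic/
geographic/temporal element names come from MODS itself.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Hypothetical sample record: one compound MODS subject.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <subject>
    <topic>Railroads</topic>
    <geographic>California</geographic>
    <temporal>19th century</temporal>
  </subject>
</mods>
"""


def hard_mapping(root):
    """Element-to-element mapping: every subject part becomes a flat
    dc:subject value, so the grouping of the parts is lost."""
    values = []
    for subject in root.findall(f"{{{MODS_NS}}}subject"):
        for part in subject:
            values.append(("dc:subject", part.text.strip()))
    return values


def model_mapping(root, record_id="ex:record1"):
    """Keep each subject as its own node so that its topical, geographic
    and temporal parts stay related to one another.  The ex: names are
    placeholders, not terms from any published vocabulary."""
    triples = []
    for i, subject in enumerate(root.findall(f"{{{MODS_NS}}}subject"), start=1):
        node = f"_:subject{i}"
        triples.append((record_id, "ex:hasSubject", node))
        for part in subject:
            local_name = part.tag.split("}", 1)[1]  # strip the namespace
            triples.append((node, f"ex:{local_name}", part.text.strip()))
    return triples


root = ET.fromstring(SAMPLE)
flat = hard_mapping(root)
graph = model_mapping(root)

for row in flat:
    print(row)
for triple in graph:
    print(triple)
```

The flat list can no longer say that "California" and "19th century"
qualify "Railroads"; the second form keeps that grouping, which is the
kind of relationship an ontology could expose to another schema.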

Another dicey issue in metadata mapping is mapping the values within the
elements, which are often prescribed vocabularies that require use of the
controlled vocabulary to conform to the schema.  If we can get element
mapping truly working, it should be fairly easy to develop text mining
algorithms that compare vocabulary values against dictionaries to identify
and relate equivalent, broader and narrower terms.  Not trivial, but
fairly easy.
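
As a rough sketch of the value-comparison step, assuming element mapping
is already in place: the tiny thesaurus below is invented for
illustration (a real one would come from a controlled vocabulary), and
the normalize() and relate() helpers are hypothetical names, not part of
any existing tool.

```python
# Hypothetical thesaurus: for each normalized term, the terms that are
# equivalent to it, broader than it, or narrower than it.
THESAURUS = {
    "railroads": {
        "equivalent": {"railways"},
        "broader": {"transportation"},
        "narrower": {"street-railroads"},
    },
}


def normalize(term):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(term.lower().split())


def relate(source_term, target_term):
    """Return how target_term relates to source_term: 'equivalent',
    'broader' (target is broader than source), 'narrower', or None
    when the thesaurus records no relationship."""
    source = normalize(source_term)
    target = normalize(target_term)
    if source == target:
        return "equivalent"
    entry = THESAURUS.get(source, {})
    for relation in ("equivalent", "broader", "narrower"):
        if target in entry.get(relation, ()):
            return relation
    return None


print(relate("Railroads", "railways"))
print(relate("Railroads", "Transportation"))
print(relate("Railroads", "Cooking"))
```

A dictionary lookup like this only covers terms someone has already
related; the text mining part would be in proposing candidate pairs for
terms the dictionary does not know.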

Grace Agnew
Rutgers University Libraries


> Chris, I've been following your blog posts for the last couple weeks
> or so and have been trying to come up with a coherent and useful
> reply, but I guess I'll take a stab here.
>
> I'll kick things off with saying that I'm skeptical of what advantage
> or utility a MODS ontology would bring (for many of the same reasons I
> have reservations regarding the RDA vocabulary/ies) - to me it feels
> like it's just dressing up the same old library documents as LOD
> resources without really investing the time and energy to model them
> in the most appropriate way.  That being said, I understand that you
> have this data modeled this way already (so you have an incentive) and
> once I've seen that, maybe I will see the light and can be convinced
> that this is not a bad way to go.
>
> Also, you're probably aware of this, but the Simile project had an
> RDFizer for MODS:  http://simile.mit.edu/wiki/MARC/MODS_RDFizer
>
> A few of the things that have come to me are:
>
> 1) What kind of things would these resources be?
> <http://example.org/ex/1> <rdf:type> <http://purl.org/MODS/Record> ?
> Or would MODS have subclasses?  If so, what?  These:
> http://www.loc.gov/standards/mods/mods-outline.html#typeOfResource ?
> Would you actually be able to figure out anything about the resource
> merely from identifying its type?  Does MODS define a formal data
> model (a la
> http://lackoftalent.org/michael/blog/2009/08/10/is-marc-a-data-model/)?
>
> 2) MODS documents are generally containers that carry several discrete
> resources:  bibliographic data regarding the primary resource (let's
> say, a "book"); authors; publishers; subjects; record metadata
> (source, language of the metadata, creation date, etc.) and so on.
> Would a MODS ontology try to model the entirety of the graph expressed
> by a document?  If so, can the components stand independently?  Can I
> have just a MODS:Name?  Would I *want* just a MODS:Name (or
> MODS:Subject or MODS:RecordInfo, etc.)?
>
> 3) What, specifically, is missing from DCTerms that would make a MODS
> ontology needed?  What, specifically, is missing from Bibliontology or
> MusicOntology or FOAF or SKOS, etc. that justifies a new and, in many
> places, overlapping vocabulary?  Would time be better spent trying to
> improve the existing vocabularies?
>
> 4) What is compelling about MODS that makes it desirable to serialize
> as RDF?  Is it the structure?  The relationships?  Would it be
> possible that a desirable outcome of an rdf-ized MODS be merely a
> small set of properties (for example) that glues together a set of
> external vocabularies into something that would work as an acceptable
> surrogate to a MODS XML document?
>
> My interpretation of the crux of your argument is "our stuff is either
> MODS or can easily be transformed to MODS".  It just seems to me that
> once you've really atomized the record data into its component parts
> you will have something that will be enough of a departure from MODS
> that it will be difficult to see the resemblance.
>
> So I'll kick things off with that and see where that leads.
>
> Thanks,
> -Ross.
>
> On Tue, Aug 11, 2009 at 6:23 PM, Chris Frymann<[log in to unmask]> wrote:
>> Hi All,
>>
>> I just recently subscribed to this list and have been watching for a
>> few days, expecting that I would do so for a while longer before
>> jumping in.  However I couldn't help but take special note of recent
>> posts with mention of MARCXML and MODS and discussion, at least
>> indirectly, of how those formats "play" with "linked-data" standards.
>> Since that is an area close to where I have been working lately, I
>> thought I'd offer a comment and also ask for some friendly feedback.
>>
>> First my comment:
>>
>> Here at UC San Diego Libraries, where I work, we have been generating
>> RDF data for a couple of years now, and more recently working with
>> triplestores and SPARQL.  We also, no surprise, have lots of MARC
>> data, and have developed some local strategies for migrating MARC to
>> MODS to RDF with a very local conversion scheme.  In order to learn
>> more about OWL and ontologies, and possibly to create a more generally
>> useful/acceptable expression of our MARC/MODS data as RDF I launched
>> into a project to convert the
>>
>>        Library of Congress MODS XML schema
>>                http://www.loc.gov/standards/mods/v3/mods-3-3.xsd
>>
>> into a formal OWL ontology.  At one level this can be approached as a
>> rather mechanical process; on the other hand, I made some adjustments
>> to MODS predicate naming, with the intent of providing more meaning to
>> individual MODS-based RDF triples.  I won't try to explain that
>> further here, but if anyone has additional interest, more information
>> is available on my effort to produce and provide validity for a MODS
>> ontology on my blog, starting at a post entitled:
>>
>>    Another Step Toward Lifting Library Metadata into the Cloud
>>        http://www.chrisfrymann.com/2009/07/22/mods-ontology-2/
>>
>> and in following posts with comments and replies from and to Bruce
>> D'Arcus, especially regarding Bibliographic Ontology.
>>
>> That's the end of my comment.  So now my question(s), or request for
>> feedback.
>>
>> Can we identify some generally agreed-on, automatable strategy for
>> converting MARC/MODS to RDF (without having to limit to Dublin Core)?
>> Or, in case I'm missing something, what work has already been done in
>> that direction?
>>
>> As a corollary, I would appreciate any thoughts you have on the
>> value of continuing the effort to develop a MODS ontology.  I attended
>> the Semantic Technology Conference recently where I was a speaker in
>> a:
>>
>>        Session on Digital Libraries
>>                http://www.semantic-conference.com/session/1990/
>>
>> and received quite a bit of interest at the conference, though I met
>> very few from the library community there.
>>
>> I had hoped to provide something that could:
>>
>>        * Potentially be more universal than our current local approach
>>          to expressing MODS in RDF
>>
>>        * Assign class and predicate names in an attempt to make dealing
>>          with blank nodes and SPARQL queries simpler and more natural,
>>          given the (to me) somewhat complicated structure of MODS.
>>
>>        * Provide a formal OWL base for assigning owl:sameAs
>>          relationships, alternate rdfs:label values, etc.
>>
>> However, I am very mindful of (and sympathetic to) thoughts such as
>> the following from Ed Summers, regarding:
>>
>> "...taking a more organic approach to vocabulary selection, mixing and
>> matching vocabulary elements rather than imposing a particular
>> metadata world-view"
>>
>> That would make sense to me if there was a generally accepted way to
>> automate the conversion.
>>
>> Sorry for the somewhat long introductory comment and thanks in advance
>> for any helpful thoughts or suggestions.
>>
>> Chris Frymann
>> Digital Library Architect
>> University of California San Diego Libraries
>>
>> Blog: http://chrisfrymann.com
>>
>