I'd just like to say a word of thanks to everyone who has contributed so far on this thread. The viewpoints raised certainly help clarify at least my understanding of some of the issues and concepts involved.
> MARCXML is a step in the right direction. MODS goes even further. Neither really go far enough.
And that succinctly, Eric manages to summarize my (and I strongly suspect, many others') sentiment on the issue at hand. Of course, the natural follow-on question is "go far enough for *what* exactly", and this is where my original question came from.
It sounds like once again we have the issue that our current tools (MODS, DCTERMS) "aren't good enough", which means we either have to:
a) stop doing things while we build new, better tools like Karen's MARC-in-triples (which seems like a really interesting idea)
or
b) start building imperfect — perhaps highly flawed — things with our current, imperfect tools
I'm not nearly smart enough to do a) so my intent is to take a stab at b), or else sit back and consider a new line of work entirely (which happens distressingly often, usually after reading enough discouraging statements from librarians in a given day).
> I think there's a fundamental difference between MODS and DCTERMS that makes this nearly impossible. I've sometimes described this as the difference between "metadata as record format" (MARC, oai_dc, MODS, etc) and "metadata as vocabulary" (DCTERMS, DCAM, & RDF Vocabs in general).
This is a great clarification, and one of the main frustrations I have with MODS: it is bound nearly inseparably to XML as a format (and this is coming from someone who knows and loves XML dearly). The idea of DCTERMS/DC/etc as a format-independent model seems like a step in the right direction, IMO.
> RDF's grammar comes from the RDF Data Model, and DC's comes from DCAM as well as directly from RDF. The process that Karen Coyle describes is really the only way forward in making a good faith effort to "put" MARC (the bibliographic data) onto the Semantic Web.
Fair enough. But I would contend that "putting MARC / bib data on the Semantic Web" is just one use case, even though I realize that, to Semantic Web advocates, it's the *only* use case worth considering.
I find it difficult to imagine that "building a record format from just a list of words" is completely useless, especially given that right now there's next to *zero* access to bibliographic data from libraries. Maybe the way to go is to just make the MARCXML available via OAI-PMH and OpenSearch and leave it at that.
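To make that "just expose the MARCXML" option concrete: an OAI-PMH harvest is nothing more than a few well-known query parameters on a base URL. Here's a minimal sketch in Python; the base URL is a made-up placeholder, and note that the `marcxml` metadataPrefix, while common, is repository-specific (a real harvester would check ListMetadataFormats first).

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint -- substitute a real OAI-PMH base URL.
BASE_URL = "http://example.org/oai"

def oai_request(verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **params})
    return f"{BASE_URL}?{query}"

# Harvest everything as MARCXML. The 'marcxml' prefix is an assumption;
# repositories advertise their actual prefixes via ListMetadataFormats.
url = oai_request("ListRecords", metadataPrefix="marcxml")
print(url)
```

That's the whole protocol surface from the client side: six verbs, a handful of arguments, plain HTTP GET. It's a very low bar for "next to zero access" to clear.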
> A more rational approach, IMO, would create a general description set (probably numbering 20-50), then expanding that for more detail and for different materials. Users of the sets could define the "zones" they wish to use in an application profile, so no one would have to carry around data elements that they are sure they will not use. It would also provide a simple but compatible set for folks who don't want to do the whole "library description" bit.
I agree with this 100%, and conceptually that's what DC and DCTERMS seemed to be building toward, at least to me. This seems to parallel the MARC approach to refinement, which can be expressed as either a hierarchy or a set of independent assertions. Moreover, it's format-independent, so it could be serialized as XML, or RDF, or JSON for that matter. Is this what the RDA entities are supposed to achieve?
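To illustrate what I mean by "hierarchy or a set of independent assertions": in DCTERMS, `dcterms:issued` is declared a subproperty (refinement) of `dcterms:date`, so a consumer that only understands the broader term can fall back to it, while the data itself stays a flat list of property/value pairs. A rough sketch (the refinement table below mirrors the published DCTERMS subproperty relations; the values are invented placeholders):

```python
# Refinement hierarchy: each entry maps a narrower DCTERMS property
# to the broader property it refines (per the DCTERMS definitions).
REFINES = {
    "dcterms:issued": "dcterms:date",
    "dcterms:created": "dcterms:date",
}

def generalize(prop):
    """Walk refinements up to the broadest known property."""
    while prop in REFINES:
        prop = REFINES[prop]
    return prop

# The record itself is just independent assertions -- no tree required.
assertions = [
    ("dcterms:issued", "1997"),       # placeholder value
    ("dcterms:title", "Example"),     # placeholder value
]

# A "dumb" consumer that only knows the broad terms can still use the data.
flat = [(generalize(p), v) for p, v in assertions]
print(flat)
```

The point is that the refinement knowledge lives in the vocabulary, not in the record format, which is exactly what makes the vocabulary approach serialization-independent.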
Let me give another example: the Open Library API returns a JSON tree, e.g. http://openlibrary.org/books/OL1M.json
But what schema is this? And if it doesn't conform to a standard schema, does that make it useless? If it were based on DCTERMS, at least I'd have a reference at http://dublincore.org/documents/dcmi-terms/ to define the semantics being used (and an RDF namespace at http://purl.org/dc/terms/ to boot).
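For contrast, here's a sketch of what a DCTERMS-grounded JSON record might look like. The property names are real DCTERMS terms (defined at the URL above); the `@context` convention is borrowed from JSON-LD-style practice, and all the values are invented placeholders, not actual Open Library data.

```python
import json

# Hypothetical book description keyed by DCTERMS property names.
# The terms' semantics come from the published DCTERMS documentation;
# the values here are placeholders for illustration only.
record = {
    "@context": {"dcterms": "http://purl.org/dc/terms/"},
    "dcterms:title": "Example Title",
    "dcterms:creator": "Example Author",
    "dcterms:issued": "1997",
}

serialized = json.dumps(record, indent=2)
print(serialized)
```

Any consumer seeing `dcterms:title` can look up exactly what that property means, which is the whole difference between "a JSON tree" and "a JSON tree with defined semantics."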
MJ