On Thu, Apr 30, 2009 at 6:37 PM, Peter Noerr wrote:

> Some further observations. So far this threadling has mentioned only trying to unify two different sets of identifiers. However there are a much larger number of them out there (and even larger numbers of schemas and other "standard-things-that-everyone-should-use-so-we-all-know-what-we-are-talking-about") and the problem exists for any of these things (identifiers, etc.) where there are more than one of them. So really unifying two sets of identifiers, while very useful, is not actually going to solve much.

Well, that wasn't really my intention (although I thought it wouldn't be a bad start). What I would really prefer is that we compile these into a single vocabulary that could be used as a reference point.

> Is there any broader methodology we could approach which potentially allows multiple unifications or (my favourite) cross-walks. (Complete unification requires everybody agrees and sticks to it, and human history is sort of not on that track...) And who (people and organizations) would undertake this?

Realistically, we could achieve this via the NSDL MetadataRegistry and SKOS. We could have something like:

<http://purl.org/DataFormat/marcxml>
  . <skos:prefLabel> "MARC21 XML" .
  . <skos:notation> "info:srw/schema/1/marcxml-v1.1" .
  . <skos:notation> "info:ofi/fmt:xml:xsd:MARC21" .
  . <skos:notation> "http://www.loc.gov/MARC21/slim" .
  . <skos:broader> http://purl.org/DataFormat/marc .
  . <skos:description> "..." .

Or maybe those skos:notations should be owl:sameAs -- anyway, that's not really the point. The point is that all of these various identifiers would be valid, but we'd have a real way of knowing what they actually mean. Maybe this is what you mean by a crosswalk.

> Ross' point about a lightweight approach is necessary for any sort of adoption, but this is a problem (which plagues all we do in federated search) which cannot just be solved by another registry.
> Somebody/organisation has to look at the identifiers or whatever and decide that two of them are identical or, worse, only partially overlap and hence scope has to be defined. In a syntax that all understand of course. Already in this thread we have the sub/super case question from Karen (in a post on the openurl (or Z39.88 <sigh> - identifiers!) listserv). And the various identifiers for MARC (below) could easily be for MARC-XML, MARC21-ISO2709, MARCUK-ISO2709. Now explain in words of one (computer understandable) syllable what the differences are.

This is indeed a valid point. However, the two registries that already exist have this sort of granularity there (which is why they weren't exactly describing the *same* ONIX version). I guess I'm not really as worried about this problem because I think that if people actually use it, and the system is flexible and editable, the semantics will be worked out.

> I'm not trying to make problems. There are problems and this is only a small subset of them, and they confound us every day. I would love to adopt standard definitions for these things, but which Standard? Because anyone can produce any identifier they like, we have decided that the unification of them has to be kept internal where we at least have control of the unifications, even if they change pretty frequently.

Right, which is why I'm feeling less discriminatory on which one is "right".

-Ross.
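
For illustration only, here is a minimal Python sketch of the kind of lookup a shared record like the MARC21 XML one above would allow: any of the existing identifiers resolves to one reference URI, so two systems can tell they mean the same format. The class and function names (FormatConcept, build_index, resolve, same_format) are hypothetical, the purl.org URIs are the proposed ones from the example rather than registered identifiers, and nothing here reflects an actual NSDL MetadataRegistry API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class FormatConcept:
    """One entry in the shared vocabulary (hypothetical structure)."""
    uri: str                       # the reference URI for the format
    pref_label: str                # skos:prefLabel
    notations: Set[str] = field(default_factory=set)  # skos:notation values
    broader: Optional[str] = None  # skos:broader, if any


# Values copied from the example record in the message. The purl.org
# URIs are proposals from that example, not registered identifiers.
MARCXML = FormatConcept(
    uri="http://purl.org/DataFormat/marcxml",
    pref_label="MARC21 XML",
    notations={
        "info:srw/schema/1/marcxml-v1.1",
        "info:ofi/fmt:xml:xsd:MARC21",
        "http://www.loc.gov/MARC21/slim",
    },
    broader="http://purl.org/DataFormat/marc",
)


def build_index(concepts: Iterable[FormatConcept]) -> Dict[str, FormatConcept]:
    """Map every known identifier (reference URI or notation) to its concept."""
    index: Dict[str, FormatConcept] = {}
    for concept in concepts:
        index[concept.uri] = concept
        for notation in concept.notations:
            index[notation] = concept
    return index


def resolve(index: Dict[str, FormatConcept], identifier: str) -> Optional[str]:
    """Return the reference URI for an identifier, or None if it is unknown."""
    concept = index.get(identifier)
    return concept.uri if concept is not None else None


def same_format(index: Dict[str, FormatConcept], a: str, b: str) -> bool:
    """True only when both identifiers are known and name the same concept."""
    uri_a = resolve(index, a)
    uri_b = resolve(index, b)
    return uri_a is not None and uri_a == uri_b


INDEX = build_index([MARCXML])
```

With this index, resolve(INDEX, "info:ofi/fmt:xml:xsd:MARC21") and resolve(INDEX, "http://www.loc.gov/MARC21/slim") both return the marcxml reference URI, while an unregistered identifier returns None. The partial-overlap cases Peter raises are deliberately not modelled; they would need something richer than a flat notation list.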