Hi Rob,

You wrote:

>> A format should be described with a schema (XML Schema, OWL etc.) or at
>> least a standard. Mostly this schema already has a namespace or similar
>> identifier that can be used for the whole format.
>
> This is unfortunately not the case.

It is mostly the case, but people like to misinterpret schemas and
tailor them to their needs.

>> For instance MODS Version 3 (currently 3.0, 3.1, 3.2, 3.4) has the XML
>> Namespace http://www.loc.gov/mods/v3 so this is the best identifier to
>> identify MODS.
>
> And this is a perfect example of why this is not the case.
>
> The same mods schema (let alone namespace) defines TWO formats, mods and
> modsCollection.

That's your interpretation. According to the schema, the MODS format
*is* either a single mods element or a modsCollection element. That's
exactly what you can refer to with the namespace identifier
http://www.loc.gov/mods/v3.

If you need to identify only the specific element 'mods' of the format,
then you need another identifier. Up to now there is no default way to
create an identifier for a specific element in an XML format, see

http://www.w3.org/TR/webarch/#xml-fragids

But if the MODS specification defines that you can refer to any element
with a URI fragment identifier, then the right identifier would be

http://www.loc.gov/mods/v3#mods

You wrote:

> I totally agree that it's an awful design choice. However it's a
> demonstration that XML namespaces _do not identify format_. And
> hence, we need another identifier which is not the namespace of
> the top level element.

The namespace http://www.loc.gov/mods/v3 of the top level element
'mods' does not identify the top level element but the MODS *format*
itself (in any of the versions 3.0-3.4). This format *includes* the top
level element 'mods'.

> Also consider the following more hypothetical, but perfectly feasible
> situations:
>
> * One namespace is used to define two _totally_ separate sets of
> elements. There's no reason why this can't be done.

Ok, let A and B be two formats with two totally separate sets of
elements (and rules for how to use them). If you put them into one
namespace, then you get a new format C that is the union of A and B.

> * One namespace defines so many elements that it's meaningless to call
> it a format at all. Even though the top level tag might be the same,
> the contents are so varied that you're unable to realistically process
> it.

Sad but true: the word "format" in the context of library applications
does not make sense anyway in most cases. Technically a format is just
a set of possible instances, defined as a formal language or with any
other type of specification. The problem with library formats is that
many people refer to them without providing a proper specification.

Coming back to the MODS example: if the SRU Schema registry lists
"info:srw/schema/1/mods-v3.3" as the identifier for "MODS Schema
Version 3.3" with a pointer to the XML Schema
"http://www.loc.gov/standards/mods/v3/mods-3-3.xsd", then *any* XML
document that validates against this schema must be considered to be a
MODS 3.3 document, either with 'mods' or with 'modsCollection' as root
element.

Greetings
Jakob

-- 
Jakob Voß <[log in to unmask]>, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de
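
P.S. To make the distinction between "namespace identifies the format"
and "namespace plus something else identifies the root element" more
concrete, here is a minimal sketch using only the Python standard
library. The function name root_identity and the two skeletal sample
documents are made up for illustration; the samples are not meant to be
schema-valid MODS, and the "#localname" form is only the hypothetical
fragment convention discussed above, not something the MODS
specification is known to define.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

def root_identity(xml_text):
    """Return (namespace, local name) of the document's root element."""
    root = ET.fromstring(xml_text)
    tag = root.tag  # ElementTree writes qualified names as "{namespace}local"
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
    else:
        namespace, local = "", tag
    return namespace, local

# Hypothetical, skeletal documents: same namespace, different root element.
single = '<mods xmlns="http://www.loc.gov/mods/v3"/>'
collection = (
    '<modsCollection xmlns="http://www.loc.gov/mods/v3">'
    "<mods/>"
    "</modsCollection>"
)

for doc in (single, collection):
    namespace, local = root_identity(doc)
    # The namespace alone says "this is MODS v3"; the assumed fragment
    # form additionally says which root element was used.
    print(namespace == MODS_NS, namespace + "#" + local)
```

Both documents report the same namespace, so a client that only needs to
know "is this MODS v3?" can stop there. A client that cares about the
root element has to look at the local name as well, which is the extra
identifier discussed above. Note that this only inspects the root
element; it does not validate against the XML Schema.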