On Mon, 2009-05-11 at 11:31 +0100, Jakob Voss wrote:
> A format should be described with a schema (XML Schema, OWL etc.) or at
> least a standard. Mostly this schema already has a namespace or similar
> identifier that can be used for the whole format.
This is unfortunately not the case.
> For instance MODS Version 3 (currently 3.0, 3.1, 3.2, 3.4) has the XML
> Namespace http://www.loc.gov/mods/v3 so this is the best identifier to
> identify MODS.
And this is a perfect example of why this is not the case.
The same MODS schema (let alone namespace) defines TWO formats: mods and
modsCollection.
To quote from the schema:
------------------------------------------------
***** An instance of this schema is
(1) a single MODS record:
-->
<xsd:element name="mods" type="modsType"/>
<!--
or
(2) a collection of MODS records:
-->
<xsd:element name="modsCollection">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="mods" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<!--
***** End of "instance" definition
-------------------------------------------------
So you're using the same identifier to identify two different things at
the same time.
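
To make that concrete, here is a rough sketch in Python (standard
library only). The two sample documents are made up for illustration and
are not checked against the real schema, and namespace_and_root is just a
hypothetical helper name. It shows that both kinds of instance report the
same namespace and differ only in the root element:

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Two hand-written sample instances (illustrative, not schema-validated).
single_record = (
    f'<mods xmlns="{MODS_NS}">'
    "<titleInfo><title>A</title></titleInfo>"
    "</mods>"
)
collection = (
    f'<modsCollection xmlns="{MODS_NS}">'
    "<mods><titleInfo><title>A</title></titleInfo></mods>"
    "<mods><titleInfo><title>B</title></titleInfo></mods>"
    "</modsCollection>"
)


def namespace_and_root(xml_text):
    """Return (namespace URI, local name) of the document's root element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace, local_name = root.tag[1:].split("}", 1)
    else:
        namespace, local_name = "", root.tag
    return namespace, local_name


ns_a, root_a = namespace_and_root(single_record)
ns_b, root_b = namespace_and_root(collection)

# Same namespace for both documents, but different root elements.
print(ns_a == ns_b)
print(root_a, root_b)
```

A client that looks only at the namespace cannot tell whether it is about
to receive one record or a whole collection of them.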
We discussed this a lot during the development of SRU and there simply
isn't an existing identifier for an XML 'format'.
Also consider the following more hypothetical, but perfectly feasible
situations:
* One namespace is used to define two _totally_ separate sets of
elements. There's no reason why this can't be done.
* One namespace defines so many elements that it's meaningless to call
it a format at all. Even though the top level tag might be the same,
the contents are so varied that you're unable to realistically process
it.
Rob