Rob is correct on all points.

Namespace URIs can, in some cases, be overloaded to function as schema
identifiers.  But they absolutely can't be used blindly in this way
for arbitrary formats -- there are all kinds of potential gotchas.
That being so, I think it is wiser and more explicit _always_ to
define a separate identifier for a format.
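
To make the gotcha concrete, here is a rough Python sketch (mine, not
anything from the MODS documentation) showing that a 'mods' document and
a 'modsCollection' document report the same namespace URI, so the
namespace alone cannot tell them apart.  The two sample documents and
the "urn:example:..." format identifiers are invented purely for
illustration; no such identifiers are registered anywhere.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Two tiny sample documents (made up for this sketch) in the same namespace.
SINGLE = (
    f'<mods xmlns="{MODS_NS}">'
    "<titleInfo><title>A</title></titleInfo>"
    "</mods>"
)
COLLECTION = (
    f'<modsCollection xmlns="{MODS_NS}">'
    "<mods><titleInfo><title>A</title></titleInfo></mods>"
    "</modsCollection>"
)

# Hypothetical, explicitly separate format identifiers (not real URIs).
FORMAT_IDS = {
    (MODS_NS, "mods"): "urn:example:format:mods-record",
    (MODS_NS, "modsCollection"): "urn:example:format:mods-collection",
}


def root_name(xml_text):
    """Return (namespace URI, local name) of the document's root element."""
    tag = ET.fromstring(xml_text).tag  # ElementTree writes "{uri}local"
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


def format_id(xml_text):
    """Look up the illustrative format identifier for a document."""
    return FORMAT_IDS.get(root_name(xml_text))


# Same namespace, different top-level element, hence different format.
print(root_name(SINGLE))
print(root_name(COLLECTION))
print(format_id(SINGLE), format_id(COLLECTION))
```

Note that the lookup key here is the pair (namespace, root element), and
even that only works because this sketch assumes a single schema.  As
Rob points out below, the same names could be structured by some other
schema, which is one more reason to mint the format identifier
explicitly rather than derive it.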

 _/|_	 ___________________________________________________________________
/o ) \/  Mike Taylor    <[log in to unmask]>    http://www.miketaylor.org.uk
)_v__/\  "... currently trading under the name Gently for reasons which it
	 would be otiose, for the moment, to rehearse" -- Douglas Adams,
	 "Dirk Gently"


Rob Sanderson writes:
 > On Mon, 2009-05-11 at 14:53 +0100, Jakob Voss wrote:
 > 
 > > >> A format should be described with a schema (XML Schema, OWL etc.) or at 
 > > >> least a standard. Mostly this schema already has a namespace or similar 
 > > >> identifier that can be used for the whole format.
 > > > 
 > > > This is unfortunately not the case.
 > > 
 > > It is mostly the case - but people like to misinterpret schemas and 
 > > tailor them to their needs.
 > 
 > You're advocating an approach that "mostly" works, as opposed to one
 > that works in all cases?
 > 
 > 
 > > >> For instance MODS Version 3 (currently 3.0, 3.1, 3.2, 3.4) has the XML 
 > > >> Namespace http://www.loc.gov/mods/v3 so this is the best identifier to 
 > > >> identify MODS. 
 > > > 
 > > > And this is a perfect example of why this is not the case. 
 > > > The same mods schema (let alone namespace) defines TWO formats, mods and
 > > > modsCollection.
 > 
 > > That's your interpretation. According to the schema, the MODS format 
 > > *is* either a single mods-element or a modsCollection-element. 
 > 
 > According to the __schema__ yes.  Not according to the namespace. The
 > namespace is a collection of names only and says precisely nothing about
 > structure.
 > 
 > And, yes, given no definition of "format", my interpretation is that the
 > mods schema defines two formats, as it defines two top level elements
 > with different contents (eg one may contain the other).  This is
 > typically how people would define format in this context, I would say.  
 > 
 > This is, of course, tangential to the fact that you cannot use the __XML
 > Namespace__ as an identifier for the format, no matter how you define
 > it.
 > 
 > 
 > > That's 
 > > exactly what you can refer to with the namespace identifier 
 > > http://www.loc.gov/mods/v3.
 > 
 > No, that's a collection of elements, not a schema.
 > 
 > 
 > > If you need to identify the specific element 'mods' of the format only, 
 > > then you need another identifier.
 > 
 > Correct. I'm glad you agree with me.
 > 
 > Given that namespaces do not specify anything to do with structure, you
 > thus need a new identifier for EVERY element in a namespace as they
 > could be used as the top level tag of ANY schema.
 > 
 > There isn't a widely accepted identifier system for schemas, only schema
 > locations.  There are also many methods for defining schemas
 > (schematron, relax-ng, DTDs, xml schema) which can all define exactly
 > the same "format".
 > 
 > 
 > > But if the MODS specification defines that you can refer to any element 
 > > with a URI fragment identifier, then the right identifier would be 
 > > http://www.loc.gov/mods/v3#mods
 > 
 > That would be an identifier for the *element*.
 > 
 > > The namespace http://www.loc.gov/mods/v3 of the top level element 'mods' 
 > > does not identify the top level element but the MODS *format* (in any of 
 > > the versions 3.0-3.4) itself. This format *includes* the top level 
 > > element 'mods'.
 > 
 > No, it identifies a collection of names.  These names are structured
 > according to a schema, which is what we need an identifier for. Beyond
 > that, we may also need identifiers for which structure we mean within
 > the schema (eg mods vs modsCollection)
 > 
 > 
 > Rob