Houghton,Andrew wrote:
> Lets separate your argument into two pieces. Identification and
> resolution.  The DOI is the identifier and it inherently doesn't
> tie itself to any resolution mechanism.  So creating an info URI
> for it is meaningless, it's just another alias for the DOI.  I 
> can create an HTTP resolution mechanism for DOI's by doing:
>
> http://resolve.example.org/?doi=10.1111/j.1475-4983.2007.00728.x
>
> or
>
> http://resolve.example.org/?uri=info:doi/10.1111/j.1475-4983.2007.00728.x
>
> since the info URI contains the "natural" DOI identifier, wrapping it
> in a URI scheme has no value when I could have used the DOI identifier
> directly, as in the first HTTP resolution example.
>   

I disagree that wrapping it in a URI scheme has no value.  We have a 
great deal of software and many schemas built to store URIs. Even when 
they don't know what a given URI is or what can be done with it, the 
infrastructure for dealing with URIs is already in place.

So there is value in wrapping a 'natural' identifier in a URI, even if 
that URI does not carry its own resolution mechanism with it. I have 
run into this in several places in my own work.

I share Mike's concerns about tying resolution to identification in one 
mechanism.  As a sort of general principle or 'pattern' or design, 
trying to make one mechanism do two jobs at once is a 'bad smell'.  It's 
in fact (I hope this isn't too far afield) how I'd sum up much of the 
failure of AACR2/MARC, involving our 'controlled headings' (see me 
expanding on this in some blog posts at 
http://bibwild.wordpress.com/2008/01/17/identifiers-and-display-labels-again/).    


On the other hand, it is awfully _convenient_ to combine these two 
functions in one mechanism. And convenience does matter too.

I can see both sides. So I think we just do what feels right, and when 
we all disagree on what feels right, we pick one. I don't share the 
opinion of those who think it's obvious that everything should be an 
http uri, nor do I share the opinion of those who think it's obvious 
that this is a disaster.

DOI is definitely one good example of where One Canonical Resolution 
fails.  The DOI _resolution_ system fails for me -- it does not reliably 
or predictably deliver the right document for my users.  But a DOI as an 
identifier is still useful for me.  Even if that DOI were expressed in a 
URI as http://dx.doi.org/resolve/10.1111/j.1475-4983.2007.00728.x, I 
STILL wouldn't actually use the HTTP server at dx.doi.org to resolve 
it.  I'd extract the actual DOI out of it, and use a different 
resolution mechanism.
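
As a rough sketch of that extraction step (Python; the regular 
expression is a simplified approximation of DOI syntax, and 
resolve.example.org is the placeholder resolver from the quoted 
message, not a real service), the same bare DOI can be pulled out of 
any of the forms discussed here and handed to a resolver of one's own 
choosing:

```python
import re
from urllib.parse import quote, unquote

# Simplified DOI shape: "10.", a numeric registrant code, "/", then a suffix.
# This is an approximation for illustration, not the full DOI syntax.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_doi(identifier):
    """Return the bare DOI inside a bare DOI, an info: URI, or an http(s) URI.

    Returns None if nothing DOI-shaped is found.
    """
    match = DOI_PATTERN.search(unquote(identifier))
    return match.group(0) if match else None


def local_resolver_url(identifier, base="http://resolve.example.org/"):
    """Build a lookup URL for a hypothetical local resolver (an assumption)."""
    doi = extract_doi(identifier)
    if doi is None:
        raise ValueError("no DOI found in %r" % identifier)
    # Percent-encode the DOI so the "/" survives as a query parameter value.
    return base + "?doi=" + quote(doi, safe="")


for form in (
    "10.1111/j.1475-4983.2007.00728.x",
    "info:doi/10.1111/j.1475-4983.2007.00728.x",
    "http://dx.doi.org/resolve/10.1111/j.1475-4983.2007.00728.x",
):
    print(local_resolver_url(form))
```

All three forms yield the same lookup URL, which is the point: the 
identifier survives whichever wrapper it arrives in, and the choice of 
resolver stays with the consumer.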

Another case to think about: what happens when the protocol for 
resolution changes?  Even now, a resolution service could start 
offering, or insisting upon, resolution over https.  But all those 
existing identifiers expressed as http URIs should not change, since 
they are meant to be persistent. So it's already possible for an 
identifier that was originally intended to describe its own resolution 
to be slightly wrong.  Is this confusing? In the future, maybe we'll 
have something other than http entirely.
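
To make the http/https problem concrete, here is a minimal sketch 
(Python) of the kind of workaround a consumer ends up writing. 
Treating two URIs that differ only in scheme as "the same identifier" 
is a local policy assumed for illustration; generic URI comparison 
treats them as different URIs. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def identity_key(uri):
    """Return a comparison key that ignores the http/https distinction.

    Assumption: for http(s) identifiers, only host and path carry identity.
    Any other scheme (info:, urn:, ...) is compared as the full string.
    """
    parts = urlsplit(uri)
    if parts.scheme in ("http", "https"):
        return ("web", parts.netloc.lower(), parts.path)
    return ("other", uri)


def same_identifier(a, b):
    """True if two URIs name the same thing under the policy above."""
    return identity_key(a) == identity_key(b)


old = "http://dx.doi.org/10.1111/j.1475-4983.2007.00728.x"
new = "https://dx.doi.org/10.1111/j.1475-4983.2007.00728.x"
print(old == new)                 # plain string comparison
print(same_identifier(old, new))  # comparison under the local policy
```

The need for such a shim is the "slightly wrong" part: the stored 
string no longer matches how resolution is actually done, so identity 
has to be recovered by ignoring part of the identifier.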


Jonathan