LISTSERV 16.5 - CODE4LIB Archives

> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Mike Taylor
> Sent: Wednesday, April 01, 2009 10:17 AM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] registering info: uris?
> 
> Houghton,Andrew writes:
>  > Again we have moved the discussion to a specific resolution
>  > mechanism, e.g., OpenURL.  OpenURL could have been defined
>  > differently, such that rft_id and rft_idScheme were available and
>  > you used the actual DOI value and specified the scheme of the
>  > identifier.  Then the issue of extraction of the identifier value
>  > from the URI goes away, because there is no URI needed.
> 
> Yes, that would have been OK, too.  But no doubt there are other
> contexts where it's possible to pass in an identifier without also
> being able to say "and by the way, it's of type XYZ".  Surely you
> don't disagree that it's good for identifiers to be self-describing?

OK, now we have moved the discussion back to identifiers rather than
resolution mechanisms.  I absolutely agree that it's good for
identifiers to be self-describing; I wasn't saying otherwise.
However, let's take the following URIs:

http://any.identifier.org/?scheme=doi&id=10.1111/j.1475-4983.2007.00728.x
info:doi/10.1111/j.1475-4983.2007.00728.x
urn:doi:10.1111/j.1475-4983.2007.00728.x

All three are self-describing URIs.  The HTTP URI does exactly the same thing
as the info URI without having to create a new URI scheme, e.g., info, which
is the argument the IETF and W3C made against the creation of info URIs.  Also,
since the info URI folks actually created a domain name for registering info
URIs, you could just as easily change "any.identifier.org" to "info-uri.info"
to achieve the same effect as the info URI.
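
To make the "self-describing" point concrete, here is a minimal Python sketch
that recovers the same (scheme, identifier) pair from each of the three URIs
above.  The function name describe() is hypothetical, and the parsing rules
for info: and urn: are simplified assumptions that only cover the shapes shown
in this message, not the full syntax of either scheme.

```python
from urllib.parse import urlsplit, parse_qs


def describe(uri):
    """Return (identifier scheme, identifier value) for the three URI
    shapes shown in this message. Simplified: not a general parser."""
    uri_scheme, _, rest = uri.partition(":")
    if uri_scheme in ("http", "https"):
        # HTTP form: the identifier type and value travel as query parameters.
        query = parse_qs(urlsplit(uri).query)
        return query["scheme"][0], query["id"][0]
    if uri_scheme == "info":
        # info form: namespace, then "/", then the identifier.
        namespace, _, identifier = rest.partition("/")
        return namespace, identifier
    if uri_scheme == "urn":
        # urn form: namespace, then ":", then the identifier.
        namespace, _, identifier = rest.partition(":")
        return namespace, identifier
    raise ValueError("unrecognised URI scheme: " + uri_scheme)


for uri in (
    "http://any.identifier.org/?scheme=doi&id=10.1111/j.1475-4983.2007.00728.x",
    "info:doi/10.1111/j.1475-4983.2007.00728.x",
    "urn:doi:10.1111/j.1475-4983.2007.00728.x",
):
    print(describe(uri))
```

Under those assumptions, all three calls yield the same pair, which is the
sense in which the HTTP URI carries no less information than the other two.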

> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Mike Taylor
> Sent: Wednesday, April 01, 2009 10:44 AM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] registering info: uris?
>
> Imagine your web-browser extended by a plugin that knows how to
> resolve particular kinds of info: URLs.  If you just paste the raw
> DOI into the URI bar, it won't have a clue what to do with it, but the
> wrapped-in-a-URI version stands alone and self-describes, so the
> plugin can pull it apart and say, "ah yes, this URI is a DOI, and I
> know how my user has configured me to resolve those."

Sure, you can imagine a web-browser plugin, but these things never happen,
either because a) developing one is costly, or b) for it to work you need
a plugin for every type of browser.  This is why the Architecture
of the Web document states:

  "While Web architecture allows the definition of new schemes, introducing 
   a new scheme is costly. Many aspects of URI processing are scheme-dependent, 
   and a large amount of deployed software already processes URIs of well-known 
   schemes. Introducing a new URI scheme requires the development and deployment 
   not only of client software to handle the scheme, but also of ancillary agents 
   such as gateways, proxies, and caches. See [RFC2718] for other considerations 
   and costs related to URI scheme design."

> What you seem to be suggesting (are you?) is that in the former case, the 
> resolver should recognise that the HTTP URL matches the regular expression
> 	^http://dx\.doi\.org\.(.*)$/
> and so extract the match and go off and do something else with it.

Back to resolution mechanisms... I'm not suggesting anything.  You are suggesting
a resolution mechanism implementation that uses regular expressions.  That is
one of many ways a resolution mechanism can retrieve the embedded DOI or identifier
of choice.  URI Templates are another, and given this URI:

http://any.identifier.org/?scheme=doi&id=10.1111/j.1475-4983.2007.00728.x

any Web library on the planet can pull the query parameters out of the URI.
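
As a rough illustration of the two approaches, the Python sketch below puts a
regular-expression extraction next to a plain query-string parse.  The pattern
is an assumed, tidied-up reading of the one quoted above (a "/" after the host
and no trailing "/"), the dx.doi.org URI in the test call is my own example,
and the function names are hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed reading of the quoted pattern: host, then "/", then the DOI.
DX_DOI = re.compile(r"^http://dx\.doi\.org/(.*)$")


def doi_from_dx(uri):
    """Regex approach: return the DOI embedded in a dx.doi.org URL, or None."""
    match = DX_DOI.match(uri)
    return match.group(1) if match else None


def id_from_query(uri):
    """Query-string approach: return (scheme, id), with None for a missing
    parameter. No knowledge of the host or path layout is needed."""
    params = parse_qs(urlsplit(uri).query)
    return params.get("scheme", [None])[0], params.get("id", [None])[0]


print(doi_from_dx("http://dx.doi.org/10.1111/j.1475-4983.2007.00728.x"))
print(id_from_query(
    "http://any.identifier.org/?scheme=doi&id=10.1111/j.1475-4983.2007.00728.x"
))
```

The regex version has to know the layout of one particular host's URLs, while
the query-string version only relies on the standard library's URL parsing.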

> as the "actionable identifier" might be something uglier...

A URI is just a token with a predefined syntax, per RFC 3986, used to identify a
resource, which can be an abstract "thing", e.g., a Real World Object, or a
representation of a resource, e.g., a Web Document.  One could postulate that all
URIs are ugly.  Whether a URI is ugly or not is irrelevant.


Andy.