LISTSERV 16.5 - CODE4LIB Archives

Hi Ray -

At Thu, 2 Apr 2009 13:48:19 -0400,
Ray Denenberg, Library of Congress wrote:
> 
> You're right, if there were a "web:"  URI scheme, the world would be a 
> better place.   But it's not, and the world is worse off for it.

Well, the original concept of the ‘web’ was, as I understand it, to
bring together all the existing protocols (gopher, ftp, etc.), with
the new one in addition (HTTP), with one unifying address scheme, so
that you could have this ‘web browser’ that you could use for
everything. So web: would have been nice, but probably wouldn’t have
been accepted.

As it turns out, HTTP won overwhelmingly, and the older protocols died
off.

> It shouldn't surprise anyone that I am sympathetic to Karen's
> criticisms. Here is some of my historical perspective (which may
> well differ from others').
> 
> Back in the old days, URIs (or URLs) were protocol based. The ftp
> scheme was for retrieving documents via ftp. The telnet scheme was
> for telnet. And so on. Some of you may remember the ZIG (Z39.50
> Implementors Group) back when we developed the z39.50 URI scheme,
> which was around 1995. Most of us were not wise to the ways of the
> web that long ago, but we were told, by those who were, that
> "z39.50r:" and "z39.50s:" at the beginning of a URL are explicit
> indications that the URI is to be resolved by Z39.50.
> 
> A few years later the semantic web was conceived and alot of SW
> people began coining all manner of http URIs that had nothing to do
> with the http protocol. By the time the rest of the world noticed,
> there were so many that it was too late to turn back. So instead,
> history was altered. The company line became "we never told you that
> the URI scheme was tied to a protocol".
> 
> Instead, they should have bit the bullet and coined a new scheme.  They 
> didn't, and that's why we're in the mess we're in.

Not knowing the details of the history, your account seems correct to
me, except that I don’t think the web people tried to alter history.

I think of the web as having been a learning experience for all of us.
Yes, we used to think that the URI was tied to the protocol. But we
have learned that it doesn’t need to be, that HTTP URIs can be just
identifiers which happen to be dereferenceable at the moment using the
HTTP protocol.

And it became useful to begin identifying lots of things, people and
places and so on, using identifiers, and it also seemed useful to use
a protocol that existed (HTTP), instead of coming up with the
Person-Metadata Transfer Protocol and inventing a new URI scheme
(pmtp://...) to resolve metadata about persons. HTTP doesn’t care
what kind of data it is sending down the line; it can happily send
metadata about people.
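
As a rough illustration of that point, here is a minimal Python sketch
that builds an ordinary HTTP request for metadata about a person. The
URI http://example.org/people/erik and the choice of the
application/rdf+xml media type are assumptions made up for the example;
the request is only constructed, never sent.

```python
from urllib.request import Request

# Hypothetical identifier for a person; example.org is a placeholder domain.
PERSON_URI = "http://example.org/people/erik"


def metadata_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) an HTTP GET asking for data in media_type."""
    # The Accept header is the only thing that says "metadata, please";
    # the protocol itself is plain HTTP, with no new URI scheme involved.
    return Request(uri, headers={"Accept": media_type}, method="GET")


req = metadata_request(PERSON_URI)
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Nothing here is specific to people: the same request shape would work
for a place or a book, which is roughly why a pmtp:// scheme seems
unnecessary.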

But that is how things grow; the http:// at the beginning of a URI may
eventually be a spandrel, when HTTP is dead and buried. And people
will wonder why the address http://dx.doi.org/10.1111/xxx has those
funny characters in front of it. And doi.org will be long gone,
because they ran out of money, and their domain was taken over by
squatters, so we all had to agree to alter our browsers to include an
override to not use DNS to resolve the dx.doi.org domain but instead
point to a new, distributed system of DOI resolution.
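
A minimal Python sketch of what such an override might look like on the
client side follows. The replacement host resolver.example.org, the
override table, and the function name are all hypothetical; this only
rewrites the URI string and says nothing about how a real distributed
resolver would work.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical override table: legacy host -> (new scheme, new host).
RESOLVER_OVERRIDES = {"dx.doi.org": ("https", "resolver.example.org")}


def apply_override(uri, overrides=RESOLVER_OVERRIDES):
    """Return the URI to actually fetch, leaving the identifier untouched."""
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    if host not in overrides:
        return uri  # no override registered; resolve through DNS as usual
    scheme, new_host = overrides[host]
    # Simplification: any port or userinfo in the original URI is dropped.
    return urlunsplit((scheme, new_host, parts.path, parts.query, parts.fragment))


print(apply_override("http://dx.doi.org/10.1111/xxx"))
```

The identifier people cite stays http://dx.doi.org/10.1111/xxx; only the
place the client goes to fetch it changes.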

We will need to fix these problems as they arise.

In my opinion, if we are interested in identifier persistence, clarity
about the difference between things and information about things,
creating a more useful web (of data), and the other things we ought to
be interested in, our time is best spent worrying about these things,
and how they can be built on top of the web. Our time is not well
spent in coming up with new ways to do things that the web already does
for us.

For instance: if there is concern that HTTP URIs are not seen as being
persistent, it would be useful to try to add a method to HTTP which
indicated the persistence of an identifier. This way browsers could
display a little icon that indicated that the URI was persistent. A
user could click on this icon and get information about the
institution which claimed persistence for the URI, what the level of
support was, what other institutions could back up that claim, etc.
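
One way this could look, purely as a sketch: the server sends a response
header carrying the claim, and the client parses it before deciding
whether to show the icon. The header name Persistence-Claim and its
fields (institution, level, backup) are invented here for illustration
and are not part of HTTP or of any existing proposal that I know of.

```python
# Hypothetical header name; not a registered HTTP header field.
CLAIM_HEADER = "Persistence-Claim"


def parse_persistence_claim(headers):
    """Parse 'key=value; key=value' from the hypothetical claim header.

    Returns a dict of fields, or None when the server makes no claim.
    `headers` is a plain dict, so the lookup here is case-sensitive.
    """
    raw = headers.get(CLAIM_HEADER)
    if raw is None:
        return None
    claim = {}
    for field in raw.split(";"):
        key, sep, value = field.partition("=")
        if sep:  # skip fragments that are not key=value pairs
            claim[key.strip().lower()] = value.strip()
    return claim


response_headers = {
    "Content-Type": "text/html",
    CLAIM_HEADER: "institution=Example Library; level=permanent; backup=Example Archive",
}
claim = parse_persistence_claim(response_headers)
print(claim)
```

A browser could show the icon whenever the result is not None, and use
the parsed fields to fill in the details behind it.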

Our time would not be well spent coming up with an elaborate scheme
for phttp:// URIs, creating a better DNS, with name control by a
better institution, and a better HTTP, with metadata, and a better
caching system, and so on. This is a lot of work, and along the way
you forget what you were trying to do in the first place, which is to
make HTTP URIs persistent.

best,
Erik