I'm confused about the supposed distinction between content negotiation and an explicit content request in the URL. The reason I'm confused is that the response to content negotiation is supposed to carry a Content-Location header with a URL that is guaranteed to return the negotiated content. In other words, there *must* be a form of the URL that bypasses content negotiation: if you can do content negotiation, then you already have a URL form that doesn't require it.
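To illustrate Ralph's point, here is a minimal sketch (in Python, with made-up paths and media types, not any real system's behavior) of a handler that negotiates a format and then names the negotiation-free URL in Content-Location:

```python
# Hypothetical sketch: negotiate a media type, then return a
# Content-Location header pointing at the format-specific URL
# that bypasses negotiation entirely.

SUPPORTED = {
    "application/json": ".json",
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}

def negotiate(path, accept_header):
    """Pick the first supported type in the Accept header; return
    (status, headers) with a Content-Location that skips conneg."""
    for media_type in (part.split(";")[0].strip()
                       for part in accept_header.split(",")):
        if media_type in SUPPORTED:
            return 200, {"Content-Type": media_type,
                         "Content-Location": path + SUPPORTED[media_type]}
    return 406, {}  # Not Acceptable

status, headers = negotiate("/objects/42", "application/json, text/html")
```

A client that receives this response can reuse `headers["Content-Location"]` directly, with no Accept header at all.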
Ralph
________________________________________
From: Code for Libraries <[log in to unmask]> on behalf of Robert Sanderson <[log in to unmask]>
Sent: Friday, November 29, 2013 2:44 PM
To: [log in to unmask]
Subject: Re: The lie of the API
(posted in the comments on the blog and reposted here for further
discussion, if there's interest)
While I couldn't agree more with the post's starting point -- URIs identify
(concepts) and use HTTP as your API -- I couldn't disagree more with the
"use content negotiation" conclusion.
I'm with Dan Cohen in his comment regarding using different URIs for
different representations for several reasons below.
It's harder to implement content negotiation than your own API, because you
get to define your own API, whereas you have to follow someone else's rules
when you implement conneg. You can't get your own API wrong. I agree with
Ruben that HTTP is better than rolling your own proprietary API; we
disagree that conneg is the correct solution. The choice is between conneg
and regular HTTP, not between conneg and a proprietary API.
Secondly, you need to look at the HTTP headers and parse quite a complex
structure to determine what is being requested. You can't just put a file
in the file system, unlike with separate URIs for distinct representations,
where it just works; instead you need server-side processing. This also
makes it much harder to cache the responses, as the cache needs to
determine whether or not the representation has changed -- the cache also
needs to parse the headers rather than just comparing URI and content. For
large-scale systems like DPLA and Europeana, caching is essential for
quality of service.
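To make "parse quite a complex structure" concrete, here is a simplified sketch of what a server has to do with a real-world Accept header before it can decide anything (parameter handling is reduced to q values only; full HTTP allows accept-ext parameters and more):

```python
# Simplified Accept-header parser: turns the header into
# (media_range, q) pairs, sorted by preference. Real HTTP parsing
# must also handle extension parameters and specificity tie-breaking.

def parse_accept(header):
    ranges = []
    for part in header.split(","):
        pieces = part.strip().split(";")
        media_range, q = pieces[0].strip(), 1.0
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        ranges.append((media_range, q))
    # Higher q values win.
    return sorted(ranges, key=lambda r: r[1], reverse=True)

print(parse_accept("text/html;q=0.5, application/json, */*;q=0.1"))
```

Compare this with the "separate URI" alternative, where the server-side logic is the file system's path lookup and there is nothing to parse.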
How do you find out which formats are supported by conneg? By reading the
documentation. Which could just say "add .json on the end". The Vary header
tells you that negotiation in the format dimension is possible, just not
what to do to actually get anything back. There isn't a way to find this
out from HTTP automatically, so now you need to read both the site's docs
AND the HTTP docs. APIs, on the other hand, can make this discoverable:
consider OAI-PMH's ListMetadataFormats and SRU's Explain response.
Instead you can have a separate URI for each representation and link them
with Link headers, or just use a simple rule like adding '.json' on the
end. No need for complicated content negotiation at all. Link headers can
be added with a simple Apache configuration rule, and as they're static
they're easy to cache. So the server side is easy and the client side is
trivial, compared to being difficult at both ends with content negotiation.
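A sketch of how little code this approach needs on either end (the URLs and the second representation are illustrative; the Link header values are the static strings an Apache rule could equally emit):

```python
# The "separate URI per representation" approach in miniature.

def json_url(resource_url):
    # The entire client-side "API": add .json on the end.
    return resource_url + ".json"

def link_header(resource_url):
    # Static alternate links relating the representations; a cache
    # can store these verbatim, since they never vary by request.
    return ('<%s.json>; rel="alternate"; type="application/json", '
            '<%s.html>; rel="alternate"; type="text/html"'
            % (resource_url, resource_url))

print(json_url("http://example.org/objects/42"))
```

Because the Link header depends only on the URI, not on request headers, a cache can treat the response like any other static resource.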
It can also be useful to make statements about the different
representations, especially if you need to annotate the structure or
content. Or to share it -- you can't email someone a link that includes the
right Accept headers to send; as in the post, you have to send them a
command line like curl with -H.
An experiment for fans of content negotiation: have both .json and
302-style conneg from your original URI to that .json file. Advertise both.
See how many people do the conneg. If it's non-zero, I'll be extremely
surprised.
And a challenge: even with libraries there's still complexity in figuring
out how and what to serve. Find me sites that correctly implement *-based
fallbacks, or even process q values. I'll bet I can find 10 that do content
negotiation wrong for every 1 that does it correctly. I'll start:
dx.doi.org touts its content negotiation for metadata, yet doesn't
implement q values or *s. You have to go to the documentation to figure out
which Accept headers it will do string-equality tests against.
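For reference, here is roughly what the challenge asks servers to do, sketched in a few lines of Python (the offered types are made up; a full implementation must also break ties by specificity and handle media-type parameters):

```python
# Simplified matching that honors q values and falls back through
# type/* and */* wildcards, rather than string-equality on the
# whole Accept header.

def best_match(accept_header, offered):
    """Return the offered media type the client prefers most, or None."""
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        pieces = part.strip().split(";")
        pattern, q = pieces[0].strip(), 1.0
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        for mtype in offered:
            maintype = mtype.split("/")[0]
            if pattern in (mtype, maintype + "/*", "*/*") and q > best_q:
                best, best_q = mtype, q
    return best  # None means 406 Not Acceptable

offered = ["application/rdf+xml", "application/json"]
print(best_match("application/*;q=0.5, application/json", offered))
```

Even this stripped-down version is more logic than most of the sites in question implement, which is the point: a server doing string equality would fail on both the wildcard and the q value above.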
Rob
On Fri, Nov 29, 2013 at 6:24 AM, Seth van Hooland <[log in to unmask]>
wrote:
>
> Dear all,
>
> I guess some of you will be interested in the blogpost of my colleague
> and co-author Ruben regarding the misunderstandings on the use and abuse
> of APIs in a digital libraries context, including a description of both
> good and bad practices from Europeana, DPLA and the Cooper Hewitt museum:
>
> http://ruben.verborgh.org/blog/2013/11/29/the-lie-of-the-api/
>
> Kind regards,
>
> Seth van Hooland
> Chair of the Master in Information and Communication Science and
> Technology (MaSTIC)
> Université Libre de Bruxelles
> Av. F.D. Roosevelt, 50 CP 123 | 1050 Bruxelles
> http://homepages.ulb.ac.be/~svhoolan/
> http://twitter.com/#!/sethvanhooland
> http://mastic.ulb.ac.be
> 0032 2 650 4765
> Office: DC11.102