On 2/23/2012 3:53 PM, Karen Coyle wrote:
> Jonathan, while having these thoughts your Umlaut service did come to
> mind. If you ever have time to expand on how it could work in a wide
> open web environment, I'd love to hear it. (I know you explain below,
> but I don't know enough about link resolvers to understand what it
> really means from a short
> explanation. Diagrams are always welcome!)
I'm not entirely sure what is meant by 'wide open web environment.'
I mean, part of the current environment is that there's lots of stuff on
the web that is NOT free/open access; it's only available to certain
licensed people. AND libraries license a lot of this stuff on
behalf of their user groups. (Not just content, but sometimes services
too.) It's really that environment Umlaut is focused on; if that
changed, what would be required would have little to do with Umlaut as
it is now, I think.
But I don't think anyone anticipates that changing anytime soon, and I
don't think that's what Karen means by 'wide open web environment.'
So if that continues to be the case... I think Umlaut has a role
working pretty much as it does now. (Maybe I'm not sufficiently
forward-thinking.)
I will admit that, while I come across lots of barriers in implementing
Umlaut, I have yet to come across anything that makes me think "this
would be a lot easier if only there were more RDF." Maybe it's a failure
of imagination on my part. More downloadable data, sure. More HTTP
APIs, even more so. And Umlaut already takes advantage of such things,
especially the APIs, more than the downloadable data (it turns out it's
a lot more 'expensive' to try to download data and do something with it
yourself, compared to using an API someone else provides to do the heavy
lifting for you). But has it ever been much of a problem that the data
is in some format other than RDF, such that it would be easier in RDF?
Not from my perspective, not really. (In some ways, RDF is harder to
deal with than other formats, from where I'm coming from. If a service
does offer data in RDF triples as well as something else, I'm likely to
choose the something else).
This may seem ironic, because Umlaut is very concerned with 'linking
data', in the sense of figuring out whether this record from the local
catalog represents 'the same thing' as this record from Amazon, as this
record from Google Books, or HathiTrust. Or whether this citation that
came in as an OpenURL represents the 'same thing' as a record in a
vendor database, or Mendeley, or whatever.
There are real barriers in making this determination; they wouldn't be
solved if everything were just in RDF, but they _would_ be solved if
there were more consistent use of identifiers, for sure. I DO think
"this would be easier if only there were more consistent use of
identifiers" all the time.
That experience with Umlaut is also what leads me to believe that the
WEMI ontology is not only not contradictory to "linked data
applications", but _crucial_ to them: without it, it's very hard to tell
when something is "the same thing". There are lots of times Umlaut ends
up saying "Okay, I found something that I _think_ is at least an edition
of the same thing you care about, but I really can't tell you if it's
the _same_ edition you are interested in or not."
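To make that edition-vs-work distinction concrete, here's a toy sketch in Ruby (Umlaut is a Rails app, but this is NOT Umlaut's actual code; the field names, identifiers, and records are all made up for illustration). An edition-level identifier like an ISBN lets you say "same edition"; a shared work-level identifier only lets you say "some edition of the same work":

```ruby
# Hypothetical records from two different systems. Field names and
# identifier values are made up; real records are much messier.
catalog_record = { isbn: "9780000000001", work_id: "w-570597", title: "Example Title" }
google_record  = { isbn: "9780000000002", work_id: "w-570597", title: "Example title" }

# Returns :same_edition, :same_work_maybe, or :no_match.
def match_level(a, b)
  if a[:isbn] && a[:isbn] == b[:isbn]
    :same_edition      # edition-level identifier matched
  elsif a[:work_id] && a[:work_id] == b[:work_id]
    :same_work_maybe   # some edition of the same work; can't say which
  else
    :no_match          # without shared identifiers, we're stuck guessing
  end
end

puts match_level(catalog_record, google_record)  # => same_work_maybe
```

The point is just that the answer you can give depends entirely on which level of identifier the two systems happen to share.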
So, yeah, Umlaut would work _better_ with more widespread use of
identifiers, and even better with consistent use of common identifiers.
I guess that's maybe where RDF could come in, in expressing
determinations people have made of "this identifier in system X
represents the same 'thing' as this other identifier in system Y"
(someone would still have to MAKE those determinations, RDF would just
be one way to then convey that determination, and I wouldn't
particularly care if it was conveyed in RDF or something else). So
anyway, it would work better with some of that stuff, but would it work
substantially _differently_? Not so much.
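To illustrate what I mean about the determination mattering more than the serialization (all identifiers and URIs here are made up for illustration), the same human-made equivalence assertion could be conveyed as an RDF triple or as a plain lookup table, and a consumer could use either just as easily:

```ruby
# The same human-made determination, conveyed two ways.
# As an RDF triple (Turtle), it might look like:
#   <http://example.org/systemX/id/12345> owl:sameAs <http://example.org/systemY/id/abc-678> .
# As a plain old lookup table, it's just:
SAME_AS = {
  "systemX:12345" => "systemY:abc-678",
}

def equivalent_id(id)
  SAME_AS[id] || SAME_AS.key(id)  # check both directions of the assertion
end

puts equivalent_id("systemX:12345")  # => systemY:abc-678
```

Either way, a person or process still had to decide those two identifiers name the same thing; the format is the easy part.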
Ah, if web pages started having more embedded machine-readable data with
citations and identifiers of "what is being looked at" (microdata, RDFa,
whatever), that would make it easier to get a user from some random web
page _to_ an institution's Umlaut. That's one thing that would be nice.
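For instance, some publisher pages already embed citation data in meta tags (the `citation_*` tags Google Scholar looks for are one real convention). Here's a rough sketch of how software could pull an identifier out of such a page and build a link to an institution's resolver; the page snippet, DOI, and resolver URL are all made up, and real code should use an HTML parser rather than a regex:

```ruby
require "erb"

# A made-up page fragment using citation meta tags (one real convention
# for embedding citation data in a web page):
page = <<~HTML
  <meta name="citation_title" content="Example Article Title">
  <meta name="citation_doi" content="10.1000/xyz123">
HTML

# Crude extraction for illustration only.
doi = page[/name="citation_doi"\s+content="([^"]+)"/, 1]

# With an identifier in hand, sending the user to their institution's
# resolver is just building a URL (the base URL is hypothetical):
resolver = "https://resolver.example.edu/umlaut"
link = "#{resolver}?rft_id=#{ERB::Util.url_encode("info:doi/" + doi)}"
puts link
```

The hard part today isn't building that link; it's that most pages don't embed the citation in the first place.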
You may (or may not) find the "What is Umlaut, Anyway?" article on the
Umlaut wiki helpful.
https://github.com/team-umlaut/umlaut/wiki/What-is-Umlaut-anyway
And there's really not much to understand about 'link resolvers' for
these purposes, except that there's this thing called OpenURL (really
bad name), which is really just a way for one website to hyperlink to
another website and pass a machine-readable citation to it. The
application receiving the machine-readable citation then tries to get
the user to appropriate access or services for it, with regard to
institutional entitlements. That's about it; if you understand that, you
understand enough. Except that most commercially available 'link
resolvers' do a so-so job with scholarly article citations and full
text, and don't even try much with anything else. (In part because it's
_hard_, especially to provide an out-of-the-box solution, given
libraries' diverse, messed-up infrastructures.)
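To show how simple the citation-passing part really is, here's a sketch of what a resolver receives: an OpenURL is mostly just a query string of key-value citation fields (the `rft.*` keys are from the real OpenURL KEV convention, though a real OpenURL carries more, like `url_ver=Z39.88-2004`; the citation values here are made up). Parsing it back into a citation takes one standard-library call:

```ruby
require "uri"

# A simplified OpenURL-style query string, like one website might use to
# pass a citation to a link resolver (made-up citation values):
query = "rft.genre=article&rft.atitle=Some+Article&rft.jtitle=Some+Journal" \
        "&rft.volume=12&rft.spage=45&rft_id=info%3Adoi%2F10.1000%2Fxyz123"

# The receiving application's first job is just turning that back into
# a structured citation:
citation = URI.decode_www_form(query).to_h
puts citation["rft.atitle"]  # => Some Article
puts citation["rft_id"]      # => info:doi/10.1000/xyz123
```

Everything hard about a link resolver comes _after_ this step: figuring out what access and services the institution actually has for that citation.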