Hi,
I've been following this thread carefully, and am very interested. At UCLA, we have the Frontera collection (http://frontera.library.ucla.edu/), and we maintain a local set of authorities because the performers and publishers are more ephemeral than what's usually in LCNAF. So we're thinking of providing these values to others via an API or something similar, both to share what we know and to get input from others. That's our use case for publishing out. Curious about everyone's thoughts.
Best,
Lisa
On Feb 1, 2013, at 9:44 AM, Jason Ronallo <[log in to unmask]> wrote:
Ed,
Thank you for the detailed response. That was very helpful. Yes, it
seems like good Web architecture is the API. Sounds like it would be
easy enough to start somewhere and add features over time.
I could see how exposing this data in a crawlable way could provide
some nice indexed landing pages to help improve discoverability of
related collections. I wonder, though, who other than my own
institution would use such local authorities? Would
there really be other consumers? What's the likelihood that other
institutions will need to reuse my local name authorities?
Is the idea that, if enough of us publish our local data in this way,
there could be aggregators or other means to make it easier to reuse
it from a single source?
I can see the use case for a local authorities app. While I think it
would be cool to expose our local data to the world in this way, I'm
still trying to grasp the larger value proposition.
Jason
On Thu, Jan 31, 2013 at 5:59 AM, Ed Summers <[log in to unmask]> wrote:
Hi Jason,
Heh, sorry for the long response below. You always ask interesting questions :-D
I would highly recommend that vocabulary management apps like this
assign each entity an identifier that can be expressed as a URL.
If there is any kind of database backing the app you will get the
identifier for free (primary key, etc.). So, for example, let's say
you have a record for John Chapman, who is on the faculty at OSU; if
that record has a primary key of 123 in the database, you would have
a corresponding URL for it:
http://id.library.osu.edu/person/123
When someone points their browser at that URL they get back a nice
HTML page describing John Chapman. I would strongly recommend that
schema.org microdata and/or Open Graph protocol RDFa be layered into
the page, both for SEO purposes and for anyone who happens to be
scraping. I would also highly recommend adding a sitemap to enable
discovery and synchronization.
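To make that concrete, here's a minimal sketch of what the microdata
might look like (the markup is hypothetical; the property names come
from the schema.org Person type, but exactly what you mark up is up
to you):

<div itemscope itemtype="http://schema.org/Person">
  <h1 itemprop="name">John Chapman</h1>
  <p>Affiliation:
    <span itemprop="affiliation">Ohio State University</span></p>
  <a itemprop="url"
     href="http://id.library.osu.edu/person/123">permalink</a>
</div>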
Having that URL is handy because you can hang different
machine-readable formats off of it, which you can express as <link>s
in your HTML. For example, let's say you want to have JSON, XML, and
RDF representations:
http://id.library.osu.edu/person/123.json
http://id.library.osu.edu/person/123.xml
http://id.library.osu.edu/person/123.rdf
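The HTML page itself could then advertise those alternates with
something like this (a sketch; the media type for the RDF would
depend on which serialization you choose):

<link rel="alternate" type="application/json"
      href="http://id.library.osu.edu/person/123.json">
<link rel="alternate" type="application/xml"
      href="http://id.library.osu.edu/person/123.xml">
<link rel="alternate" type="application/rdf+xml"
      href="http://id.library.osu.edu/person/123.rdf">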
If you want to get fancy you can content negotiate between the
generic URL and the format-specific URLs, e.g.:

curl -i --header "Accept: application/json" http://id.library.osu.edu/person/123

HTTP/1.1 303 See Other
Date: Thu, 31 Jan 2013 10:47:44 GMT
Server: Apache/2.2.14 (Ubuntu)
Location: http://id.library.osu.edu/person/123.json
Vary: Accept
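A client that follows the redirect then just GETs the format-specific
URL, something like this (response abridged):

curl -i http://id.library.osu.edu/person/123.json

HTTP/1.1 200 OK
Content-Type: application/json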
But that's gravy.
What exactly you put in these representations is a somewhat open
question, I think. I'm a bit biased towards SKOS for the RDF because
it's lightweight, this is exactly its use case, it's flexible (you
can layer in other assertions easily), and (full disclosure) I helped
with its standardization. If you did go this route you could use
JSON-LD for the JSON, or just come up with something that works;
likewise for the XML. You might want to consider supporting JSON-P
for the JSON representation, so that it can be used from JavaScript
in other people's applications.
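For illustration, a SKOS record for person 123 might look something
like this in Turtle (a sketch only; the labels and note are
invented):

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://id.library.osu.edu/person/123>
    a skos:Concept ;
    skos:prefLabel "Chapman, John" ;
    skos:altLabel "Chapman, John P." ;
    skos:note "Faculty member at Ohio State University." .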
It might be interesting to come up with some norms here for
interoperability on a wiki somewhere, or maybe a prototype of some
kind. But the focus should be on what you need to actually use it in
some app that needs vocabulary management. Focusing on reusing work
that has already been done helps a lot too; I think that grounds
things significantly. I would be happy to discuss this further if you
want.
Whatever the format, I highly recommend you try to have the data link
out to other places on the Web that are useful. So, for example, the
record for John Chapman could link to his department page, blog,
VIAF, Wikipedia, Google Scholar profile, etc. This work tends to
require human eyes, even if helped by a tool (autosuggest, etc.), so
what you do may have to be limited, or at least treated as an ongoing
effort. Maintaining the links (link scrubbing) is an ongoing effort
too. But fitting your stuff into the larger context of the Web will
mean that other people will want to use your identifiers. It's the
dream of Linked Data, I guess.
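In the SKOS sketch above, those outbound links could be just a few
more triples (the VIAF number and URLs here are invented for
illustration):

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://id.library.osu.edu/person/123>
    skos:exactMatch <http://viaf.org/viaf/12345678> ;
    rdfs:seeAlso <http://www.osu.edu/directory/john-chapman> ,
                 <http://johnchapman.example.com/blog/> .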
Lastly, I recommend you have an OpenSearch API [1], which is pretty
easy, almost trivial, to put together. This would allow people to
write software to search for "John Chapman" and get back results
(there might be more than one) in Atom, RSS, or JSON. OpenSearch also
has a handy autosuggest format, which some JavaScript libraries work
with. The nice thing about OpenSearch is that browsers' search boxes
support it too.
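A minimal OpenSearch description document for such a service might
look like this (a sketch; the ShortName and template URL are
hypothetical):

<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>OSU Names</ShortName>
  <Description>Search local name authorities at OSU
    Libraries</Description>
  <Url type="application/atom+xml"
       template="http://id.library.osu.edu/search?q={searchTerms}"/>
</OpenSearchDescription>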
I guess this might sound more like an information architecture than
an API. Hopefully it makes sense. Having a page that documents all of
this, with "API" written across the top, and that ideally includes
terms of service, can help a lot with use by others.
//Ed
PS. I should mention that Jon Phipps and Diane Hillmann's work on the
Metadata Registry [2] did a lot to inform my thinking about the use
of URLs to identify these things. The Metadata Registry is used for
maintaining the RDA and IFLA FRBR vocabularies. It handles lots of
stuff like versioning, etc., which might be nice to have. Personally,
I would probably start small before jumping to installing the
Metadata Registry, but it might be an option for you.
[1] http://www.opensearch.org
[2] http://trac.metadataregistry.org/
On Wed, Jan 30, 2013 at 3:47 PM, Jason Ronallo <[log in to unmask]> wrote:
Ed,
Any suggestions or recommendations on what such an API would look
like, what response format(s) would be best, and how to advertise the
availability of a local name authority API? Who should we expect to
use our local name authority API? Would any of the big authority
databases, like VIAF, be good examples to follow for API design and
response formats?
Jason
On Wed, Jan 30, 2013 at 3:15 PM, Ed Summers <[log in to unmask]> wrote:
On Tue, Jan 29, 2013 at 5:19 PM, Kyle Banerjee <[log in to unmask]> wrote:
This would certainly be a possibility for other projects, but the use case
we're immediately concerned with requires an authority file that's
maintained by our local archives. It contains all kinds of information
about people (degrees, nicknames, etc) as well as terminology which is not
technically kosher but which we know people use.
Just as an aside really, I think there's a real opportunity for
libraries and archives to make their local thesauri and name indexes
available for integration into other applications, both inside and
outside their institutional walls. Wikipedia, Freebase, and VIAF are
great, but their notability guidelines aren't always the greatest
match for cultural heritage organizations. So seriously consider
putting a little web app around the information you have, using it
for maintaining the data, making it available programmatically
(API), and linking it out to other databases (VIAF, etc.) as needed.
A briefer/pithier way of saying this is to quote Mark Matienzo [1]:

    Sooner or later, everyone needs a vocabulary management app.
:-)
//Ed
PS. I think Mark Phillips has done some interesting work in this area
at UNT, but I don't have anything to point you at. Maybe Mark is
tuned in and can chime in.
[1] https://twitter.com/anarchivist/status/269654403701682176