Hi, I've been following this thread carefully, and am very interested. At UCLA, we have the Frontera collection (http://frontera.library.ucla.edu/) and we maintain a local set of authorities because the performers and publishers are more ephemeral than what's usually in LCNAF. So we're thinking of providing these values to others via an API or something similar, to share what we know and get input from others. That's our use case for publishing out. Curious about everyone's thoughts.

Best,
Lisa

On Feb 1, 2013, at 9:44 AM, "Jason Ronallo" <[log in to unmask]> wrote:

Ed,

Thank you for the detailed response. That was very helpful. Yes, it seems like good Web architecture is the API. Sounds like it would be easy enough to start somewhere and add features over time. I could see how exposing this data in a crawlable way could provide some nice indexed landing pages to help improve the discoverability of related collections.

I wonder, though, whether this raises the question of who other than my own institution would use such local authorities. Would there really be other consumers? What's the likelihood that other institutions will need to reuse my local name authorities? Is the idea that if enough of us publish our local data in this way, there could be aggregators or other means to make it easier to reuse from a single source?

I can see the use case for a local authorities app. While I think it would be cool to expose our local data to the world in this way, I'm still trying to grasp the larger value proposition.

Jason

On Thu, Jan 31, 2013 at 5:59 AM, Ed Summers <[log in to unmask]> wrote:

Hi Jason,

Heh, sorry for the long response below. You always ask interesting questions :-D

I would highly recommend that vocabulary management apps like this assign an identifier to each entity that can be expressed as a URL. If there is any kind of database backing the app, you will get the identifier for free (primary key, etc.).
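[Editor's note: the primary-key-to-URL pattern Ed describes can be sketched in a few lines of Python. The host `id.library.osu.edu` and the `/person/` path are just the hypothetical examples from his message, not a real service.]

```python
# Sketch: minting a stable URL for each authority record from its
# database primary key, following the hypothetical URL pattern from
# Ed's message (id.library.osu.edu is an example host, not a real service).

BASE = "http://id.library.osu.edu"

def person_url(pk, fmt=None):
    """Build the generic URL for a person record, or a format-specific
    one when fmt is 'json', 'xml' or 'rdf'."""
    url = f"{BASE}/person/{pk}"
    return f"{url}.{fmt}" if fmt else url

print(person_url(123))          # http://id.library.osu.edu/person/123
print(person_url(123, "json"))  # http://id.library.osu.edu/person/123.json
```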
So for example, let's say you have a record for John Chapman, who is on the faculty at OSU, and which has a primary key of 123 in the database. You would have a corresponding URL for that record:

http://id.library.osu.edu/person/123

When someone points their browser at that URL they get back a nice HTML page describing John Chapman. I would strongly recommend that schema.org microdata and/or Open Graph protocol RDFa be layered into the page for SEO purposes, as well as for anyone who happens to be doing scraping. I would also highly recommend adding a sitemap to enable discovery and synchronization.

Having that URL is handy because you could add different machine-readable formats that hang off of it, which you can express as <link>s in your HTML. For example, let's say you want to have JSON, XML and RDF representations:

http://id.library.osu.edu/person/123.json
http://id.library.osu.edu/person/123.xml
http://id.library.osu.edu/person/123.rdf

If you want to get fancy you can content negotiate between the generic URL and the format-specific URLs, e.g.:

curl -i --header "Accept: application/json" http://id.library.osu.edu/person/123

HTTP/1.1 303 See Other
date: Thu, 31 Jan 2013 10:47:44 GMT
server: Apache/2.2.14 (Ubuntu)
location: http://id.library.osu.edu/person/123.json
vary: Accept-Encoding

But that's gravy.

What exactly you put in these representations is a somewhat open question, I think. I'm a bit biased towards SKOS for the RDF because it's lightweight, this is exactly its use case, it is flexible (you can layer other assertions in easily), and (full disclosure) I helped with the standardization of it. If you did do this you could use JSON-LD for the JSON, or just come up with something that works. Likewise for the XML. You might want to consider supporting JSON-P for the JSON representation, so that it can be used from JavaScript in other people's applications.
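[Editor's note: a minimal sketch of what a SKOS-flavored JSON-LD record for the hypothetical person/123 example might look like, with an optional JSON-P wrapper. The property selection, the VIAF URI, and the callback name are illustrative assumptions, not a prescribed profile.]

```python
import json

# A minimal SKOS-flavored JSON-LD record for the hypothetical
# person/123 example. The fields chosen here are illustrative only.
record = {
    "@context": {"skos": "http://www.w3.org/2004/02/skos/core#"},
    "@id": "http://id.library.osu.edu/person/123",
    "@type": "skos:Concept",
    "skos:prefLabel": "Chapman, John",
    "skos:altLabel": ["John Chapman"],
    "skos:exactMatch": ["http://viaf.org/viaf/0000000"],  # hypothetical VIAF link
}

def render_json(rec, callback=None):
    """Serialize the record as JSON, or wrap it in a JSON-P callback
    when the client passed a ?callback= parameter."""
    body = json.dumps(rec)
    return f"{callback}({body});" if callback else body

print(render_json(record))
print(render_json(record, callback="loadPerson"))
```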
It might be interesting to come up with some norms here for interoperability on a wiki somewhere, or maybe a prototype of some kind. But the focus should be on what you need to actually use it in some app that needs vocabulary management. Focusing on reusing work that has already been done helps a lot too. I think that helps ground things significantly. I would be happy to discuss this further if you want.

Whatever the format, I highly recommend you try to have the data link out to other places on the Web that are useful. So for example the record for John Chapman could link to his department page, blog, VIAF, Wikipedia, Google Scholar profile, etc. This work tends to require human eyes, even if helped by a tool (autosuggest, etc.), so what you do may have to be limited, or at least an ongoing effort. Managing the links (link scrubbing) is an ongoing effort too. But fitting your stuff into the larger context of the Web will mean that other people will want to use your identifiers. It's the dream of Linked Data, I guess.

Lastly, I recommend you have an OpenSearch API [1], which is pretty easy, almost trivial, to put together. This would allow people to write software to search for "John Chapman" and get back results (there might be more than one) in Atom, RSS or JSON. OpenSearch also has a handy autosuggest format, which some JavaScript libraries work with. The nice thing about OpenSearch is that browsers' search boxes support it too.

I guess this might sound like an information architecture more than an API. Hopefully it makes sense. Having a page that documents all this, with "API" written across the top, and that hopefully includes terms of service, can help a lot with use by others.

//Ed

PS. I should mention that Jon Phipps and Diane Hillmann's work on the Metadata Registry [2] did a lot to inform my thinking about the use of URLs to identify these things. The Metadata Registry is used for making the RDA and IFLA's FRBR vocabularies. It handles lots of stuff like versioning, etc. ...
which might be nice to have. Personally I would probably start small before jumping to installing the Metadata Registry, but it might be an option for you.

[1] http://www.opensearch.org
[2] http://trac.metadataregistry.org/

On Wed, Jan 30, 2013 at 3:47 PM, Jason Ronallo <[log in to unmask]> wrote:

Ed,

Any suggestions or recommendations on what such an API would look like, what response format(s) would be best, and how to advertise the availability of a local name authority API? Who should we expect would use our local name authority API? Are any of the examples from the big authority databases like VIAF ones that would be good to follow for API design and response formats?

Jason

On Wed, Jan 30, 2013 at 3:15 PM, Ed Summers <[log in to unmask]> wrote:

On Tue, Jan 29, 2013 at 5:19 PM, Kyle Banerjee <[log in to unmask]> wrote:

This would certainly be a possibility for other projects, but the use case we're immediately concerned with requires an authority file that's maintained by our local archives. It contains all kinds of information about people (degrees, nicknames, etc.) as well as terminology which is not technically kosher but which we know people use.

Just as an aside really, I think there's a real opportunity for libraries and archives to make their local thesauri and name indexes available for integration into other applications both inside and outside their institutional walls. Wikipedia, Freebase and VIAF are great, but their notability guidelines aren't always the greatest match for cultural heritage organizations. So seriously consider putting a little web app around the information you have, using it for maintaining the data, making it available programmatically (API), and linking it out to other databases (VIAF, etc.) as needed.

A briefer/pithier way of saying this is to quote Mark Matienzo [1]: "Sooner or later, everyone needs a vocabulary management app." :-)

//Ed

PS.
I think Mark Phillips has done some interesting work in this area at UNT. But I don't have anything to point you at; maybe Mark is tuned in and can chime in.

[1] https://twitter.com/anarchivist/status/269654403701682176
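[Editor's note: to make Ed's OpenSearch recommendation concrete, here is a small sketch of the autosuggest response he mentions, using the OpenSearch Suggestions JSON shape: a four-element array of query, completions, descriptions, and URLs. The names and URLs below are made up for illustration, echoing the John Chapman example.]

```python
import json

def suggest(prefix, people):
    """Build an OpenSearch Suggestions response for a name prefix:
    [query, [completions], [descriptions], [urls]]."""
    hits = [(name, url) for name, url in people
            if name.lower().startswith(prefix.lower())]
    return json.dumps([
        prefix,
        [name for name, _ in hits],
        ["" for _ in hits],          # optional descriptions, left empty here
        [url for _, url in hits],
    ])

# Hypothetical local authority entries.
people = [
    ("Chapman, John", "http://id.library.osu.edu/person/123"),
    ("Chapman, Jane", "http://id.library.osu.edu/person/124"),
]
print(suggest("chapman", people))
```

A response in this shape is what browser search boxes and autosuggest widgets that speak OpenSearch expect to consume.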