
CODE4LIB Archives


CODE4LIB@LISTS.CLIR.ORG


CODE4LIB  February 2013


Subject:

Re: Adding authority control to IR's that don't have it built in

From:

"McAulay, Elizabeth" <[log in to unmask]>

Reply-To:

Code for Libraries <[log in to unmask]>

Date:

Fri, 1 Feb 2013 17:57:46 +0000

Content-Type:

text/plain


Hi,

I've been following this thread carefully, and am very interested. At UCLA, we have the Frontera collection (http://frontera.library.ucla.edu/) and we have a local set of authorities because the performers and publishers are more ephemeral than what's usually in LCNAF. So, we're thinking of providing these values to others via API or something to help share what we know and get input from others. So, that's our use case for publishing out. Curious about everyone's thoughts.

Best,
Lisa

On Feb 1, 2013, at 9:44 AM, "Jason Ronallo" <[log in to unmask]> wrote:

Ed,

Thank you for the detailed response. That was very helpful. Yes, it
seems like good Web architecture is the API. Sounds like it would be
easy enough to start somewhere and add features over time.

I could see how exposing this data in a crawlable way could provide
some nice indexed landing pages to help improve discoverability of
related collections. I wonder though if this raises the question of who
other than my own institution would use such local authorities? Would
there really be other consumers? What's the likelihood that other
institutions will need to reuse my local name authorities?

Is the idea that if enough of us publish our local data in this way
that there could be aggregators or other means to make it easier to
reuse from a single source?

I can see the use case for a local authorities app. While I think it
would be cool to expose our local data to the world in this way, I'm
still trying to grasp at the larger value proposition.

Jason

On Thu, Jan 31, 2013 at 5:59 AM, Ed Summers <[log in to unmask]> wrote:
Hi Jason,

Heh, sorry for the long response below. You always ask interesting questions :-D

I would highly recommend that vocabulary management apps like this
assign an identifier to each entity, that can be expressed as a URL.
If there is any kind of database backing the app you will get the
identifier for free (primary key, etc). So for example let's say you
have a record for John Chapman, who is on the faculty at OSU, which
has a primary key of 123 in the database, you would have a
corresponding URL for that record:

 http://id.library.osu.edu/person/123
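As a rough illustration of deriving the URL from the primary key, here is a minimal Python sketch. The function name `entity_url` and the `BASE_URL` constant are hypothetical; only the host and path pattern come from the example above.

```python
# Hypothetical base URL, taken from the example in the message.
BASE_URL = "http://id.library.osu.edu"


def entity_url(entity_type: str, primary_key: int) -> str:
    """Build the public identifier URL for a database record."""
    if primary_key <= 0:
        raise ValueError("primary key must be a positive integer")
    return f"{BASE_URL}/{entity_type}/{primary_key}"


if __name__ == "__main__":
    print(entity_url("person", 123))
```

Keeping the URL a pure function of the entity type and key means the identifier stays stable as long as the key does.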

When someone points their browser at that URL they get back a nice
HTML page describing John Chapman. I would strongly recommend that
schema.org microdata and/or Open Graph protocol RDFa be layered into
the page for SEO purposes, as well as anyone who happens to be doing
scraping. I would also highly recommend adding a sitemap to enable
discovery, and synchronization.
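The sitemap part could be sketched roughly as follows, using only the Python standard library. The function name `build_sitemap` and the sample date are assumptions for illustration; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for an iterable of (url, last_modified) pairs."""
    # Serialize with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        # lastmod lets harvesters fetch only records changed since their last visit.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([("http://id.library.osu.edu/person/123", "2013-01-31")]))
```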

Having that URL is handy because you could add different machine
readable formats that hang off of it, which you can express as <link>s
in your HTML, for example let's say you want to have JSON, RDF and XML
representations:

 http://id.library.osu.edu/person/123.json
 http://id.library.osu.edu/person/123.xml
 http://id.library.osu.edu/person/123.rdf
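One way the <link> elements might be generated is sketched below. The `FORMATS` table and the `alternate_links` helper are hypothetical, and the media types are a reasonable guess rather than anything the message specifies.

```python
from html import escape

# Assumed mapping from URL suffix to media type.
FORMATS = {
    "json": "application/json",
    "xml": "application/xml",
    "rdf": "application/rdf+xml",
}


def alternate_links(entity_url: str) -> list[str]:
    """Return one <link rel="alternate"> tag per machine-readable format."""
    return [
        f'<link rel="alternate" type="{escape(mime)}" '
        f'href="{escape(entity_url)}.{ext}">'
        for ext, mime in FORMATS.items()
    ]


if __name__ == "__main__":
    for tag in alternate_links("http://id.library.osu.edu/person/123"):
        print(tag)
```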

If you want to get fancy you can content negotiate between the generic
url and the format specific URLs, e.g.

 curl -i --header "Accept: application/json" http://id.library.osu.edu/person/123
 HTTP/1.1 303 See Other
 date: Thu, 31 Jan 2013 10:47:44 GMT
 server: Apache/2.2.14 (Ubuntu)
 location: http://id.library.osu.edu/person/123.json
 vary: Accept-Encoding

But that's gravy.
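For anyone who does want the gravy, the server-side decision might look something like this sketch. It is framework-independent; the `negotiate` function and `MEDIA_TYPES` table are hypothetical, and the `Vary: Accept` header is an assumption about what a negotiating server would want to send, not something taken from the curl output above.

```python
# Assumed mapping from requested media type to URL suffix.
MEDIA_TYPES = {
    "application/json": "json",
    "application/rdf+xml": "rdf",
    "application/xml": "xml",
}


def negotiate(accept_header: str, entity_url: str):
    """Return (status, headers) for a request to the generic entity URL."""
    candidates = []
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        candidates.append((quality, media_type))
    # Highest q first; the sort is stable, so ties keep their header order.
    candidates.sort(key=lambda c: c[0], reverse=True)
    for quality, media_type in candidates:
        if quality > 0 and media_type in MEDIA_TYPES:
            location = f"{entity_url}.{MEDIA_TYPES[media_type]}"
            return 303, {"Location": location, "Vary": "Accept"}
    # Anything else (text/html, */*, unknown types) gets the HTML page.
    return 200, {"Content-Type": "text/html; charset=utf-8", "Vary": "Accept"}


if __name__ == "__main__":
    print(negotiate("application/json", "http://id.library.osu.edu/person/123"))
```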

What exactly you put in these representations is a somewhat open
question I think. I'm a bit biased towards SKOS for the RDF because
it's lightweight, this is exactly its use case, it is flexible (you
can layer other assertions in easily), and (full disclosure) I helped
with the standardization of it. If you did do this you could use
JSON-LD for the JSON, or just come up with something that works.
Likewise for the XML. You might want to consider supporting JSON-P for
the JSON representation, so that it can be used from JavaScript in
other people's applications.
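To make the SKOS, JSON-LD and JSON-P suggestions concrete, here is a small sketch. The choice of properties (`skos:prefLabel`, `skos:altLabel`, `skos:exactMatch`), the helper names, and the sample labels and match URL are all assumptions for illustration; whether a person is best modeled as a `skos:Concept` is itself a judgment call.

```python
import json
import re


def skos_concept(uri, pref_label, alt_labels=(), exact_matches=()):
    """Return a JSON-LD dict describing one authority record as a SKOS concept."""
    return {
        "@context": {"skos": "http://www.w3.org/2004/02/skos/core#"},
        "@id": uri,
        "@type": "skos:Concept",
        "skos:prefLabel": pref_label,
        "skos:altLabel": list(alt_labels),
        # Links out to the same entity in other authority files.
        "skos:exactMatch": [{"@id": match} for match in exact_matches],
    }


def to_jsonp(data, callback):
    """Wrap JSON in a callback so a <script> tag on another site can load it."""
    # Only allow a plain identifier, so the callback cannot inject script.
    if not re.fullmatch(r"[A-Za-z_$][A-Za-z0-9_$]*", callback):
        raise ValueError("invalid callback name")
    return f"{callback}({json.dumps(data)});"


if __name__ == "__main__":
    record = skos_concept(
        "http://id.library.osu.edu/person/123",
        "Chapman, John",
        alt_labels=["John Chapman"],
        exact_matches=["http://example.org/other-authority/456"],  # placeholder
    )
    print(to_jsonp(record, "handleRecord"))
```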

It might be interesting to come up with some norms here for
interoperability on a Wiki somewhere, or maybe a prototype of some
kind. But the focus should be on what you need to actually use it in
some app that needs vocabulary management. Focusing on reusing work
that has already been done helps a lot too. I think that helps ground
things significantly. I would be happy to discuss this further if you
want.

Whatever the format, I highly recommend you try to have the data link
out to other places on the Web that are useful. So for example the
record for John Chapman could link to his department page, blog, VIAF,
Wikipedia, Google Scholar Profile, etc. This work tends to require
human eyes, even if helped by a tool (Autosuggest, etc), so what you
do may have to be limited, or at least an ongoing effort. Managing
them (link scrubbing) is an ongoing effort too. But fitting your stuff
into the larger context of the Web will mean that other people will
want to use your identifiers. It's the dream of Linked Data I guess.

Lastly I recommend you have an OpenSearch [1] API, which is pretty easy,
almost trivial, to put together. This would allow people to write
software to search for "John Chapman" and get back results (there
might be more than one) in Atom, RSS or JSON. OpenSearch also has a
handy AutoSuggest format, which some JavaScript libraries work with.
The nice thing about OpenSearch is that browser search boxes support
it too.
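A rough sketch of the two pieces involved: the description document that advertises the search endpoint, and a suggestions response. The endpoint URL, the `format` query parameter, and the function names are hypothetical; the namespace, the `{searchTerms}` template parameter, and the `application/x-suggestions+json` type come from the OpenSearch 1.1 specification and its Suggestions extension.

```python
import json
import xml.etree.ElementTree as ET

OS_NS = "http://a9.com/-/spec/opensearch/1.1/"


def description_document(short_name, description, search_base):
    """Return an OpenSearch description document advertising two result formats."""
    ET.register_namespace("", OS_NS)
    root = ET.Element(f"{{{OS_NS}}}OpenSearchDescription")
    # The spec limits ShortName to 16 characters.
    ET.SubElement(root, f"{{{OS_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OS_NS}}}Description").text = description
    for mime, fmt in (
        ("application/atom+xml", "atom"),
        ("application/x-suggestions+json", "suggest"),
    ):
        ET.SubElement(
            root,
            f"{{{OS_NS}}}Url",
            {"type": mime, "template": f"{search_base}?q={{searchTerms}}&format={fmt}"},
        )
    return ET.tostring(root, encoding="unicode")


def suggestions(query, names):
    """Return a suggestions response: the query, then matching completions."""
    matches = [n for n in names if n.lower().startswith(query.lower())]
    return json.dumps([query, matches])


if __name__ == "__main__":
    print(description_document(
        "OSU Names", "Local name authorities", "http://id.library.osu.edu/search"
    ))
    print(suggestions("john", ["John Chapman", "Joan Smith"]))
```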

I guess this might sound like an information architecture more than an
API. Hopefully it makes sense. Having a page that documents all this,
with "API" written across the top, that hopefully includes terms of
service, can help a lot with use by others.

//Ed

PS. I should mention that Jon Phipps and Diane Hillmann's work on the
Metadata Registry [2] did a lot to inform my thinking about the use of
URLs to identify these things. The metadata registry is used for
making the RDA and IFLA's FRBR vocabulary. It handles lots of stuff
like versioning, etc ... which might be nice to have. Personally I
would probably start small before jumping to installing the Metadata
Registry, but it might be an option for you.

[1] http://www.opensearch.org
[2] http://trac.metadataregistry.org/

On Wed, Jan 30, 2013 at 3:47 PM, Jason Ronallo <[log in to unmask]> wrote:
Ed,

Any suggestions or recommendations on what such an API would look
like, what response format(s) would be best, and how to advertise the
availability of a local name authority API? Who should we expect would
use our local name authority API? Are any of the examples from the big
authority databases like VIAF ones that would be good to follow for
API design and response formats?

Jason

On Wed, Jan 30, 2013 at 3:15 PM, Ed Summers <[log in to unmask]> wrote:
On Tue, Jan 29, 2013 at 5:19 PM, Kyle Banerjee <[log in to unmask]> wrote:
This would certainly be a possibility for other projects, but the use case
we're immediately concerned with requires an authority file that's
maintained by our local archives. It contains all kinds of information
about people (degrees, nicknames, etc) as well as terminology which is not
technically kosher but which we know people use.

Just as an aside really, I think there's a real opportunity for
libraries and archives to make their local thesauri and name indexes
available for integration into other applications both inside and
outside their institutional walls. Wikipedia, Freebase, VIAF are
great, but their notability guidelines aren't always the greatest match
for cultural heritage organizations. So seriously consider putting a
little web app around the information you have, using it for
maintaining the data, making it available programmatically (API), and
linking it out to other databases (VIAF, etc) as needed.

A briefer/pithier way of saying this is to quote Mark Matienzo [1]

 Sooner or later, everyone needs a vocabulary management app.

:-)

//Ed

PS. I think Mark Phillips has done some interesting work in this area
at UNT. But I don't have anything to point you at, maybe Mark is tuned
in, and can chime in.

[1] https://twitter.com/anarchivist/status/269654403701682176


