This sounds like a great way to "translate" from library name forms to
Wikipedia name forms. But for on-the-fly use I wonder if it wouldn't
be more efficient to eliminate the "middle man." Karen, can you say a
little about what it took to link library names to WP? Was it a
one-step process, a two-step process, etc.?
There is a script that I've seen used, although it doesn't seem to be
One interesting note from the OL experience of linking to WP:
generally you need to "re-reverse" the names to get a match: from
Twain, Mark to Mark Twain. But for some names that isn't the case:
Mao, Tse-Tung. Edward Betts used Wikipedia to determine which names do
not get "re-reversed".
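
To make the "re-reverse" step concrete, here is a minimal sketch in
Python. It is not the OL code. The function name, the KEEP_ORDER set,
and the handling of trailing dates are all my own assumptions for
illustration, and the hand-built exception set merely stands in for the
list that Edward derived from Wikipedia.

```python
# Hypothetical stand-in for the exception list derived from Wikipedia:
# headings whose parts should keep their catalogue order.
KEEP_ORDER = frozenset({"Mao, Tse-Tung"})


def wikipedia_name_form(heading, keep_order=frozenset()):
    """Turn an inverted library heading into a likely Wikipedia title.

    Assumes the heading looks like "Surname, Forename[, dates]".
    Anything after the second comma (e.g. life dates) is ignored.
    """
    parts = [p.strip() for p in heading.split(",")]
    surname = parts[0]
    forename = parts[1] if len(parts) > 1 else ""

    if not forename:
        # Single-part names have nothing to reverse.
        return surname

    if f"{surname}, {forename}" in keep_order:
        # Exception: drop the comma but keep the order.
        return f"{surname} {forename}"

    # Default: "re-reverse" into natural order.
    return f"{forename} {surname}"


print(wikipedia_name_form("Twain, Mark"))
print(wikipedia_name_form("Mao, Tse-Tung", KEEP_ORDER))
```

The exception set is passed in rather than hard-coded so that it can be
rebuilt from whatever source you trust, without touching the function.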
The OL code for its Wikipedia lookup is at:
Note, however, that it runs against dumps rather than an API.
Quoting Karen Coombs <[log in to unmask]>:
> I'd advocate using WorldCat Identities to get to the appropriate url
> for dbpedia. Each Identity record has a wikipedia element in it that
> you could use to link to either Wikipedia or dbpedia.
> If you want to see an example of this in action you can check out the
> Author Info demo I did for code4lib 2010 here -
> The code for this demo is available for download at -
> You'll want the author_info folder and identity_info.php
> Karen A. Coombs
> Product Manager
> OCLC Developer Network
> [log in to unmask]
> On Thu, May 19, 2011 at 4:40 AM, graham <[log in to unmask]> wrote:
>> I need to be able to take author data from a catalogue record and use it
>> to look up the author on Wikipedia on the fly. So I may have birth date
>> and possibly year of death in addition to (one spelling of) the name,
>> the title of one book the author wrote etc.
>> I know there are various efforts in progress that will improve the
>> current situation, but as things stand at the moment what is the best*
>> way to do this?
>> 1. query wikipedia for as much as possible, parse and select the best
>> fitting result
>> 2. go via dbpedia/freebase and work back from there
>> 3. use VIAF and/or OCLC services
>> 4. Other?
>> (I have no experience of 2-4 yet :-(
>> * 'best' being constrained by:
>> - need to do this in real-time
>> - need to avoid dependence on services which may be taken away
>> or charged for
>> - being able to justify to librarians as reasonably accurate :-)
[log in to unmask] http://kcoyle.net