In terms of linking from a catalog record to an author in Wikipedia,
you can also link into Identities using an OpenURL that contains an
OCLC Number and a name. This will get you a cleaner match than
doing a name search because it uses both the author and the book. I
just haven't changed my code to use that method.
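
A minimal sketch of what building such a link might look like. The resolver address and the rft.namelast / rft.namefirst keys are assumptions used for illustration only (neither is given in this message); url_ver and the info:oclcnum/ identifier prefix follow general OpenURL practice. Check the current Identities documentation before relying on any of it.

```python
from urllib.parse import urlencode

# Assumption: placeholder resolver address, not the real Identities endpoint.
IDENTITIES_RESOLVER = "https://example.org/identities/find"


def build_identity_openurl(oclc_number, last_name, first_name="",
                           base_url=IDENTITIES_RESOLVER):
    """Build an OpenURL-style link carrying both a book (OCLC Number)
    and an author name, so the resolver can match on the pair."""
    params = [
        ("url_ver", "Z39.88-2004"),                       # OpenURL 1.0 version tag
        ("rft_id", "info:oclcnum/" + str(oclc_number)),   # the book
        ("rft.namelast", last_name),                      # hypothetical key name
    ]
    if first_name:
        params.append(("rft.namefirst", first_name))      # hypothetical key name
    # urlencode percent-escapes ":" and "/" inside the values.
    return base_url + "?" + urlencode(params)
```

Calling build_identity_openurl(oclc_number, "Morris", "William") with the OCLC Number from the catalog record returns a single URL string that can be used as the link target.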
I inquired about how we add links to Wikipedia into Identities and
VIAF. Below is a summary of the answer I got:
Once a year OCLC downloads Wikipedia and then we extract as much
information from it as we can. This generally involves reading through
their current information for templates, etc. Then we try to figure
out which pages are people. Within the people pages we look for birth
dates, death dates, work titles, ISBNs, OCLC numbers, WorldCat
Identities links, LCCNs ... anything that we have in VIAF for matching
purposes. Then we build MARC-ish records for each extracted
person. After that, the records go through the normal VIAF matching.
The process gets changed and tweaked each year.
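
To make the extraction step concrete, here is a rough sketch of pulling a few matching fields out of one page's wikitext. This is not OCLC's code. It assumes person pages can be spotted by "YYYY births" / "YYYY deaths" categories and that ISBNs appear as plain "ISBN ..." strings, and the record layout is invented for illustration.

```python
import re

# Assumption: people are identified via Wikipedia's birth/death year categories.
BIRTH_RE = re.compile(r"\[\[Category:(\d{3,4}) births\]\]")
DEATH_RE = re.compile(r"\[\[Category:(\d{3,4}) deaths\]\]")
# Loose ISBN pattern: digits, hyphens and spaces, optional trailing X.
ISBN_RE = re.compile(r"ISBN[ =:]*([0-9][0-9Xx -]{8,16}[0-9Xx])")


def extract_person_record(title, wikitext):
    """Return a simple dict of matching fields for a person page,
    or None if the page does not look like it is about a person."""
    birth = BIRTH_RE.search(wikitext)
    death = DEATH_RE.search(wikitext)
    if birth is None and death is None:
        return None  # no birth/death category: treat as not a person
    # Normalise ISBNs by dropping hyphens and spaces, then de-duplicate.
    isbns = sorted({re.sub(r"[ -]", "", raw).upper()
                    for raw in ISBN_RE.findall(wikitext)})
    return {
        "name": title,
        "birth_year": birth.group(1) if birth else None,
        "death_year": death.group(1) if death else None,
        "isbns": isbns,
    }
```

The real process reads templates as well and collects more fields (work titles, OCLC numbers, LCCNs, Identities links); the point here is only the shape of the step: decide whether a page is a person, pull out whatever can be matched on, and emit one record per person.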
I hope this helps.
On Thu, May 19, 2011 at 3:48 PM, Graham Seaman wrote:
> Hi Karen
> Thanks for the code. As far as I can see though it doesn't actually
> solve my disambiguation problem - since identity_info.php just takes a
> name as input, it can't guess which of the people with this name is
> meant other than by using the most commonly referenced one, which in the
> OCLC data actually seems to often be an amalgam of several people with
> the name; for example
> is William Morris, the 18th century African-American engineer whose most
> widely held works include News from Nowhere, Introduction to Fly
> Fishing, and Ancient Slavery Disapproved of by God - ie an amalgamation
> of the various most famous people known by this name.
> I guess this is just a hard problem overall.
> On 05/19/11 14:56, Karen Coombs wrote:
>> I'd advocate using WorldCat Identities to get to the appropriate url
>> for dbpedia. Each Identity record has a wikipedia element in it that
>> you could use to link to either Wikipedia or dbpedia.
>> If you want to see an example of this in action you can check out the
>> Author Info demo I did for code4lib 2010 here -
>> The code for this demo is available for download at -
>> You'll want the author_info folder and identity_info.php
>> Karen A. Coombs
>> Product Manager
>> OCLC Developer Network
>> On Thu, May 19, 2011 at 4:40 AM, graham wrote:
>>> I need to be able to take author data from a catalogue record and use it
>>> to look up the author on Wikipedia on the fly. So I may have birth date
>>> and possibly year of death in addition to (one spelling of) the name,
>>> the title of one book the author wrote etc.
>>> I know there are various efforts in progress that will improve the
>>> current situation, but as things stand at the moment what is the best*
>>> way to do this?
>>> 1. query wikipedia for as much as possible, parse and select the best
>>> fitting result
>>> 2. go via dbpedia/freebase and work back from there
>>> 3. use VIAF and/or OCLC services
>>> 4. Other?
>>> (I have no experience of 2-4 yet :-(
>>> * 'best' being constrained by:
>>> - need to do this in real-time
>>> - need to avoid dependence on services which may be taken away
>>> or charged for
>>> - being able to justify to librarians as reasonably accurate :-)