We look primarily for work titles and dates that the records have in
common. I believe there are also sometimes actual links to other authority
records.
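
To make the idea of matching on shared titles and dates concrete, here is a minimal sketch. It is not the actual VIAF matching code: the `AuthorityRecord` class, the function names, and the rule "no conflicting dates plus at least one shared title or date" are all assumptions made for illustration, and the two William Morris records are hand-built sample data.

```python
import re
import unicodedata
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuthorityRecord:
    """Hypothetical, simplified stand-in for a MARC-ish authority record."""
    heading: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    titles: set = field(default_factory=set)


def normalize_title(title):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_accents.lower())
    return " ".join(cleaned.split())


def dates_conflict(a, b):
    """True if both records give a birth (or death) year and they differ."""
    pairs = ((a.birth_year, b.birth_year), (a.death_year, b.death_year))
    return any(x is not None and y is not None and x != y for x, y in pairs)


def shared_evidence(a, b):
    """Return (titles in common, date fields in common) for two records."""
    common_titles = ({normalize_title(t) for t in a.titles}
                     & {normalize_title(t) for t in b.titles})
    common_dates = [
        name for name in ("birth_year", "death_year")
        if getattr(a, name) is not None and getattr(a, name) == getattr(b, name)
    ]
    return common_titles, common_dates


def likely_same_person(a, b):
    """Assumed rule: no conflicting dates, and at least one shared title or date."""
    if dates_conflict(a, b):
        return False
    common_titles, common_dates = shared_evidence(a, b)
    return bool(common_titles) or bool(common_dates)


# Illustrative records only; the same heading string, two different people.
wikipedia = AuthorityRecord("Morris, William", 1834, 1896, {"News from Nowhere"})
library = AuthorityRecord("Morris, William", 1834, 1896, {"News from nowhere!"})
nuffield = AuthorityRecord("Morris, William", 1877, 1963)

print(likely_same_person(wikipedia, library))   # True: same dates, shared title
print(likely_same_person(wikipedia, nuffield))  # False: birth years conflict
```

In this sketch the heading string alone never decides a match, which is the point of bringing in titles and dates for an ambiguous heading.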
Ralph
On Mon, May 23, 2011 at 12:00 PM, Ya'aqov Ziso <[log in to unmask]> wrote:
> Oh yes, your clarification helps, Ralph.
>
> So WP data ends up in a cluster (more than one entity) for a certain
> string that applies to more than one person/heading (therefore it is
> ambiguous). What processes is VIAF running to disambiguate THAT heading?
> Ya'aqov
>
>
> On Mon, May 23, 2011 at 8:04 AM, LeVan,Ralph <[log in to unmask]> wrote:
>
> > I think you misunderstood that, Ya'aqov.
> >
> > What we do is make local authority records out of the Wikipedia records
> > that we've identified as names. So the dates and other data are added to
> > the local authority record for that Wikipedia record. We then use our
> > usual VIAF matching technology between those Wikipedia authority records
> > and the other authority records in VIAF. Wikipedia records that end up in
> > a VIAF cluster get kept, and the others get dropped as not matching
> > anything we have in VIAF.
> >
> > I hope that helps!
> >
> > Ralph
> >
> > > -----Original Message-----
> > > From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> > > Ya'aqov Ziso
> > > Sent: Sunday, May 22, 2011 5:15 PM
> > > To: [log in to unmask]
> > > Subject: Re: [CODE4LIB] wikipedia/author disambiguation
> > >
> > > Thanks, Karen, but you don't yet indicate how you solve disambiguation.
> > >
> > > You indicate how you use WP as a resource for adding dates and subjects
> > > when they are missing.
> > > You don't indicate when/how you are resolving ambiguities with WP data.
> > >
> > > Again, please use Morris William as an example,
> > > Ya'aqov
> > >
> > >
> > >
> > >
> > > > > Once a year OCLC downloads Wikipedia and then we extract as much
> > > > > information from it as we can. This generally involves reading
> > > > > through their current information for templates, etc. Then we try to
> > > > > figure out which pages are people. Within the people pages we look
> > > > > for birth dates, death dates, work titles, ISBNs, OCLC numbers,
> > > > > WorldCat Identities links, LCCNs ... anything that we have in VIAF
> > > > > for matching purposes. Then we build MARC-ish records for each
> > > > > extracted person. After that the records go through the normal VIAF
> > > > > matching processes.
> > > > >
> > > > > The process gets changed and tweaked each year.
> >
>
>
>
> --
> ya'aqov ZISO | [log in to unmask] | 856 217 3456
>