Publishers' names and other corporate names are problematic because they change all the time! Kevin Ford (LoC/BIBFRAME) did some work on this last year.
It’s called BF providers.
This is a piece of a report for ALA Midwinter 2019: https://cdn.ymaws.com/www.musiclibraryassoc.org/resource/resmgr/BCC_ALA_Reports/2019_ALA-Midwinter_MAC.pdf
Kevin Ford, LC
Update on LC’s work with BIBFRAME and streamlining LC’s BF dataset
What LC has done since ALA Annual: continued pilot work; refined conversion (on GitHub); collaborations with the SINOPIA group and the authorities group to extract metadata from id.loc.gov; BF Editor updates (cloning works and instances, better interaction with database and editor); trying to reduce verbosity in RDF and trying to reduce blank nodes (anonymous resources in RDF).
Re blank nodes: resources identified with blank nodes lack URIs that can be shared easily. They’re unavoidable in RDF and are written into the RDF spec.
Part of the processing. Should everything have URIs (“URIs are commitments”)? Kevin Ford’s current bugaboo. Results in a lot of duplication; less efficient scaling.
Example from providers in BF: blank nodes for the “United States” and “Columbia Pictures Home Entertainment” strings. They worked with an experimental Provider file. A data analysis showed that out of ca. 15 million records, only 1.2M contained unique strings. Out of the 1.2M providers they came up with ca. 800K providers after parsing agents in ID.LOC; these were loaded into ID.LOC, making a file larger than many other files there.
The test file can be accessed at:
http://id.loc.gov/search/?q=memberOf:http://id.loc.gov/bfentities/providers/collection_Providers
(For an example of clustering and reducing blank nodes: http://id.loc.gov/bfentities/providers/4599ff4baa77b72ddd0b65a9972c8b15.html)
These are NOT MEANT TO BE AUTHORITY RECORDS
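
To give a rough idea of how repeated provider strings can collapse into fewer clusters, here is a minimal sketch in Python. It is not LC's method, which the report does not describe; it is loosely modeled on the "fingerprint" key-collision idea that OpenRefine offers for clustering. The function names and the spelling variants in the sample list are made up for illustration; only the two base strings come from the report.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Build a normalized key so that trivially different spellings collide."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase and replace ASCII punctuation with spaces.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = unaccented.lower().translate(table).split()
    # Sort and deduplicate tokens so word order does not matter.
    return " ".join(sorted(set(tokens)))


def cluster(names):
    """Group the original strings under their shared fingerprint key."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return dict(groups)


# Hypothetical variants of the two provider strings mentioned in the report.
sample = [
    "Columbia Pictures Home Entertainment",
    "Columbia Pictures Home Entertainment,",
    "columbia pictures home entertainment",
    "United States",
    "United States.",
]
clusters = cluster(sample)
for key, members in clusters.items():
    print(key, "<-", members)
```

This only catches differences in case, punctuation, accents, and word order. It will not connect a publisher's earlier and later names, which is the harder part of the problem.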
> On Sep 17, 2020, at 8:14 AM, [log in to unmask] wrote:
>
> Hello,
>
> I am trying with openrefine to fix all the different versions of
> publishers' names we have in our records. I would like to reconcile, but i
> have not found yet a reconciliation service that knows most or all of the
> name variants,
>
> any ideas?
>
> Thank you in advance
Debra Shapiro
The iSchool at UW-Madison
Helen C. White Hall, Rm. 4282
600 N. Park St.
Madison WI 53706
608 262 9195
mobile 608 712 6368
https://ischool.wisc.edu/blog/staff/shapiro-debra/
pronouns she | her | hers