What constitutes a reasonable path depends on the time, labor, and skills
you have access to. Having said that, my gut reaction would be to keep
things simple as Kylene suggests. Put key data points from your original
spreadsheet side by side with the corresponding ones from the OCLC
records, then mark the OCLC records you want downloaded and separate out
the ones that will be more work. Eyeballing and marking 17K records isn't
as crazy as it sounds if the match strategy is decent and the work process
is sufficiently ergonomic/efficient.
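
To make the side-by-side idea concrete, here is a rough sketch of how a review sheet could be generated. Everything in it is an assumption for illustration: the field names (composer, title, publisher_no, year), the record dictionaries, and the function names are hypothetical, and fetching the OCLC candidates is left out entirely. It only shows the layout: each local value sits next to its OCLC counterpart, with an empty "keep" column for the reviewer to mark.

```python
import csv

# Hypothetical field names; the real spreadsheet and OCLC data will differ.
FIELDS = ["composer", "title", "publisher_no", "year"]

def side_by_side_rows(local, candidates):
    """Yield one review row per (local record, OCLC candidate) pair."""
    for cand in candidates:
        # "keep" is left blank for the human reviewer to fill in.
        row = {"keep": "", "local_id": local["id"], "oclc_no": cand["oclc_no"]}
        for field in FIELDS:
            row["local_" + field] = local.get(field, "")
            row["oclc_" + field] = cand.get(field, "")
        yield row

def write_review_sheet(pairs, out):
    """Write a CSV review sheet from (local record, candidate list) pairs."""
    header = ["keep", "local_id", "oclc_no"]
    for field in FIELDS:
        # Interleave columns so matching data points end up adjacent.
        header += ["local_" + field, "oclc_" + field]
    writer = csv.DictWriter(out, fieldnames=header)
    writer.writeheader()
    for local, candidates in pairs:
        for row in side_by_side_rows(local, candidates):
            writer.writerow(row)
```

Interleaving the columns (local composer next to OCLC composer, and so on) is what makes the eyeballing fast; sorting the sheet so likely matches come first would help further.
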
It might be worthwhile to use a more precise strategy on an initial pass and
progress to dirtier methods on subsequent passes. One advantage of the
multiple passes method is that it will help you develop a good sense of
what your and OCLC's data are like and the best way to line things up.
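
As a rough sketch of the multiple passes idea, something like the following could work. The record fields, the function names, the `candidates_for` lookup, and the 0.85 threshold are all assumptions for illustration, not anything OCLC or MarcEdit provides. The similarity score comes from Python's standard `difflib.SequenceMatcher`, which is just one of several ways to compute that kind of quotient.

```python
import re
from difflib import SequenceMatcher

def norm(text):
    """Lowercase, replace anything but a-z, 0-9 and space, collapse spaces."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())

def similarity(a, b):
    """Rough 0..1 similarity quotient between two normalized strings."""
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

def strict_match(local, cand):
    # Pass 1: publisher number, title, and composer must all agree.
    return (local["publisher_no"] == cand["publisher_no"]
            and norm(local["title"]) == norm(cand["title"])
            and norm(local["composer"]) == norm(cand["composer"]))

def loose_match(local, cand, threshold=0.85):
    # Pass 2: ignore the publisher number; accept near-identical strings.
    # The threshold is an arbitrary starting point to tune on real data.
    return (similarity(local["title"], cand["title"]) >= threshold
            and similarity(local["composer"], cand["composer"]) >= threshold)

def run_passes(records, candidates_for, passes=(strict_match, loose_match)):
    """Return ({local id: (pass name, candidate)}, unmatched records)."""
    matched, remaining = {}, list(records)
    for test in passes:
        still_unmatched = []
        for rec in remaining:
            hits = [c for c in candidates_for(rec) if test(rec, c)]
            if len(hits) == 1:  # accept only unambiguous hits
                matched[rec["id"]] = (test.__name__, hits[0])
            else:
                still_unmatched.append(rec)
        remaining = still_unmatched
    return matched, remaining
```

Recording which pass produced each match tells you how much to trust it, and whatever is left in the unmatched list after the last pass is the pile for manual review. Note that `norm` as written drops accented characters, so it would need adjusting for non-English titles.
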
On Tue, Jan 7, 2020 at 6:33 AM Harper, Cynthia <[log in to unmask]> wrote:
> I have 17,000 records at this point. The data is forced into MARC format
> from a spreadsheet. I don't know how reliable the music number is. I tried
> the sample record below and find no match in OCLC, although there are
> other scores with an 028 that matches this number (not the same
> composer/title though).
> -----Original Message-----
> From: Code for Libraries <[log in to unmask]> On Behalf Of Kyle
> Sent: Monday, January 06, 2020 5:55 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] matching brief cataloging to OCLC records - scores
> Hi Cindy,
> Could you say a bit more about your project -- i.e. how many items you're
> dealing with, uniqueness of records you need to match against, reliability
> of the individual data points you're using for your key?
> As an abstract proposition, dirty matches are tricky. The basic approaches
> are to create a similarity quotient or to use fragments of fields you can
> reliably expect to find (kind of like old school OCLC derived searches).
> Unless you truly have a lot of records, I wouldn't rule out manual methods
> for a substantial amount of the work -- for a second pass, humans are
> pretty efficient at cutting through thousands of records.
> On Mon, Jan 6, 2020 at 11:07 AM Harper, Cynthia <[log in to unmask]> wrote:
> > Hello - I want to match some brief cataloging records for music
> > scores to OCLC record numbers. I am in conversation with OCLC about
> > doing this with a reclamation/datasync type process, but I'm not
> > optimistic about that, because there's going to be a lot of mismatch in
> > the details.
> > Here's a sample record:
> > =LDR \\\\\ncm a22\\\\\ a 4500
> > =935 \\$aLAiO|Adams, Leslie|Hosanna to|2927|Walton Music|1976|Vocal
> > Score For Satb Chorus And Piano
> > =100 1\$aAdams, Leslie.
> > =245 \\$aHosanna to the Son of David.
> > =028 32$a2927
> > =264 \1$aNew York : $bWalton Music Corporation , $c1976.
> > =300 \\$aScore, not-stapled, soft-cover; $c26.6 cm
> > =336 \\$anotated music$bntm$2rdacontent
> > =337 \\$aunmediated$bn$2rdamedia
> > =338 \\$avolume$bnc$2rdacarrier
> > =500 \\$aVocal score for SATB chorus and piano
> > =700 1\
> > =700 1\
> > =700 1\
> > =910 \$aLewis Collection - Anthems in Octavo
> > =910 \$aFile Cabinet 5, drawer 1
> > That 935 is my attempt to cobble a "Unique" key from the records. I'm
> > thinking those blank 700s (editor, arranger, translator) will be
> > removed by MarcEdit.
> > I think what would be best would be to search the OCLC API by title +
> > composer + mat-type=score, check through these for a matching 028 and
> > matching significant words in the 500 note, and produce an output of
> > multiple rows for any matching records, to be combed through via the
> > next set of automated processes.
> > What software exists to do this? Your wisdom is appreciated.
> > Cindy Harper
> > E-services and periodicals librarian
> > Virginia Theological Seminary
> > Bishop Payne Library, VTS Box 159
> > 3737 Seminary Road
> > Alexandria VA 22304
> > [log in to unmask]<mailto:[log in to unmask]>
> > 703-461-1794