I would suggest using Connexion's Batch Search to pull records as a first
pass. Something like

ti:title and au:creator and yr:year and dt:sco and lv:(b or i or 1)

When a search returns more than one result and I need to pick the best
record, on a project this large I sometimes use Z39.50 to grab the OCLC
records through MarcEdit. Doing so will include a field that records the
number of institutions using that record - the hope being that the more
popular a record is, the more likely it is to be accurate.

On Tue, Jan 7, 2020 at 9:33 AM Harper, Cynthia <[log in to unmask]> wrote:

> I have 17,000 records at this point. The data is forced into MARC format
> from a spreadsheet. I don't know how reliable the music number is. I tried
> the sample record below and found no match in OCLC, although there are
> other scores with an 028 that matches this number (not the same
> composer/title though).
>
> Thanks,
> Cindy
>
> -----Original Message-----
> From: Code for Libraries <[log in to unmask]> On Behalf Of Kyle Banerjee
> Sent: Monday, January 06, 2020 5:55 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] matching brief cataloging to OCLC records - scores
>
> Hi Cindy,
>
> Could you say a bit more about your project -- i.e. how many items you're
> dealing with, uniqueness of records you need to match against, reliability
> of the individual data points you're using for your key?
>
> As an abstract proposition, dirty matches are tricky. The basic approaches
> are to create a similarity quotient or to use fragments of fields you can
> reliably expect to find (kind of like old school OCLC derived searches).
>
> Unless you truly have a lot of records, I wouldn't rule out manual methods
> for a substantial amount of the work -- for a second pass, humans are
> pretty efficient at cutting through thousands of records.
>
> kyle
>
> On Mon, Jan 6, 2020 at 11:07 AM Harper, Cynthia <[log in to unmask]> wrote:
>
> > Hello - I want to match some brief cataloging records for music
> > scores to OCLC record numbers. I am in conversation with OCLC about
> > doing this with a reclamation/datasync type process, but I'm not
> > optimistic about that, because there's going to be a lot of mismatch
> > in the details.
> >
> > Here's a sample record:
> > =LDR \\\\\ncm a22\\\\\ a 4500
> > =935 \\$aLAiO|Adams, Leslie|Hosanna to|2927|Walton Music|1976|Vocal Score For Satb Chorus And Piano
> > =100 1\$aAdams, Leslie.
> > =245 \\$aHosanna to the Son of David.
> > =028 32$a2927
> > =264 \1$aNew York : $bWalton Music Corporation , $c1976.
> > =300 \\$aScore, not-stapled, soft-cover; $c26.6 cm
> > =336 \\$anotated music$bntm$2rdacontent
> > =337 \\$aunmediated$bn$2rdamedia
> > =338 \\$avolume$bnc$2rdacarrier
> > =500 \\$aVocal score for SATB chorus and piano
> > =700 1\
> > =700 1\
> > =700 1\
> > =910 \$aLewis Collection - Anthems in Octavo
> > =910 \$aFile Cabinet 5, drawer 1
> >
> > That 935 is my attempt to cobble a "Unique" key from the records. I'm
> > thinking those blank 700s (editor, arranger, translator) will be
> > removed by MarcEdit.
> >
> > I think what would be best would be to search the OCLC API by title +
> > composer + mat-type=score, check through the results for a matching
> > 028 and matching significant words in the 500 note, and produce an
> > output of multiple rows for any matching records, to be combed through
> > by the next set of automated processes.
> >
> > What software exists to do this? Your wisdom is appreciated.
> >
> >
> > Cindy Harper
> > E-services and periodicals librarian
> > Virginia Theological Seminary
> > Bishop Payne Library, VTS Box 159
> > 3737 Seminary Road
> > Alexandria VA 22304
> > [log in to unmask]
> > 703-461-1794
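
For what it's worth, here is a rough sketch of how the first-pass search strings from the top of this thread might be generated from the 935 key. It assumes the key is pipe-delimited in the order shown in the sample record (prefix, composer, short title, music number, publisher, year, note). The function and field names are made up for illustration, and the index labels (ti:, au:, yr:, dt:, lv:) are simply copied from the search pattern above, not checked against Connexion. The similarity function uses Python's difflib as one possible stand-in for a "similarity quotient"; it is not something OCLC or MarcEdit provides.

```python
from difflib import SequenceMatcher

# Assumed order of the pipe-delimited parts in 935 $a, taken from the sample
# record in this thread. These names are hypothetical labels, not MARC terms.
KEY_FIELDS = ("prefix", "composer", "short_title", "music_number",
              "publisher", "year", "note")


def parse_key(key):
    """Split a 935 $a key into a dict of named parts."""
    parts = [part.strip() for part in key.split("|")]
    return dict(zip(KEY_FIELDS, parts))


def build_search(key):
    """Build one first-pass search string in the ti:/au:/yr: pattern.

    Raises KeyError if the key has fewer parts than expected.
    """
    k = parse_key(key)
    return (f"ti:{k['short_title']} and au:{k['composer']} "
            f"and yr:{k['year']} and dt:sco and lv:(b or i or 1)")


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


sample_key = ("LAiO|Adams, Leslie|Hosanna to|2927|Walton Music|1976|"
              "Vocal Score For Satb Chorus And Piano")
print(build_search(sample_key))
```

Punctuation in names (the comma in "Adams, Leslie") may need to be stripped or quoted before the string is usable as a search; that depends on the search syntax and is not handled here. The similarity score could be used afterwards to rank candidate titles or 500 notes against the brief record, with a human reviewing anything below a chosen cutoff.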