LISTSERV 16.5 - CODE4LIB Archives

Hi Michael:

Thanks for your email.  No, we haven't implemented any merging system.
Our software currently only clusters similar or identical records.
We may consider creating a generic merge algorithm, which could then be
customized to make some of the canonicalizations you pointed to easier
to do.  As for integrating it with marc4j, we currently have no specific
plans (although we'd welcome help from anyone interested).

> So back to the de-dup thing (things got busy here). Has anyone
> implemented a merging algorithm like this one:
> http://www.kcoyle.net/temp/merge.html
>
> It's the one referred to via openlibrary here:
> http://openlibrary.org/about/lib
>
> Putting something like this in marc4j would be sweet.
> Mike Beccaria
> Systems Librarian
> Head of Digital Initiatives
> Paul Smith's College
> 518.327.6376

Cheers,

Min

--
Min-Yen KAN (Dr) :: Assistant Professor :: National University of
Singapore :: School of Computing, AS6 05-12, Law Link, Singapore
117590 :: 65-6516 1885(DID) :: 65-6779 4580 (Fax) ::
www.comp.nus.edu.sg/~kanmy (W)