Kyle Banerjee wrote:
> On Mon, Oct 25, 2010 at 12:38 PM, Tim Spalding <[log in to unmask]> wrote:
>
>> Does processing speed of something matter anymore? You'd have to be
>> doing a LOT of processing to care, wouldn't you?
>>
>
> Data migrations and data dumps are a common use case. Needing to break or
> make hundreds of thousands or millions of records is not uncommon.
>
> kyle
To make this concrete, we process the MARC records from 14 separate
ILSs throughout the University of Wisconsin System. We extract, sort on
OCLC number, dedup, and merge pieces from any campus that has a record
for the work. As a result, the MARC that we index and display here:
http://forward.library.wisconsin.edu/catalog/ocm37443537?school_code=WU
is not identical to the version of the MARC record from any of the 4
schools that hold it.
We extract 13 million records and dedup down to 8 million every week.
Speed is paramount.
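
For anyone who wants a picture of the sort/dedup/merge step, here is a
rough sketch in Python. It is not our production code. The record layout
(plain dicts with "oclc", "campus" and "fields" keys), the function name
merge_by_oclc, and the sample OCLC numbers are all made up for
illustration, and the merge rule (union of distinct fields, in first-seen
order) is only an assumption about what "merge pieces" could mean.

```python
from itertools import groupby
from operator import itemgetter


def merge_by_oclc(records):
    """Sort records on OCLC number, then collapse each group into one record.

    `records` is an iterable of dicts with hypothetical keys:
    "oclc" (str), "campus" (str), "fields" (list of (tag, value) tuples).
    """
    merged = []
    # groupby only groups adjacent items, so the sort has to come first.
    ordered = sorted(records, key=itemgetter("oclc"))
    for oclc, group in groupby(ordered, key=itemgetter("oclc")):
        campuses = []
        fields = []
        for rec in group:
            if rec["campus"] not in campuses:
                campuses.append(rec["campus"])
            # Assumed merge rule: keep each distinct field once.
            for field in rec["fields"]:
                if field not in fields:
                    fields.append(field)
        merged.append({"oclc": oclc, "campuses": campuses, "fields": fields})
    return merged


# Made-up sample data: two campuses hold the same work.
records = [
    {"oclc": "ocm00000002", "campus": "A", "fields": [("245", "Title two")]},
    {"oclc": "ocm00000001", "campus": "A", "fields": [("245", "Title one")]},
    {"oclc": "ocm00000001", "campus": "B",
     "fields": [("245", "Title one"), ("650", "Subject")]},
]
result = merge_by_oclc(records)
```

Here three input records collapse to two, and the merged record for the
shared work carries fields that no single campus copy had on its own. An
in-memory sort like this would not be a sensible choice for 13 million
records; at that size you would more likely stream from files that are
already sorted on the key.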
-sm
--
Stephen Meyer
Library Application Developer
UW-Madison Libraries
436 Memorial Library
728 State St.
Madison, WI 53706
[log in to unmask]
608-265-2844 (ph)
"Just don't let the human factor fail to be a factor at all."
- Andrew Bird, "Tables and Chairs"