On May 9, 2008, at 1:42 PM, Jonathan Rochkind wrote:

> The Blacklight code is not currently using XML or XSLT. It's indexing
> binary MARC files. I don't know its speed, but I hear it's pretty
> fast.

Right, I'm talking about the Java indexer we're working on, which we're hoping to turn into a plugin contrib module for Solr. It processes binary MARC files. We're getting about 150 records/second, but that's on an unfortunately throttled server, and we're munging each record significantly (replacing musical instrument and language codes with their English-language equivalents, calculating composition era, etc.).

Casey, you say you're getting indexing speeds of 1000 records/second? That's amazing! I really have to take a closer look at MarcThing. Could pymarc really be that much faster than marc4j? Or are we comparing apples to oranges, since we haven't normalized for the kinds of mapping we're doing and the hardware it's running on?

Bess

Elizabeth (Bess) Sadler
Research and Development Librarian
Digital Scholarship Services
Box 400129
Alderman Library
University of Virginia
Charlottesville, VA 22904
(434) 243-2305