On Dec 13, 2007, at 10:33 AM, Eric Lease Morgan wrote:

>> Put another way, if I want to use repository using
>> NET::OAI::Harvester to read repository data in a
>> form other than DC will I need to write an additional
>> module such as NET::OAI::Record::MARCXML?
>
> But I'm lazy, and even though it is not the best solution, I will
> explore another option. Specifically, I will use oai_dump (which
> comes with N::O::H), change the metadata scheme from oai_dc to
> marc21, run the script, and parse the resulting XML. If I'm lucky
> my parser will be able to be written as a SAX filter that can be added
> to the N::O::H distribution. In the meantime, at least I will have
> the data. Wish me luck.

After getting most of my MARCXML/SAX parser written, Ed Summers presented me with a couple of Perl modules that let me return MARC::Record objects from harvests of OAI repositories supporting the marc21 metadata schema. This is what I originally wanted to do.

Using this technology I was able to harvest the metadata (MARC records) of 70,000 University of Michigan digitized books (MBooks). I then fed them to an indexer -- Zebra -- that reads raw MARC very well, and provided a rudimentary interface to the index via SRU:

  http://infomotions.com/ii/

In the end the process was almost trivial and can easily be expanded to include other types of content.

Thank you to all who helped along the way!

--
Eric Lease Morgan
University Libraries of Notre Dame

(574) 631-8604
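
Editorial illustration of the harvest step described in the message above. This is a minimal sketch in Python, not the Perl modules the message refers to, and it makes no claim about how they work internally. It assumes a repository that answers a ListRecords request with the marc21 metadata prefix and wraps each record in MARCXML. The base URL, the sample response, and the function names build_request_url and extract_titles are all hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces used by OAI-PMH 2.0 responses and by MARCXML (MARC21 slim).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "marc": "http://www.loc.gov/MARC21/slim",
}

# Hypothetical, hand-written response for illustration only;
# this is not data from any real repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:0001</identifier>
      </header>
      <metadata>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <leader>00000nam a2200000 a 4500</leader>
          <controlfield tag="001">0001</controlfield>
          <datafield tag="245" ind1="1" ind2="0">
            <subfield code="a">An example title /</subfield>
            <subfield code="c">by an example author.</subfield>
          </datafield>
        </record>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def build_request_url(base_url, metadata_prefix="marc21"):
    """Build a ListRecords request URL for the given (hypothetical) base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(xml_text):
    """Return (identifier, title) pairs, reading the title from MARC field 245 $a."""
    root = ET.fromstring(xml_text)
    results = []
    for rec in root.iterfind("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = rec.findtext(
            "oai:metadata/marc:record"
            "/marc:datafield[@tag='245']/marc:subfield[@code='a']",
            default="",
            namespaces=NS,
        )
        results.append((identifier, title.strip()))
    return results


if __name__ == "__main__":
    print(build_request_url("http://example.org/oai"))
    for identifier, title in extract_titles(SAMPLE_RESPONSE):
        print(identifier, title)
```

A real harvest would also need to fetch each page over HTTP and follow resumption tokens until the repository reports no more records; the sketch leaves both out so that it runs without network access.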