I have been filling up a MyLibrary instance with OAI-accessible
content. So far I have stuffed about 270,000 items into the system and
provided searchable indexes to subsets of the data. See:
http://mylibrary.ockham.org/
My goals are twofold. First, I want to provide better access to National
Science Digital Library (NSDL) content. Second, I want to see how far
I can push MyLibrary.
To explore the first goal, I wrote an OAI harvesting program, crawled
the NSDL OAI Repository, and stuffed everything I found into MyLibrary
while cataloging/classifying every item with various facet/term
combinations. Next, I wrote a report against MyLibrary that created an
XML stream. I fed this stream to an indexer (swish-e), and thus created
seven distinct search engines:
* a journal index
* a theses and dissertations index
* an article index
* an ebook index
* an index of library science items
* an index of life science items
* an index of mathematics items
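
The harvest, classify, and report steps above can be sketched roughly
as follows. This is only an illustration in Python, not the actual
Perl harvester or the MyLibrary API. The sample response, the
FACET_RULES table, and the output element names (records, record,
title) are all made up for the example. Only the OAI-PMH and Dublin
Core namespaces come from the published specifications. A real
harvester would fetch each response over HTTP with verb=ListRecords
and keep resending the resumptionToken until none is returned.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical rules mapping a dc:type value to a facet/term pair.
FACET_RULES = {
    "journal": ("Formats", "Journals"),
    "thesis": ("Formats", "Theses and dissertations"),
}

# Made-up ListRecords response: one live record, one deleted record.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Journal of Examples</dc:title>
          <dc:type>journal</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted"><identifier>oai:example.org:2</identifier></header>
    </record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        # Skip records the repository has flagged as deleted.
        if header is None or header.get("status") == "deleted":
            continue
        records.append({
            "identifier": header.findtext("oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "types": [e.text for e in rec.iterfind(".//dc:type", NS) if e.text],
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token.strip() or None)


def group_by_term(records):
    """Group records under every term their dc:type values map to."""
    groups = defaultdict(list)
    for record in records:
        for dc_type in record["types"]:
            rule = FACET_RULES.get(dc_type.lower())
            if rule is not None:
                _facet, term = rule
                groups[term].append(record)
    return groups


def to_index_xml(records):
    """Serialize one group of records as an XML stream for an indexer."""
    root = ET.Element("records")
    for record in records:
        item = ET.SubElement(root, "record", {"id": record["identifier"]})
        ET.SubElement(item, "title").text = record["title"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    harvested, next_token = parse_list_records(SAMPLE_RESPONSE)
    for term, items in group_by_term(harvested).items():
        print(term, to_index_xml(items))
```

Each group produced by group_by_term() corresponds to one index, so
one XML stream per term would be handed to the indexer.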
I'm still exploring the second goal, but so far the database and the
object-oriented Perl modules are holding up and scaling quite well.
Fun!
--
Eric Lease Morgan
University Libraries of Notre Dame
(574) 631-8604