[Please redistribute as appropriate]
Distributed Full Text Search of Math Books Now Available
The university libraries of Cornell, Göttingen, and Michigan are pleased to
announce the first public availability of a significant body of mathematical
monographs with access provided through a distributed full text search
protocol. The virtual collection, comprising more than 2,000 volumes of
significant historical mathematical material (nearly 600,000 pages), resides
at three separate institutions and is served through interfaces to three
entirely different software systems. Public interfaces to the
collection may be found at:
http://www.hti.umich.edu/m/mathall/
and
http://mathbooks.library.cornell.edu/
These two public interfaces reflect separate development efforts at
Michigan and Cornell, each with its own perspective on how best to mediate
the search, and each built on the same underlying protocol.
The protocol for this distributed search was developed by the three
participating institutions over the last two and a half years, with generous
support provided by the National Science Foundation. Working from the roots
of the DIENST and the then-emergent OAI protocols, the project team focused
on creating a new protocol--dubbed CGM, for "Cornell, Göttingen,
Michigan"--that was consistent with OAI, borrowed from DIENST, and added
mechanisms for full text searching. The protocol and more project
information are available at http://www.library.cornell.edu/mathbooks/.
While our testing has found that network latency and the vicissitudes of
different production environments do present challenges, our results
indicate that a distributed full text search is certainly viable. We
believe that the CGM protocol is unusual in providing
production-level full text access to distributed collections.
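For developers curious how a client might cope with the latency noted above,
the following is a minimal sketch only, not the CGM implementation. The three
search functions, their result tuples, and the timeout value are all
hypothetical stand-ins. A real client would issue protocol requests over HTTP
to each site rather than call local functions.

```python
from concurrent.futures import ThreadPoolExecutor, wait
import time

# Hypothetical stand-ins for the three repositories. Each returns a list of
# (site, volume_id, hit_count) tuples; none of this is the real CGM API.
def search_cornell(query):
    return [("cornell", "vol-1", 3)]

def search_michigan(query):
    return [("michigan", "vol-7", 5)]

def search_goettingen(query):
    time.sleep(1.0)  # simulate a slow or distant server
    return [("goettingen", "vol-2", 1)]

BACKENDS = {
    "cornell": search_cornell,
    "michigan": search_michigan,
    "goettingen": search_goettingen,
}

def federated_search(query, backends, timeout=0.2):
    """Query every backend in parallel and merge whatever returns in time.

    Returns (hits, failed): hits sorted by hit_count, highest first, and the
    sorted names of backends that timed out or raised an error.
    """
    pool = ThreadPoolExecutor(max_workers=len(backends))
    futures = {pool.submit(fn, query): name for name, fn in backends.items()}
    done, not_done = wait(futures, timeout=timeout)

    hits, failed = [], []
    for future in done:
        try:
            hits.extend(future.result())
        except Exception:
            failed.append(futures[future])
    failed.extend(futures[future] for future in not_done)

    pool.shutdown(wait=False)  # do not block on the slow backends
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits, sorted(failed)

hits, failed = federated_search("lemma", BACKENDS)
```

The design choice illustrated here is to return partial results and report
which sites did not answer, rather than let one slow site hold up the whole
search.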
We invite feedback on the effectiveness of the protocol both from users of
the materials and from digital library developers. Although the protocol is
essentially a first prototype in significant need of extension and
refinement, we believe that our progress to date should be encouraging for
digital library developers interested in federating collections. The
collection itself is also a rich resource for the study of the history of
mathematics and a number of related disciplines. The collections at Cornell and
Michigan are both fully searchable. The Göttingen collection currently
offers bibliographic information and page images only, and Göttingen is
actively seeking funding to create full text for its volumes.
We welcome all feedback. Please send comments to [log in to unmask]
We hope that the documentation on the protocol (currently found at
http://www.library.cornell.edu/mathbooks/cgmverbs.xml) will spur others to
add CGM-capability to their systems. The software created through this
NSF-funded grant will be made available in a number of ways. The API
developed by Göttingen, allowing them to provide access through the Agora
software, will soon be available to other Agora sites. The functionality
developed by Michigan will be included in release 11 of the DLXS digital
library software (September 2003). Cornell is also exploring distribution
and support models for its electronic publishing software, DPubS, the system
also behind Project Euclid (http://ProjectEuclid.org). If you are
interested in using the raw protocol mechanisms at Cornell, Göttingen, and
Michigan in your development efforts, please contact:
Andrea Rapp, Göttingen <[log in to unmask]>
David Ruddy, Cornell <[log in to unmask]>
John P. Wilkin, Michigan <[log in to unmask]>
John Price Wilkin, Associate University Librarian, LIT