Just for fun, try Ockham Spell before it is retired and taken off the 'Net -- http://spell.ockham.org/?word=origami

In 2005 I worked with Jeremy Frumkin, Martin Halbert, and Ed Fox on an NSF grant to implement a set of Web Services for libraries. I implemented three services: an alerting service, an implementation of MyLibrary, and a spelling suggestion service.

Well, for various reasons it is long past time to take these services down, but before they go I'd like to highlight the spell service. Given a string (a word), the service will query locally configured dictionaries and return alternative spellings for the given word. Some of the dictionaries were pre-created. Others were generated from content collected from OAI-PMH servers. We used swish-e as an underlying indexer, and we used dict as the data store. What's really cool is that after all these years -- and with zero maintenance -- the service is still functional. Sure, nobody ever uses it, but it works just the same.
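
For anybody who would rather poke at the service from a script than a browser, here is a minimal Python sketch. The base URL and the "word" parameter come from the link above. Everything about the response is an assumption: the sample XML and the "suggestion" element name are made up to exercise the parser and are not the service's documented output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL taken from the post; the service is slated for retirement.
BASE_URL = "http://spell.ockham.org/"


def build_spell_url(word, base_url=BASE_URL):
    """Return the GET URL that asks the service for alternatives to `word`."""
    return base_url + "?" + urlencode({"word": word})


def parse_suggestions(xml_text, tag="suggestion"):
    """Collect the text of every <tag> element in the response.

    The element name is hypothetical; adjust it to whatever the
    service really returns.
    """
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()]


# Made-up response, used only to show the parsing step.
SAMPLE_RESPONSE = (
    "<spell><word>origamy</word>"
    "<suggestion>origami</suggestion>"
    "<suggestion>organic</suggestion></spell>"
)

if __name__ == "__main__":
    print(build_spell_url("origami"))
    print(parse_suggestions(SAMPLE_RESPONSE))
```

To talk to the live service, pass the result of build_spell_url() to urllib.request.urlopen() and hand the decoded body to parse_suggestions().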

The code for the three services is located on Google Code, and a couple of the services are documented in D-Lib Magazine:

  * Ockham Alert (http://code.google.com/p/ockham-alert/) - Ockham Alert is written in Perl and prefers MySQL as its database back-end and swish-e as its indexer. Access to the index is through a Web Services protocol called SRU (Search/Retrieve via URL). Outputs from the service include HTML, RSS, and email messages. This system was implemented as part of a National Science Foundation grant (DUE-0333601) called OCKHAM Library Network, Integrating the NSDL into Traditional Library Services.

  * Ockham MyLibrary (http://code.google.com/p/ockham-mylibrary/) - MyLibrary@Ockham is a system for providing access to indexes of OAI-accessible content. It is written in Perl in conjunction with a number of other open source technologies including MySQL, GNU Aspell, and WordNet. In a nutshell it works by harvesting data from OAI repositories, saving the resulting data to a (MyLibrary) database, indexing the content using Plucene, enhancing the database through a term frequency-inverse document frequency (TF-IDF) technique, providing access to the index via an SRU (Search/Retrieve via URL) server, and enhancing search results by providing alternative spellings and synonyms to query words.

  * Ockham Spell (http://code.google.com/p/ockham-spell/) - Given a word and the name of a domain-specific dictionary, this system will return alternative spellings for the word. Results are returned in an XML stream, and they are expected to be fed to the front-end of an index for query expansion. This system was implemented as part of a National Science Foundation grant (DUE-0333601) called OCKHAM Library Network, Integrating the NSDL into Traditional Library Services.

  * Exploiting "Light-weight" Protocols and Open Source Tools to Implement Digital Library Collections and Services by Xiaorong Xiang and Eric Lease Morgan (http://www.dlib.org/dlib/october05/morgan/10morgan.html) - This article describes the design and implementation of two digital library collections and services using a number of "light-weight" protocols and open source tools. These protocols and tools include OAI-PMH (Open Archives Initiative-Protocol for Metadata Harvesting), SRU (Search/Retrieve via URL), Perl, MyLibrary, Swish-e, Plucene, ASPELL, and WordNet. More specifically, we describe how these protocols and tools are employed in the Ockham Alerting service and MyLibrary@Ockham. The services are illustrative examples of how the library community can actively contribute to the scholarly communications process by systematically and programmatically collecting, organizing, archiving, and disseminating information freely available on the Internet. Using the same techniques described here, other libraries could expose their own particular content for their specific needs and audiences.
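
The query expansion idea mentioned above can be sketched in a few lines of Python. This is not the Perl code in the repositories, just an illustration under stated assumptions: the alternative spellings are OR'ed together as a CQL query and wrapped in a standard SRU 1.1 searchRetrieve request. The endpoint http://example.org/sru is a placeholder, and the function names are hypothetical.

```python
from urllib.parse import urlencode


def expand_query(word, alternatives):
    """OR the original word with its alternative spellings as a CQL query.

    Duplicates are dropped, and multi-word terms are wrapped in double
    quotes. Terms that themselves contain quotes are not handled here.
    """
    terms = []
    for term in [word, *alternatives]:
        if term not in terms:
            terms.append(term)
    quoted = ['"%s"' % t if " " in t else t for t in terms]
    return " or ".join(quoted)


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # The suggestions would normally come from the spell service.
    query = expand_query("origamy", ["origami", "origamy"])
    print(query)
    # http://example.org/sru is a placeholder, not a real Ockham endpoint.
    print(sru_search_url("http://example.org/sru", query))
```

Joining the terms with "or" widens the search so a misspelled query still finds records indexed under the correct spelling.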

Fun with code!

--
Eric Lease Morgan