Sweet!
I've been asked to create a "smart search" a.k.a "be like google without any
of the resources" ;) for our upcoming new library home page. One of the
things they're looking for is a "did you mean?" -- but plugging into the
google api only nets me 1,000 hits a day. Now, granted, I don't think our
library home page will suffer that much from _only_ 1,000 hits, but I'd
rather build something that can scale than not.
If it's a web service, it'd be a snap to integrate into my app - so I'm very
interested =).
I did see something on #code4lib about an ockham server?
Would/should that be part of this?
Andrew
Andrew Forman
University of Iowa Libraries
ISST Development
319 335 9152
-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Eric Lease Morgan
Sent: Tuesday, September 13, 2005 8:45 AM
To: [log in to unmask]
Subject: [CODE4LIB] spelling server
What do y'all think of the idea of a spelling server -- a Web service taking
a word as input and returning a list of alternative spellings.
MyLibrary@Ockham has indexed about 430,000 OAI records. These records have
been grossly classified into a number of domains such as mathematics, life
science, theses & dissertations, and a master domain consisting of all the
sub domains.
Taking a hint from Bill Moseley (of swish-e fame), I have read the indexes,
parsed out the individual words, and fed them to GNU ASPELL, a dictionary
program. It is then possible to query ASPELL and have it return alternative
spellings. We have incorporated this feature into [log in to unmask]
I could make this spell checking functionality available as a Web service.
The URL could look something like this:
http://spell.ockham.org/?word=origami
The output could look something like this:
<?xml version='1.0'?>
<spell>
<word>origami</word>
<spellings>
<spelling>origem</spelling>
<spelling>irrigam</spelling>
<spelling>obrigam</spelling>
<spelling>kirigami</spelling>
<spelling>ariguama</spelling>
</spellings>
</spell>
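
As a rough illustration of how a client might read that output, here is a minimal Python sketch. It assumes the response looks exactly like the sample above; the function name parse_spellings and the SAMPLE constant are made up for this example and are not part of any existing service.

```python
import xml.etree.ElementTree as ET

# Sample response copied from the proposed output format (assumed shape).
SAMPLE = """<?xml version='1.0'?>
<spell>
<word>origami</word>
<spellings>
<spelling>origem</spelling>
<spelling>irrigam</spelling>
<spelling>obrigam</spelling>
<spelling>kirigami</spelling>
<spelling>ariguama</spelling>
</spellings>
</spell>"""


def parse_spellings(xml_text):
    """Return (word, [spellings]) from a <spell> document.

    Hypothetical helper: it only understands the element names shown
    in the sample output.
    """
    root = ET.fromstring(xml_text)
    word = root.findtext("word")
    spellings = [el.text for el in root.findall("./spellings/spelling")]
    return word, spellings


word, spellings = parse_spellings(SAMPLE)
print(word, spellings)
```

Only the standard library is used, so the same few lines should drop into most Python clients.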
It would then be up to the client to do with the content of the spelling
elements as they desired. For example, the client could:
* spell check a document
* implement a Did You Mean? service a la Google
* incorporate the results into a Find More Like This One search
* enhance the results of an OPAC search
* feed selected words back to the spelling server
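
To make the Did You Mean? case in the list above concrete, here is a small Python sketch. It assumes the client already has the list of spellings in hand and that the server returns them best-first, which the proposal does not actually promise. The did_you_mean name is hypothetical.

```python
def did_you_mean(query, spellings):
    """Build a "Did you mean" prompt from a list of alternative spellings.

    Assumption: spellings are ordered most-likely first. Returns None
    when there is nothing useful to suggest.
    """
    # Skip suggestions that are just the query itself in another case.
    candidates = [s for s in spellings if s.lower() != query.lower()]
    if not candidates:
        return None
    return "Did you mean: %s?" % candidates[0]


print(did_you_mean("origami", ["origem", "irrigam", "obrigam"]))
```

A real client would probably also want to check whether the original query returned any hits before showing the prompt at all.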
Alternative URLs might include:
http://spell.ockham.org/?word=origami&domain=master
http://spell.ockham.org/?word=origami&domain=master&version=1.0
http://spell.ockham.org/?word=origami&domain=master&version=1.0&verbosity=5
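
For a client, building these URLs could look something like the Python sketch below. The parameter names (word, domain, version, verbosity) come from the URLs above; the build_url function and the idea that every parameter except word is optional are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Host taken from the proposed URLs; the service itself is hypothetical.
BASE_URL = "http://spell.ockham.org/"


def build_url(word, domain=None, version=None, verbosity=None):
    """Assemble a query URL, leaving out any parameter that is not given."""
    params = [("word", word)]
    if domain is not None:
        params.append(("domain", domain))
    if version is not None:
        params.append(("version", version))
    if verbosity is not None:
        params.append(("verbosity", verbosity))
    # urlencode takes care of escaping spaces and other special characters.
    return BASE_URL + "?" + urlencode(params)


print(build_url("origami"))
print(build_url("origami", domain="master", version="1.0", verbosity=5))
```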
Writing the underlying script would be easy. Articulating an XML stream as
output would be harder.
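
One possible way to produce that XML stream, sketched in Python rather than in whatever language the actual script would use: let an XML library build the elements so that escaping is handled for you. The spellings_to_xml function is hypothetical, and the element names are simply those from the sample output.

```python
import xml.etree.ElementTree as ET


def spellings_to_xml(word, spellings):
    """Serialize a word and its alternative spellings as a <spell> document."""
    root = ET.Element("spell")
    ET.SubElement(root, "word").text = word
    container = ET.SubElement(root, "spellings")
    for spelling in spellings:
        ET.SubElement(container, "spelling").text = spelling
    # encoding="unicode" returns a str; special characters are escaped.
    body = ET.tostring(root, encoding="unicode")
    return "<?xml version='1.0'?>\n" + body


print(spellings_to_xml("origami", ["origem", "kirigami"]))
```

The output is written on a single line without indentation, which is equivalent XML but not as readable as the hand-formatted sample.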
What do y'all thinque? It would be fun at the very least.
--
Eric Lease Morgan
University Libraries of Notre Dame
(574) 631-8604