A couple of points about Lucene features, in reply to Karen:

Lucene does do stemming natively, in the analyser (use the PorterStemFilter class, which is part of the Lucene distribution).

Fuzzy searching can be bizarre if you use the default level, but you can control the degree of desired fuzziness: if you're using the query parser, put a number between 0 and 1 after the tilde (e.g. "horse~0.6" is fuzzier than "horse~0.9"). I find 0.75 is about right (but I'm indexing raw multilingual OCR, so bizarre is good). For those who haven't seen it, there's more on query syntax here: http://lucene.apache.org/java/docs/queryparsersyntax.html .

I agree that what were once bells and whistles are now essential to the kind of search interface we need to build. My sense is that Lucene has the machinery to do just about everything you need in a good modern search interface, but it's up to the implementation to put all the pieces together.

Peter

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of K.G. Schneider
Sent: Tuesday, May 30, 2006 9:16 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] fun with kinosearch

> I'm not sure where stemming comes in (does Lucene do this?), it seems
> faceted browsing could be handled by something like Carrot2. Rumor
> has it Solr has faceting support somewhere, as well. At least,
> according to the 9s project. http://www.nines.org/
>
> -Ross.

Lucene doesn't have native stemming; it does do fuzzy search, but you don't want that. (Trussssssst me, through a funky series of events I recently evaluated Lucene with fuzzy search enabled, and it was bizarre.)

Lucene is used as a building block for other search engines. It does support quite a few capabilities. I have seen it used in conjunction with the Porter stemming algorithm and with spell-checkers of various flavors.

But again--and probably only because I have been testing search engines for several months and am starting to get a little cabin fever--I want to clarify that I'm not piling on the fact that Kino can't do it all. As a component, it could be great, and that it's in Perl is a biggie.

I was (awkwardly) addressing my concern that some fundamental search capabilities appeared to have been labeled "creeping featureitis." I would just be careful about that kind of terminology. I doubt Eric meant anything serious by it. I just know the long uphill battle it can be to provide quality search, and I wouldn't want someone as distinguished as Eric quoted in support of compromising the user experience.

K.G. Schneider
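
To make the tilde threshold discussed in this thread concrete, here is a minimal stand-alone sketch in Java. It does not call Lucene at all. It assumes a scoring rule of one minus the edit distance divided by the length of the shorter string, which is only an approximation of how older Lucene releases are understood to rank fuzzy candidates, and it keeps a candidate only when its score exceeds the threshold. The class name FuzzySketch, its methods, and the sample word list are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a stand-alone model of a fuzzy-match threshold.
// Nothing here is Lucene API; every name is hypothetical.
class FuzzySketch {

    // Levenshtein edit distance (insertions, deletions, substitutions),
    // computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed scoring rule: 1 - distance / length of the shorter string.
    static double similarity(String a, String b) {
        int shorter = Math.min(a.length(), b.length());
        if (shorter == 0) {
            return a.length() == b.length() ? 1.0 : 0.0;
        }
        return 1.0 - (double) editDistance(a, b) / shorter;
    }

    // Keep the indexed terms whose similarity to the query term
    // exceeds the threshold (the number after the tilde).
    static List<String> fuzzyMatches(String term, double minSimilarity,
                                     List<String> indexedTerms) {
        List<String> matches = new ArrayList<>();
        for (String candidate : indexedTerms) {
            if (similarity(term, candidate) > minSimilarity) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms =
            Arrays.asList("horse", "house", "hearse", "mouse", "cart");
        // A low threshold admits loosely related terms.
        System.out.println("0.5: " + fuzzyMatches("horse", 0.5, terms));
        // A high threshold admits only near-exact terms.
        System.out.println("0.9: " + fuzzyMatches("horse", 0.9, terms));
    }
}
```

Under these assumptions a lower number after the tilde lets in more distant terms, which is consistent with "horse~0.6" being fuzzier than "horse~0.9". On noisy OCR text a looser setting may recover misread words at the cost of more unrelated hits.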