Does Solr support Soundex? (Soundex was originally developed to
match alternate spellings of names.)
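
For anyone unfamiliar with it, here is a minimal sketch of classic American Soundex in plain Python. It is a standalone illustration only, not Solr's implementation; the function name `soundex` and the handling of non-ASCII letters (treated like vowels) are my own assumptions.

```python
# Digit assigned to each consonant group in classic American Soundex.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex code for a name ('' if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")    # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = CODES.get(ch, "")        # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code                     # a vowel resets prev, allowing repeats
        if len(result) == 4:
            break
    return (result + "000")[:4]         # pad short codes with zeros


if __name__ == "__main__":
    for n in ("Smith", "Smyth", "Robert", "Rupert"):
        print(n, soundex(n))
```

The point for author searching is that spelling variants such as "Smith" and "Smyth" collapse to one code, which is a different kind of matching than stemming.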
Keith
On Mon, Jun 13, 2011 at 8:08 PM, Jonathan Rochkind wrote:
> In a Solr-based search, stemming is done at indexing time, into fields with stemmed tokens.
>
> It seems typical in library-catalog-type applications based on Solr to have the default (or even only) searches run over these stemmed fields, thus 'auto-stemming' for the user. (Search for 'monkey', find 'monkeys' too, and vice versa.)
>
> I am curious how many people who have Solr-based catalogs (that is, search engines whose content comes mostly or entirely from MARC) use such stemmed fields ('auto-stemming') over their _author_ fields as well.
>
> There are pros and cons to this. There are certainly some things in an author field that would benefit from stemming (mostly various kinds of corporate authors, some of whose names end up looking like English-language phrases). There are also very many things in an author field that would not benefit from stemming, and thus when stemming is done it sometimes(/often?) results in false matches, "pluralizing" an author's last name in an inappropriate way, for instance.
>
> So, wanna say on the list: if you are using a Solr-based catalog, are you using stemmed fields for your author searches? Curious what people end up doing. If you've done anything more complicated or clever than just stem-or-not, let us know that too!
>
> Jonathan
>