On Feb 16, 2015, at 4:54 PM, Levy, Michael wrote:
> I think you can accomplish what you want by using ICUFoldingFilterFactory
> https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUFoldingFilterFactory
>
> which should simply perform ICU (cf http://site.icu-project.org/) based character folding (cf. http://www.unicode.org/reports/tr30/tr30-4.html)
>
> In schema.xml I generally have in both index and query:
>
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.ICUFoldingFilterFactory" />
For unknown reasons, I was unable to load the ICUFoldingFilterFactory, but my interface nonetheless works as expected. Getting there took a combination of two things. First, I needed to tell the indexer my content was Spanish; after doing so, Solr parsed things correctly. Second, I needed to explicitly tell my Web browser that the search form and returned content were using UTF-8. This was done in the HTTP Content-Type header, the HTML meta tag, and even in the HTML form itself. Geesh!

Through this whole process I also learned about Solr’s edismax (extended dismax) query parser. Edismax supports free-form queries as well as Boolean logic. solr++ But also solr+- because Solr is getting more and more and more complicated.

-- Eric “Lost In Chicago” Morgan
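
For anyone trying the same thing, "telling the indexer the content is Spanish" might look roughly like the schema.xml fragment below. This is only a sketch: the message does not say which field type or filters were actually used, so the name "text_es", the stopword file path, and the filter chain are assumptions modeled on the Spanish field type in Solr's sample schema.

```xml
<!-- Sketch only: the field type name, stopword path, and filter chain are
     assumptions based on Solr's sample schema, not the configuration
     described in the message above. -->
<fieldType name="text_es" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Same tokenizer as in the quoted suggestion -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase tokens so matching is case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drop common Spanish stopwords (file path is an assumption) -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_es.txt" format="snowball"/>
    <!-- Light Spanish stemming -->
    <filter class="solr.SpanishLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Because a single analyzer element is declared, the same chain applies at both index and query time. If accent-insensitive matching is still wanted without the ICU filter, solr.ASCIIFoldingFilterFactory ships with core Solr and could be added after the lowercase filter. As for the load failure itself, one possible cause is that the ICU filter lives in the analysis-extras contrib, whose jars have to be on Solr's classpath.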