How do I correctly index Spanish language documents using Solr? More specifically, I have tried two different "character folding" techniques to index non-ASCII characters, and neither one seem to work 100% of the time. Both techniques allow me to find some accented character but not others. For example, I use the ASCIIFoldingFilterFactory like this: <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true"> <analyzer type="index"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.ASCIIFoldingFilterFactory"/> <filter class="solr.StopFilterFactory" words="stopwordsspanish.txt" ignoreCase="true"/> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> <analyzer type="query"> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.ASCIIFoldingFilterFactory"/> <filter class="solr.StopFilterFactory" words="stopwordsspanish.txt" ignoreCase="true"/> <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> </fieldType> Or I use the MappingCharFilterFactory like this: <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true"> <analyzer type="index"> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory" words="stopwordsspanish.txt" ignoreCase="true"/> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> <analyzer type="query"> <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.StopFilterFactory" words="stopwordsspanish.txt" ignoreCase="true"/> <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/> <filter class="solr.LowerCaseFilterFactory"/> </analyzer> </fieldType> In both cases I can search for and find some words with non-ASCII characters and some not. For examnple, I can find documents with the word "presentará" but not necessarily all of them. I know my corpus includes the word "señor" but I can never find it. What might I be doing wrong? -- Eric Morgan