You could theoretically use Solr synonyms to expand the actual sharp
character (♯) to BOTH "#" AND "sharp" at index time. At query time I
guess you'd need to expand it to just one or the other -- I think
expanding to two things at query time is going to be a mess. I haven't
tried using Solr synonyms to expand something to more than one
alternative myself. I think the Solr synonym filter supports it, but
based on what I know of how Solr works, I expect there will be some
gotchas.
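
To make the idea concrete, here is a toy Python sketch of index-time
expansion. It is not Solr and makes no claim about how the synonym
filter behaves internally: the function names are made up, tokenizing is
just whitespace splitting, and it ignores token positions and phrase
queries entirely (which is exactly where I'd expect the gotchas).

```python
SHARP = "\u266f"  # the real musical sharp sign (♯), not "#"

def index_tokens(text):
    """Toy index-time analysis: expand a word containing ♯ to BOTH forms."""
    tokens = set()
    for word in text.lower().split():
        if SHARP in word:
            # "f♯" is indexed as "f#" and also as the two words "f", "sharp"
            tokens.add(word.replace(SHARP, "#"))
            tokens.update(word.replace(SHARP, " sharp ").split())
        else:
            tokens.add(word)
    return tokens

def query_tokens(text):
    """Toy query-time analysis: map ♯ to one form only ("#")."""
    return set(text.lower().replace(SHARP, "#").split())

def matches(query, title):
    """A title matches when every query token was indexed for it."""
    return query_tokens(query) <= index_tokens(title)

title = "Concerto in F" + SHARP + " minor"
for q in ("F sharp", "F#", "F" + SHARP, "G#"):
    print(repr(q), matches(q, title))
```

In this toy model the first three queries match and "G#" does not,
because the expansion to two alternatives happens only on the index side.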

I _do_ notice actual sharp and flat symbols in my library MARC data for
musical pieces; catalogers apparently do enter them sometimes. Most
users probably don't know how to (or won't think to) enter sharp and
flat characters directly, so if it's important that these titles be
findable including the sharp/flat part, it seems like something has to
be done. But I haven't gotten to it yet. (Unless maybe all these library
records already have alternate titles listed in 246 or wherever, using
straight ASCII of some kind; I don't know.)

In general, I've been able to avoid having to expand to multiple
synonyms -- but I can't really do that with ♯, #, and 'sharp', I think,
precisely because '#' is not always a sharp sign. It can be other things
too, so you don't want to collapse them all....

Wait, maybe just map ♯ to "#"? At both query and index time. Then the
user can't search for "F sharp", but they can search for either "F♯" or
"F#", and both will match the original source "F♯". That seems the
simplest solution. Although it would still be neat to play around with
synonym expansion to see if you can make "F sharp" at query time match
too.
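
Roughly what the simple mapping amounts to, again as a toy Python sketch
rather than actual Solr configuration (the names are hypothetical, and
whitespace splitting stands in for whatever tokenizer the field really
uses): the same normalization runs on the indexed text and on the query.

```python
# Map the real sharp sign (♯, U+266F) to "#"; nothing else is touched.
SHARP_TO_HASH = str.maketrans({"\u266f": "#"})

def analyze(text):
    """Same toy analysis for index time and query time."""
    return text.translate(SHARP_TO_HASH).lower().split()

def simple_match(query, title):
    """A title matches when all analyzed query tokens appear in it."""
    return set(analyze(query)) <= set(analyze(title))

title = "Concerto in F\u266f minor"
print(simple_match("F\u266f", title))   # query typed with the real sharp sign
print(simple_match("F#", title))        # query typed with "#"
print(simple_match("F sharp", title))   # spelled-out query
```

Under these assumptions the first two queries match and "F sharp" does
not, which is the trade-off described above.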

On 5/31/2011 12:05 PM, Thomas Dowling wrote:
> Many thanks.
>
> I like the idea of catching the sharp and flat symbols - the only problem
> is that lazy music students tend to use "#" and "b".  ("Concerto in F#
> minor for Bb Bass Clarinet").
>
> Thomas
>
>
> On 05/31/2011 11:59 AM, Jonathan Rochkind wrote:
>> Multi-word synonyms are tricky.
>>
>> You probably want to make sure this synonym is only expanded at index
>> time, and not at search time. See some background in the
>> SynonymFilterFactory section of
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
>>
>> I think the synonym approach is a fine way to search for Greek letters by
>> name; it's possible some of the new Unicode stuff in Solr 3.1 might expand
>> Greek letters too, but I think actually probably not (because you don't
>> necessarily want that in the general case), so I think synonyms are probably
>> your best bet. (Same for things like expanding the musical sharp or flat
>> glyph to "sharp" or "flat", which I've considered).
>>