Till, Bess, Ralph,
>> Assuming the algorithm for enriching the spellings of a word (from PinYin to
Chinese) exists, will the result include both forms, PinYin AND non-PinYin
Chinese transliteration, and will BOTH forms be indexed?
>> The principle of indexing different spellings of a certain word already exists
in name authority records, where a name (e.g. Pushkin) has over 30 different
spellings. The string of different names can be expanded (Ralph's work
with VIAF and with fuzzy logic in WorldCat/identities is definitely relevant
here).
Maybe something is already underway (?)
>> How large will the resulting index be? Manageable for small to medium
installations of VuFind?
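
To make the "both forms indexed" question concrete, here is a rough sketch in
plain Python (not Solr code) of what I have in mind: every known spelling of a
word is written to the index at the same token position. The variant table,
the function names and the two sample records are all made up for
illustration; a real setup would take its variants from an authority file or
from the transliteration algorithm, if one exists.

```python
from collections import defaultdict

# Hypothetical variant table: each group lists spellings that should match
# one another. Illustrative entries only, not taken from an authority file.
VARIANT_GROUPS = [
    ["pushkin", "puschkin", "pouchkine"],
    ["beijing", "peking", "北京"],
]


def build_variant_map(groups):
    """Map every spelling to the full, sorted list of spellings in its group."""
    variant_map = {}
    for group in groups:
        forms = sorted(set(form.lower() for form in group))
        for form in forms:
            variant_map[form] = forms
    return variant_map


def expand_tokens(tokens, variant_map):
    """Return (position, term) pairs; variants share the original position."""
    expanded = []
    for position, token in enumerate(tokens):
        key = token.lower()
        # Unknown words are indexed unchanged.
        for form in variant_map.get(key, [key]):
            expanded.append((position, form))
    return expanded


def build_index(docs, variant_map):
    """Build a toy inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for _, term in expand_tokens(text.split(), variant_map):
            index[term].add(doc_id)
    return index


if __name__ == "__main__":
    docs = {1: "Peking opera", 2: "Pushkin poems"}
    index = build_index(docs, build_variant_map(VARIANT_GROUPS))
    for term in sorted(index):
        print(term, sorted(index[term]))
```

In this toy case the two records contain four distinct words, but the index
ends up with eight terms, because each name carries all of its variants. That
growth is exactly what the index size question above is about.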
Ya'aqov Ziso, Electronic Resource Management Librarian, Rowan University 856
256 4804
On 10/29/09 11:34 AM, "Till Kinstler" <[log in to unmask]> wrote:
> Bess Sadler schrieb:
>
>> > So, thoughts? Anyone know more about this than I do and want to speak up?
>
> I'd second Demian's and Jonathan's statements: Do that in Solr by using
> a Filter (either at indexing or search time).
> You want to solve that using an algorithm that translates american
> transcription into chinese, correct? If you have that algorithm (is
> there one?), it's a perfect job for a filter and I guess there are use
> cases outside libraryland as well. It's not only us dealing with
> transcription of chinese...
> If I misunderstood your approach and you want to use a dictionary to map
> the different transcriptions, solr.SynonymFilterFactory could provide a
> solution.
> (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory)
>
>
> Till