I wonder if you could do it in Solr itself, instead of in SolrMarc, as
an input filter? (I forget the exact technical term for this, but
there's something you can configure in Solr to do this kind of task.)
You wouldn't need to get it built into the Solr distro to do that: Solr
lets you put your own custom plugins like this in the lib directory and
reference them in schema.xml and such. I think. This is just a
brainstorming idea; I don't know enough details about Solr to say for
sure whether it would work well. But if it would, I wonder if putting
it in Solr itself is preferable to putting it in SolrMarc, for wider
sharing/utility of the code with anyone using Solr but not necessarily
using SolrMarc.
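
To make the idea a bit more concrete, here is a rough, untested sketch
of what the schema.xml side might look like. The field type name
"text_pinyin" and the class "org.example.PinyinVariantFilterFactory"
are made up for illustration (no such filter exists as far as I know);
someone would have to write it and drop the jar into the lib directory.
Only the solr.* classes are stock Solr.

```xml
<!-- Sketch only: "text_pinyin" and the org.example class are hypothetical. -->
<fieldType name="text_pinyin" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Hypothetical custom plugin: would add the Chinese-rules Pinyin
         form next to the North American form, so both get indexed. -->
    <filter class="org.example.PinyinVariantFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The conversion would only run at index time here, so whichever form
someone types at query time should have something to match against.
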
Bess Sadler wrote:
> By sending this here I hope I'm going to hit everyone on the
> blacklight, vufind, and solrmarc mailing lists, and maybe some other
> interested parties.
>
> Our East Asian Languages Librarian has approached me with a problem he
> wants to see solved. According to him, the typical North American
> library cataloging rules for constructing Pinyin transliterations are
> different from the rules that are used in China. What this means is
> that native Chinese speakers have a lot of trouble searching our
> catalog (it is "practically unusable" was his exact quote). His
> proposal, and I think it's a good one, is that since we're re-indexing
> our records into solr anyway, we could apply at index time an
> algorithm to convert North American Pinyin to Chinese rules Pinyin,
> index both values, and thus make the catalog much more useful to an
> under-served population. This seems like a great suggestion to me, but
> before I start devoting development cycles to it I wanted to poll the
> community... is there a more obvious answer that I'm not seeing? Has
> anyone solved this already?
>
> What's the right place for such a piece of code? Solrmarc seems the
> obvious place to me. As it has been described to me so far, this
> doesn't seem like an issue affecting people outside the library realm,
> which makes it seem too niche and community-specific to get it built
> into the lucene codebase, but I could be wrong about that. Maybe it
> would be better as a lucene contrib library?
>
> So, thoughts? Anyone know more about this than I do and want to speak up?