I'm about to start writing code to apply NACO normalization to
strings (not for field-to-field comparisons, but for sorting things
correctly). I was driven to this by a complaint about how some Arabic
manuscript titles are sorting.

My end goal is a Solr filter, so I'm most interested in Java code.

It doesn't look "hard" so much as "long and error-prone", so I'm hoping
someone has already done this (or at least has a character map that I can
easily convert to Java).
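
To make the "character map" idea concrete, here is a rough sketch of the
shape I have in mind. Everything in it is an assumption for illustration:
the class name NacoSortKeySketch, the six-entry map, and the choice to fold
to lowercase are placeholders, not the actual NACO tables or rules, which
are much longer and would need to be checked against the spec.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a map-driven sort-key normalizer. NOT the real NACO rules.
final class NacoSortKeySketch {

    // Illustrative entries only: letters that Unicode decomposition alone
    // will not reduce to a base letter. A real map would be far longer.
    private static final Map<Character, String> CHAR_MAP = Map.of(
        '\u00C6', "AE", '\u00E6', "ae",   // AE ligature
        '\u00D8', "O",  '\u00F8', "o",    // O with stroke
        '\u0141', "L",  '\u0142', "l");   // L with stroke

    static String normalize(String input) {
        // 1. Apply the explicit character map.
        StringBuilder mapped = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String replacement = CHAR_MAP.get(c);
            if (replacement != null) {
                mapped.append(replacement);
            } else {
                mapped.append(c);
            }
        }
        // 2. Decompose, then drop combining marks (accents, Arabic vowel marks).
        String decomposed = Normalizer.normalize(mapped, Normalizer.Form.NFD);
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // 3. Fold case (direction chosen arbitrarily for this sketch).
        String folded = stripped.toLowerCase(Locale.ROOT);
        // 4. Turn each run of anything that is not a letter or digit into one space.
        return folded.replaceAll("[^\\p{L}\\p{N}]+", " ").trim();
    }
}
```

Something like NacoSortKeySketch.normalize(title) would then be called from
inside the Solr filter to produce the sort key.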

I've seen the code at the
but it's 10 years old and doesn't cover much of the non-Latin material.

Evergreen has a Perl
that's probably where I'll start if no one has anything else.


Bill Dueber
Library Systems Programmer
University of Michigan Library