Very interesting, Ralph. Are you / OCLC offering that code under any particular license(s)? (The Evergreen code, for what it's worth, has a project-level license stating that Evergreen code is offered under the GPL v2 with the "or later" clause.)

>>> "LeVan,Ralph" <[log in to unmask]> 4/11/2012 12:04 PM >>>
I'm pretty sure attachments don't work on the list, so I'm just pasting my NACO normalizer below. Note that there are 2007 versions of the normalize() method in there. This is used for all the VIAF and Identities indexing.

Ralph

/*
 * NacoNormalize.java
 *
 * Created on July 11, 2007, 10:52 AM
 *
 * To change this template, choose Tools | Template Manager
 * and open the template in the editor.
 */

package ORG.oclc.util;

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Bill Dueber
Sent: Wednesday, April 11, 2012 11:27 AM
To: [log in to unmask]
Subject: Modern NACO Normalization (esp. in java?)

I'm about to embark on trying to write code to apply NACO normalization to strings (not for field-to-field comparisons, but for correctly sorting things). I was driven to this by a complaint about how some Arabic manuscript titles are sorting.

My end goal is a Solr filter, so I'm most interested in Java code. It doesn't look "hard" so much as "long and error-prone", so I'm hoping someone has already done this (or at least has a character map that I can easily convert to Java).

I've seen the code at the OCLC<http://www.oclc.org/research/activities/naco/default.htm>, but it's 10 years old and doesn't have a lot of the non-Latin stuff in it. Evergreen has a Perl implementation<http://git.evergreen-ils.org/?p=Evergreen.git;a=blob;f=Open-ILS/src/perlmods/lib/OpenILS/Utils/Normalize.pm>; that's probably where I'll start if no one has anything else.

Anyone?

--
Bill Dueber
Library Systems Programmer
University of Michigan Library