Yitzchak:

This is great news! I'd love to see you share the code with the greater
community. This may prove particularly useful for the automated addition
of non-Roman data into authority records for NACO members (see [1]; see
also [2]).

As far as other algorithms go, you could try getting in touch with Dave
Reser at LC's CPSO. You may also want to look at IITM's "Local Language
Editor" [3].

[1] http://www.loc.gov/catdir/cpso/nonlatinfaq.html
[2] http://www.loc.gov/catdir/cpso/nonlatin.pdf
[3] http://acharya.iitm.ac.in/software/r2leditor.php

Mark A. Matienzo
Applications Developer, NYPL Labs
The New York Public Library
+1 (212) 592-7176

On Fri, Aug 15, 2008 at 10:32 AM, Yitzchak Schaffer <[log in to unmask]> wrote:
> BS"D
>
> Greetings all:
>
> It occurs to me now that I might have checked for existing work on the lists
> before I did this, but anyway -- we are in the finishing stages of creating
> scripts that will automatically convert a library's existing Romanized MARC
> Hebrew fields (e.g. "Sefer {dotb}Hatan Torah") into Hebrew-script, and add
> them to the records already in the ILS. It's quite accurate; not
> bulletproof, but at least it's a way to quickly get Hebrew script into
> thousands of Roman-only records, where many Hebrew users (including staff)
> may not understand the transliteration rules 100%.
>
> The Hebrew conversion itself is done by a PHP script (haven't finished
> learning Perl) acting on a MARC dump of Roman-only Hebrew records in MRK
> (broken MARCedit) format. This outputs two files of converted fields: an
> XML file for proofing, and a tab-delimited text file for the inputting
> script to devour. This inputting is done by an Expect script using the
> character-based ILS client.
>
> We are an III shop. This could presumably be adapted easily enough for
> another ILS, whether using Expect or direct manipulation of database tables.
> (I'm not volunteering, though...)
> It would probably be easy enough to adapt
> to another language also, assuming that language were at least as
> predictable in MARC as Hebrew. (It's pretty good - my list of "manual
> override" words that the auto-algorithm botches is now totaling about 35 in
> preliminary testing.)
>
> Note that I can't imagine automating the other direction, Hebrew- to
> Roman-script, unless there's some algorithm for this already floating around
> out there.
>
> If anyone's interested, I'll clean up the code and open-source it.
>
> Cheers, Shabbat shalom,
>
> --
> Yitzchak Schaffer
> Systems Librarian
> Touro College Libraries
> 33 West 23rd Street
> New York, NY 10010
> Tel (212) 463-0400 x5230
> Fax (212) 627-3197
> [log in to unmask]
>