Here is a bit of existing work I know of in this area.
MARC::Detrans: De-transliterate text and MARC records
http://search.cpan.org/dist/MARC-Detrans/
There is a paper, "Cyril: expanding the horizons of MARC21" by Jacobs, Jane
W.; Summers, Ed; Ankersen, Elizabeth, in Library Hi Tech, Volume 22, Number
1, 2004, pp. 8-17. It has a good discussion of the issues this creates.
The Cyril software no longer seems to be available.
Sincerely,
David Bigwood
Catalogablog http://catalogablog.blogspot.com
Twitter LPI_Library
> Greetings all:
>
> It occurs to me now that I might have checked for existing work on the lists
> before I did this, but anyway -- we are in the finishing stages of creating
> scripts that will automatically convert a library's existing Romanized MARC
> Hebrew fields (e.g. "Sefer {dotb}Hatan Torah") into Hebrew-script, and add
> them to the records already in the ILS. It's quite accurate; not
> bulletproof, but at least it's a way to quickly get Hebrew script into
> thousands of Roman-only records, where many Hebrew users (including staff)
> may not understand the transliteration rules 100%.
>
> The Hebrew conversion itself is done by a PHP script (haven't finished
> learning Perl) acting on a MARC dump of Roman-only Hebrew records in MRK
> (broken MARCedit) format. This outputs two files of converted fields: an
> XML file for proofing, and a tab-delimited text file for the inputting
> script to devour. This inputting is done by an Expect script using the
> character-based ILS client.
>
> We are an III shop. This could presumably be adapted easily enough for
> another ILS, whether using Expect or direct manipulation of database tables.
> (I'm not volunteering, though...) It would probably be easy enough to adapt
> to another language also, assuming that language were at least as
> predictable in MARC as Hebrew. (It's pretty good - my list of "manual
> override" words that the auto-algorithm botches is now totaling about 35 in
> preliminary testing.)
>
> Note that I can't imagine automating the other direction, Hebrew- to
> Roman-script, unless there's some algorithm for this already floating around
> out there.
>
> If anyone's interested, I'll clean up the code and open-source it.
>
> Cheers, Shabbat shalom,
>
> --
> Yitzchak Schaffer
> Systems Librarian
> Touro College Libraries
> 33 West 23rd Street
> New York, NY 10010
> Tel (212) 463-0400 x5230
> Fax (212) 627-3197
>
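
For anyone who has not worked with MRK files, here is a rough sketch of the first half of the pipeline described in the quoted message: reading MRK lines and writing the Romanized fields to a tab-delimited file. It is in Python rather than the PHP the original uses, and it is not the author's code. The function names, the column layout, the use of field 001 as the record key, and the sample record are all assumptions made for illustration. The actual Romanized-to-Hebrew step is not attempted; `convert` is a placeholder that returns its input unchanged.

```python
import csv
import io


def parse_mrk_field(line):
    """Split one MRK line such as '=245  10$aTitle' into (tag, indicators, subfields)."""
    tag = line[1:4]
    body = line[6:]  # skip '=', the 3-character tag, and the two spaces after it
    if tag == "LDR" or tag < "010":
        # Leader and control fields carry no indicators or subfields.
        return tag, "", [("", body)]
    indicators, rest = body[:2], body[2:]
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split("$") if chunk]
    return tag, indicators, subfields


def romanized_rows(mrk_text, tags=("245",), convert=lambda text: text):
    """Yield (record_id, tag, romanized, converted) for each field in `tags`.

    `convert` is a hypothetical stand-in for the Hebrew conversion step.
    """
    record_id = ""
    for line in mrk_text.splitlines():
        if not line.startswith("="):
            continue  # blank lines separate records
        tag, _indicators, subfields = parse_mrk_field(line)
        if tag == "LDR":
            record_id = ""  # a new record starts at its leader
        elif tag == "001":
            record_id = subfields[0][1]
        elif tag in tags:
            romanized = " ".join(value for _code, value in subfields)
            yield record_id, tag, romanized, convert(romanized)


def to_tsv(rows):
    """Serialize rows as tab-delimited text, one field per line."""
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up sample record; the record number is invented for this example.
sample = (
    "=LDR  00000nam a2200000 a 4500\n"
    "=001  b1000001\n"
    "=245  10$aSefer {dotb}Hatan Torah\n"
)
rows = list(romanized_rows(sample))
tsv = to_tsv(rows)
```

The tab-delimited output keeps the record key next to each field so that a separate loading script could match converted fields back to records, which is the role the Expect script plays in the quoted message.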