If you want to work with MARC records, you can use MARC::Detrans (http://search.cpan.org/~esummers/MARC-Detrans-1.41/).
The trick is that you will need a config file for Persian. You also have to decide whether you want to use MARC-8 or UTF-8 before you construct the config file. We’ve done config files for Urdu (I’ll send you one off list), but not for Persian. I expect you could patch up the Urdu file to make a Persian one without too much bother. It will not be perfect, but it will probably be respectable. If you have a Persian speaker handy to spell-check and tweak the results, it will do really well.
One problem with the config files for UTF-8 is that any Romanized letter carrying a diacritical mark can be encoded in two different ways (a base letter plus a combining mark, or a single precomposed character), and you may need a rule for each. For example, in Urdu:
<rule>
<roman>ḥ</roman>
<marc>?</marc>
</rule>
may look identical to
<rule>
<roman> ḥ </roman>
<marc>?</marc>
</rule>
But the first ḥ = 0068 + 0323 (h + combining dot below),
whereas the second ḥ = 1E25 (a single precomposed character).
This little snake in the grass really goofed me up for a while, and it makes the config file look duplicative when it's not.
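If it helps to see the difference, here is a minimal Python sketch (not part of MARC::Detrans, just the standard unicodedata module) showing that the two forms compare as unequal until they are normalized. The code_points helper is a made-up name for illustration. Normalizing your input to one form before it reaches the config file might let you keep a single rule, but that is an assumption about your workflow, not something I have tested with MARC::Detrans.

```python
import unicodedata

# The two encodings of "h with dot below" described above.
decomposed = "h\u0323"   # U+0068 LATIN SMALL LETTER H + U+0323 COMBINING DOT BELOW
precomposed = "\u1e25"   # U+1E25 LATIN SMALL LETTER H WITH DOT BELOW


def code_points(text):
    """Hypothetical helper: list the code points in a string, e.g. 'U+0068 U+0323'."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# They render the same but are different strings of different lengths.
print(code_points(decomposed), "vs", code_points(precomposed))
print("equal as-is:", decomposed == precomposed)

# NFC composes base + combining mark into the single character;
# NFD splits the single character back into base + combining mark.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)
print("equal after NFC:", nfc == precomposed)
print("equal after NFD:", nfd == decomposed)
```

Running code_points over a suspicious rule in the config file is a quick way to tell which of the two forms you actually have.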
Anyway, if you’re interested in this alternative, I’ll be glad to provide whatever assistance I can.
JJ
-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Craig Franklin
Sent: Tuesday, January 22, 2013 8:52 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] : Persian Romanization table
I think that looking for "English" might be a red herring; what you want is
a transliteration from Persian in the Arabic script to Persian in the Latin
script.
That said, a quick look at Wikipedia indicates that this might not be as
straightforward a task as one might expect:
http://en.wikipedia.org/w/index.php?title=Romanization_of_Persian&oldid=532605934
Cheers,
Craig
On 23 January 2013 08:30, Han, Yan <[log in to unmask]> wrote:
> Hello, All,
> I have a project to deal with Persian materials. I have already used the
> Google Translate API to translate. Now I am looking for an API to
> transliterate/Romanize (NOT translate) Persian to English (not English to
> Persian). In other words, I have Persian in, and English out.
> There is a Romanization table (Persian romanization table - Library of
> Congress: http://www.loc.gov/catdir/cpso/romanization/persian.pdf).
>
> For example, کتاب should output as Kitāb.
>
> My finding is that existing tools only do the opposite:
>
> 1. Google Transliterate: you enter English, output Persian (Input
> “Bookmark”, output “??????? “, Input “??????? “, output “??????? “)
>
> 2. OCLC language: the same as Google Transliterate.
>
> 3. http://mylanguages.org/persian_romanization.php : works, but no
> API.
>
> Does anyone know whether such an API exists?
>
> Thanks much,
>
> Yan
>
>