Hello all, and Jane,
First, I would like to thank Jane Jacobs at Queens Library for providing me with the Urdu Romanization table.
As we are working on creating Persian/Pashto transliteration software, my Persian language expert has the following question:
"Following our conversation about transliterating Persian into Roman letters, I have run into a big problem: since the short vowels are not written on or under the letters in Persian, how can a machine read a Persian word? For example, take the word "پدر"; to the machine this word is PDR, because it cannot read the vowels. There is no rule for the short vowels in the Persian language, so the machine cannot tell whether the first letter is "pi", "pa" or "po". Is there any way to overcome this obstacle?"
It seems to me that we are missing a critical piece of information here (something like a dictionary). Without it, there is no way to get a good transliteration from the computer, and we will need a Persian speaker to check and correct the computer's output.
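To make the dictionary idea concrete, here is a minimal sketch in Python of a lookup with a fallback. Everything in it is an assumption for illustration: the names `LEXICON`, `LETTERS`, and `romanize_word` are hypothetical, the two lexicon entries are hand-typed examples, and the letter table covers only the letters used here. Words found in the lexicon get their full romanization; anything else falls back to the bare letter-by-letter skeleton and is flagged so that a Persian speaker can check it.

```python
from typing import NamedTuple


class Romanized(NamedTuple):
    text: str           # romanized form (full, or consonant skeleton)
    needs_review: bool  # True when a human should supply the short vowels


# Hypothetical lexicon: Persian word -> romanization with short vowels.
# These two entries are illustrative only; a real list would be much larger.
LEXICON = {
    "\u067e\u062f\u0631": "pidar",        # پدر
    "\u06a9\u062a\u0627\u0628": "kitāb",  # کتاب
}

# Tiny illustrative subset of a letter table (not the full LC table).
LETTERS = {
    "\u067e": "p",  # پ
    "\u062f": "d",  # د
    "\u0631": "r",  # ر
    "\u06a9": "k",  # ک
    "\u062a": "t",  # ت
    "\u0627": "ā",  # ا (treated here only as the long vowel)
    "\u0628": "b",  # ب
}


def romanize_word(word: str) -> Romanized:
    """Use the lexicon when possible; otherwise emit a flagged skeleton."""
    if word in LEXICON:
        return Romanized(LEXICON[word], needs_review=False)
    # Unknown word: map letter by letter, leaving unmapped characters as-is.
    skeleton = "".join(LETTERS.get(ch, ch) for ch in word)
    return Romanized(skeleton, needs_review=True)


def romanize_text(text: str) -> list[Romanized]:
    """Romanize each whitespace-separated word independently."""
    return [romanize_word(w) for w in text.split()]
```

With this sketch, `romanize_word("\u067e\u062f\u0631")` returns the lexicon form, while an unlisted word such as `"\u062f\u0631"` (در) comes back as the skeleton `dr` with `needs_review=True`. The flag is the point of the design: the machine does what it can, and only the unresolved words go to the human reviewer.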
Any suggestions?
Thanks,
Yan
-----Original Message-----
From: Jacobs, Jane W [mailto:[log in to unmask]]
Sent: Wednesday, January 23, 2013 6:28 AM
To: Han, Yan
Subject: RE: : Persian Romanization table
Hi Yan,
As per my message to the listserv, here are the config files for Urdu. If you do a Persian config file, I'd love to get it and, if possible, add it to the MARC::Detrans site.
Let me know if you want to follow this road.
JJ
-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Han, Yan
Sent: Tuesday, January 22, 2013 5:31 PM
To: [log in to unmask]
Subject: [CODE4LIB] : Persian Romanization table
Hello, All,
I have a project dealing with Persian materials. I have already used the Google Translate API to translate them. Now I am looking for an API to transliterate/Romanize (NOT translate) Persian to English (not English to Persian). In other words, I have Persian in, and English out.
There is a Romanization table from the Library of Congress (Persian romanization table): <http://www.loc.gov/catdir/cpso/romanization/persian.pdf>.
For example,
کتاب should output as Kitāb
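As a rough illustration of what a table-driven approach would look like, here is a minimal Python sketch. The table below is a hand-typed subset of a few letters, not the full Library of Congress table, and the names `LETTER_TABLE` and `romanize_letters` are hypothetical. Mapping alif to "ā" everywhere is also a simplifying assumption.

```python
# Tiny illustrative letter table: Persian letter -> Roman letter.
# Assumption: this is a hand-picked subset, not the complete LC table.
LETTER_TABLE = {
    "\u06a9": "k",  # ک
    "\u062a": "t",  # ت
    "\u0627": "ā",  # ا (treated here only as the long vowel)
    "\u0628": "b",  # ب
    "\u067e": "p",  # پ
    "\u062f": "d",  # د
    "\u0631": "r",  # ر
}


def romanize_letters(word: str) -> str:
    """Map each written letter through the table; mark unknown letters with '?'."""
    return "".join(LETTER_TABLE.get(ch, "?") for ch in word)


print(romanize_letters("\u06a9\u062a\u0627\u0628"))  # prints: ktāb
```

Note that a pure letter lookup gives "ktāb" rather than "Kitāb": the short vowel is not written in the Persian text, so the table alone cannot supply it.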
My finding is that existing tools only do the opposite:
1. Google Transliterate: you enter English and it outputs Persian (input Bookmark, output ???????; input ???????, output ???????)
2. OCLC language: the same as Google Transliterate.
3. http://mylanguages.org/persian_romanization.php : works, but no API.
Does anyone know whether such an API exists?
Thanks much,
Yan