| 
Hello, Charles,
The plan is to write a program that uses a predefined language-mapping XML file.  Each language needs its own predefined mapping XML file, so that any language can have its own mapping (extensible for future language transliteration).  In this case, that means a Persian mapping XML file and a Pashto mapping XML file.
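A minimal sketch of what such a per-language mapping file and loader might look like. The XML layout (a `mapping` root with `map` elements carrying `from` and `to` attributes) and the function names are assumptions made for illustration, not an existing format, and the four entries are only a small sample rather than the full Library of Congress table.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping file content; the element and attribute names are
# assumptions, and only four letters are listed for illustration.
PERSIAN_MAPPING_XML = """\
<mapping language="Persian">
  <map from="ک" to="k"/>
  <map from="ت" to="t"/>
  <map from="ا" to="ā"/>
  <map from="ب" to="b"/>
</mapping>
"""


def load_mapping(xml_text):
    """Parse a mapping document into a {source character: roman string} dict."""
    root = ET.fromstring(xml_text)
    return {m.get("from"): m.get("to") for m in root.iter("map")}


def transliterate(text, mapping):
    """Replace each mapped character; characters not in the mapping pass through."""
    return "".join(mapping.get(ch, ch) for ch in text)


mapping = load_mapping(PERSIAN_MAPPING_XML)
print(transliterate("کتاب", mapping))
```

Supporting another language such as Pashto would then mean supplying a second XML file in the same layout, with no change to the program. Note that a character-by-character mapping like this only emits what is written in the source script.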
Thanks for the language tool.  I will take a look.
Yan
-----Original Message-----
From: Riley, Charles [mailto:[log in to unmask]] 
Sent: Wednesday, April 17, 2013 5:31 PM
To: [log in to unmask]; Jacobs, Jane W; Code for Libraries ([log in to unmask])
Cc: Seyede Pouye Khoshkhoosani
Subject: [lita-l] RE: : Persian Romanization table
Hi Yan,
Sounds like a really interesting project.  Is the intent to support going from Persian to Pashto directly, as well as from each language to Roman script?
Among the natural language processing tools found here-- http://www.ling.ohio-state.edu/~jonsafari/persian_nlp.html
--the one that *might* be the most helpful is the link to the Persian Lexical Project, where the romanized orthography used is one that accounts for vowels inserted between the consonants.  It's not a large dataset, but it carries a GPLv2 license, so it may be useful for some testing, to see whether it's worth expanding on the effort.
Best,
Charles Riley
________________________________________
From: Han, Yan [[log in to unmask]]
Sent: Wednesday, April 17, 2013 8:14 PM
To: Jacobs, Jane W; Code for Libraries ([log in to unmask]); [log in to unmask]
Cc: Seyede Pouye Khoshkhoosani
Subject: [lita-l] RE: : Persian Romanization table
Hello, All and Jane
First, I would like to thank Jane Jacobs at Queens Library for providing me with the Urdu Romanization table.
As we are working on creating Persian/Pashto transliteration software, my Persian language expert has the following question:
"Following our conversation about transliterating Persian to Roman letters, I ran into a big problem: since the short vowels are not written on or under the letters in Persian, how can a machine read a Persian word? For example, we have the word "پدر"; to the machine this word is PDR, because it cannot read the vowels. There is no rule for the short vowels in the Persian language, so the machine cannot tell whether the first letter is pi, pa or po. Is there any way to overcome this obstacle?"
It seems to me that we are missing a critical piece of information here (something like a dictionary). Without it, there is no way to get a good transliteration from the computer. We will need a Persian speaker to check and correct the computer's transliteration.
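One way the dictionary idea could be sketched: look each word up in a lexicon of vetted romanizations first, and fall back to a letter-by-letter mapping that is flagged for a Persian speaker to review. The lexicon entries, the character table, and the function names below are all hypothetical and illustrative; a real lexicon would have to come from a vetted source.

```python
# Letter-by-letter table (illustrative subset, not the full romanization table).
CHAR_MAP = {"پ": "p", "د": "d", "ر": "r", "ک": "k", "ت": "t", "ا": "ā", "ب": "b"}

# Hypothetical lexicon of romanizations already checked by a Persian speaker.
LEXICON = {"پدر": "pidar", "کتاب": "kitāb"}


def romanize_word(word):
    """Return (romanized form, needs_review) for a single word."""
    if word in LEXICON:
        # Known word: the short vowels come from the lexicon entry.
        return LEXICON[word], False
    # Unknown word: emit only the written letters and flag it for review,
    # because the unwritten short vowels cannot be recovered here.
    return "".join(CHAR_MAP.get(ch, ch) for ch in word), True


def romanize(text):
    """Romanize whitespace-separated words, keeping the review flag per word."""
    return [romanize_word(w) for w in text.split()]


for roman, needs_review in romanize("پدر دار"):
    print(roman, "(needs review)" if needs_review else "")
```

With this design the human check is limited to words missing from the lexicon, and each corrected word can be added back to the lexicon so it is not flagged again.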
Any suggestions?
Thanks,
Yan
-----Original Message-----
From: Jacobs, Jane W [mailto:[log in to unmask]]
Sent: Wednesday, January 23, 2013 6:28 AM
To: Han, Yan
Subject: RE: : Persian Romanization table
Hi Yan,
As per my message to the listserv, here are the config files for Urdu.  If you do a Persian config file, I'd love to get it and if possible add it to the MARC::Detrans site.
Let me know if you want to follow this road.
JJ
-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Han, Yan
Sent: Tuesday, January 22, 2013 5:31 PM
To: [log in to unmask]
Subject: [CODE4LIB] : Persian Romanization table
Hello, All,
I have a project to deal with Persian materials. I have already used the Google Translate API to translate. Now I am looking for an API to transliterate/Romanize (NOT translate) Persian to English (not English to Persian). In other words, I have Persian in, and English out.
There is a Romanization table (Persian romanization table - Library of Congress <http://www.loc.gov/catdir/cpso/romanization/persian.pdf>).
For example, the input
کتاب  should output as  Kitāb
My finding is that existing tools only do the opposite:
1.      Google Transliterate: you enter English, and it outputs Persian (input "Bookmark", output "???????"; input "???????", output "???????")
2.      OCLC language: the same as Google Transliterate.
3.      http://mylanguages.org/persian_romanization.php  : works, but no API.
Does anyone know whether such an API exists?
Thanks much,
Yan
To maximize your use of LITA-L or to unsubscribe, see http://www.ala.org/lita/involve/email |