Stuart Yeates writes:

> Thomas Krichel wrote: ...
> > It will try to guess between UTF-8 and ISO-8859-1. This can be done
> > because UTF-8 has many invalid byte sequences. But say if you
> > wanted to guess between ISO-8859-1 and ISO-8859-2, you'd be out of
> > luck.
>
> Not necessarily.

I meant you would be out of luck with the tool I proposed.

> There are tools such as http://www.let.rug.nl/~vannoord/TextCat/
> which provide very reliable guessing of languages.

I am happy to read this; I have had requirements for language
detection several times already. But detecting languages is a
somewhat different problem from detecting character encodings.

Cheers,

Thomas Krichel
http://openlib.org/home/krichel
http://authorclaim.org/profile/pkr1
skype: thomaskrichel
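
For illustration only, the UTF-8 versus ISO-8859-1 guess discussed above can be sketched in Python. This is a minimal sketch, not the tool proposed in the thread; the function name guess_encoding is hypothetical. It relies on the fact that many byte sequences are invalid UTF-8, whereas every byte sequence decodes as ISO-8859-1.

```python
def guess_encoding(data: bytes) -> str:
    """Guess between UTF-8 and ISO-8859-1 (hypothetical helper)."""
    try:
        # Strict decoding raises on any invalid UTF-8 byte sequence.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Any byte sequence is valid ISO-8859-1, so it is the fallback.
        return "iso-8859-1"


if __name__ == "__main__":
    print(guess_encoding("café".encode("utf-8")))
    print(guess_encoding("café".encode("iso-8859-1")))
    # ISO-8859-2 input is indistinguishable from ISO-8859-1 with this
    # approach, which is the limitation described above.
    print(guess_encoding("ż".encode("iso-8859-2")))
```

The same trick cannot separate ISO-8859-1 from ISO-8859-2, because both assign a character to the high bytes and neither has invalid sequences to test for; telling them apart would need statistical or language-based guessing.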