[I realise there was a related 'Character-sets for dummies'[1]
discussion recently]

I am using the tictocs[2] list of journal RSS feeds, and I am getting
gibberish in place of diacritics in some entries. Below is an example:

in emacs:
 221	Acta Ortop  dica Brasileira	1413-7852	
in Firefox:
 221	Acta Ortop  dica Brasileira	1413-7852

Note that the emacs view is the same for both a copy saved from
Firefox and a direct download using 'wget'.

Is this something on my end, or are the tictocs people not serving
proper UTF-8? 
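
One rough way to narrow this down locally is to check whether the
downloaded bytes actually decode as UTF-8, line by line. The sketch
below is only an illustration: the function name and the sample data
are made up, and it assumes the file has already been read in binary
mode.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8."""
    bad = []
    # Splitting on b"\n" is safe: byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((number, line))
    return bad


# Hypothetical sample: an accented character written as a single
# Latin-1 byte, which is not a valid UTF-8 sequence on its own.
sample = "221\tcaf\u00e9\n222\tplain".encode("latin-1")
for number, raw in find_invalid_utf8_lines(sample):
    print(number, raw)
```

If lines are reported, the file is not valid UTF-8 regardless of what
the viewer shows; if none are reported, the problem is more likely on
the display side.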

The HTTP header from wget claims UTF-8:
> wget -S
> --2009-12-21 12:47:59--
> Resolving
> Connecting to||:80... connected.
> HTTP request sent, awaiting response... 
>   HTTP/1.1 200 OK
>   Date: Mon, 21 Dec 2009 17:42:05 GMT
>   Server: Apache/2.2.13 (Unix) mod_ssl/2.2.13 OpenSSL/0.9.8k PHP/5.3.0 DAV/2
>   X-Powered-By: PHP/5.3.0
>   Content-Type: text/plain; charset=utf-8
>   Connection: close
> Length: unspecified [text/plain]
><....stuff removed>
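
Since the header only states what the server claims, it can help to
compare that claim with the body itself. The following is a minimal
sketch, not a definitive check: both function names are hypothetical,
and it assumes the Content-Type value and the raw body bytes have
already been captured (for example from the wget output above).

```python
from email.message import Message


def declared_charset(content_type: str):
    """Extract the charset parameter from a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def body_matches_declared_charset(body: bytes, content_type: str) -> bool:
    """Return True if the body decodes cleanly with the declared charset."""
    charset = declared_charset(content_type)
    if charset is None:
        raise ValueError("no charset declared in Content-Type")
    try:
        body.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical bodies: the same text encoded two different ways.
header = "text/plain; charset=utf-8"
print(body_matches_declared_charset("caf\u00e9".encode("utf-8"), header))
print(body_matches_declared_charset("caf\u00e9".encode("latin-1"), header))
```

A False result would suggest the server is labelling non-UTF-8 data as
UTF-8. A True result does not prove the text is correct, only that the
bytes are well-formed in that encoding.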

Can someone confirm whether they are also seeing this issue?



Glen Newton
Researcher, Information Science, CISTI Research
& NRC W3C Advisory Committee Representative
tel/tél: 613-990-9163 | facsimile/télécopieur 613-952-8246
Canada Institute for Scientific and Technical Information (CISTI)
National Research Council Canada (NRC)| M-55, 1200 Montreal Road
Institut canadien de l'information scientifique et technique (ICIST) 
Conseil national de recherches Canada | M-55, 1200 chemin Montréal
Ottawa, Ontario K1A 0R6  
Government of Canada | Gouvernement du Canada