From the XML standard:
It is RECOMMENDED that character encodings registered (as charsets) with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those just listed, be referred to using their registered names; other encodings SHOULD use names starting with an "x-" prefix. XML processors SHOULD match character encoding names in a case-insensitive way and SHOULD either interpret an IANA-registered name as the encoding registered at IANA for that name or treat it as unknown (processors are, of course, not required to support all IANA-registered encodings).
As I suggested: since MARC8 isn't (so far as I know) registered with IANA, you won't get far with most standard tools, in whatever language. You'll have to extend them, first to recognize the encoding name, and second to decode the content.
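A rough Python sketch of that two-step extension, for illustration only. It assumes the XML declaration itself is ASCII-readable, and every name in it (declared_encoding, stdlib_accepts, parse_with_decoders, placeholder_marc8_decode) is hypothetical. The placeholder decoder is NOT a real MARC-8 decoder; a real one would have to handle MARC-8 escape sequences and combining characters, which this sketch does not attempt.

```python
import re
import xml.etree.ElementTree as ET

# Encoding pseudo-attribute of the XML declaration.
# Assumption: the prolog itself can be read as ASCII.
_DECL = re.compile(rb"^<\?xml[^>]*?\bencoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")
_DECL_TEXT = re.compile(r"^<\?xml[^>]*\?>")


def declared_encoding(raw):
    """Return the encoding name from the XML declaration, or None if absent."""
    m = _DECL.match(raw)
    return m.group(1).decode("ascii") if m else None


def stdlib_accepts(raw):
    """Report whether the stock parser can handle the document as declared."""
    try:
        ET.fromstring(raw)
        return True
    except (LookupError, ValueError, ET.ParseError):
        return False


def parse_with_decoders(raw, decoders):
    """Step 1: recognize the declared name. Step 2: decode the content ourselves.

    `decoders` maps a lowercase encoding name to a callable bytes -> str.
    Names are matched case-insensitively, as the XML standard recommends.
    """
    name = declared_encoding(raw)
    decoder = decoders.get(name.lower()) if name else None
    if decoder is None:
        return ET.fromstring(raw)  # leave known encodings to the parser
    text = decoder(raw)
    # Drop the declaration: the text is already decoded, so it no longer applies.
    text = _DECL_TEXT.sub("", text, count=1)
    return ET.fromstring(text)


def placeholder_marc8_decode(raw):
    # PLACEHOLDER, not MARC-8: only handles plain ASCII content.
    return raw.decode("ascii")


sample = b'<?xml version="1.0" encoding="MARC8"?><record>abc</record>'
print(stdlib_accepts(sample))
root = parse_with_decoders(sample, {"marc8": placeholder_marc8_decode})
print(root.tag, root.text)
```

The decoding is done before the parser sees the data, rather than by plugging a codec into the parser, because some parsers only accept simple single-byte encodings through their extension hooks.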
smm
-----Original Message-----
From: Jonathan Rochkind
Sent: Tuesday, April 17, 2012 4:19 PM
To: Code for Libraries
Cc: Sheila M. Morrissey
Subject: Re: [CODE4LIB] MarcXML and char encodings
On 4/17/2012 3:01 PM, Sheila M. Morrissey wrote:
> No -- it is perfectly legal -- but you MUST declare the encoding to BE Marc8 in the XML prolog,
Wait, how can you declare a Marc8 encoding in an XML
declaration/prolog/whatever it's called?
The things that appear there need to be from a specific list, and I
didn't think Marc8 was on that list?
Can you give me an example? And, if you happen to have it, a link to the
XML standard that says this is legal?