> If I want to have a MarcXML document encoded in Marc8 -- what should it
> look like? What should be in the XML declaration? What should be in the
> MARC header embedded in the XML? Or is it not in fact legal at all?
I'm going out on a limb here, but I don't think it is legal. There is
no encoding name that XML tools recognize as MARC-8, so there's no way
to tell them how to interpret the bytes.
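A quick way to see the problem from Java is to ask the runtime whether it knows a charset by that name. This is only a sketch: the class and method names are made up for illustration, and it assumes a stock JDK with no third-party charset provider on the classpath. XML parsers keep their own encoding tables, so treat this as a hint rather than proof.

```java
import java.nio.charset.Charset;

class Marc8CharsetCheck {

    // Returns true if the running JVM has a charset registered under this name.
    static boolean isKnownCharset(String name) {
        return Charset.isSupported(name);
    }

    public static void main(String[] args) {
        // "MARC-8" is a syntactically legal charset name, so this is a lookup, not a syntax check.
        System.out.println("UTF-8  known: " + isKnownCharset("UTF-8"));
        System.out.println("MARC-8 known: " + isKnownCharset("MARC-8"));
    }
}
```

The XML specification only obliges a parser to read UTF-8 and UTF-16; anything else depends on what the tool happens to support.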
> If I want to have a MarcXML document encoded in UTF8, what should it
> look like? What should be in the XML declaration? What should be in the
> MARC header embedded in the XML?
<?xml version="1.0" encoding="UTF-8"?>
I suppose you'll want to set leader position 09 to "a" (UCS/Unicode) as
well, but it doesn't really matter to any XML tools.
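For what it's worth, in MARC 21 the character coding scheme sits in leader position 09, where a blank means MARC-8 and "a" means UCS/Unicode. A minimal sketch of flipping that flag on a leader string follows; the class and method names are hypothetical, and the sample leader is invented for the example.

```java
class LeaderEncoding {

    // Returns a copy of the 24-character leader with position 09 set to 'a' (UCS/Unicode).
    static String withUnicodeFlag(String leader) {
        if (leader == null || leader.length() != 24) {
            throw new IllegalArgumentException("A MARC leader must be exactly 24 characters");
        }
        // Positions are zero-based, so index 9 is leader/09.
        return leader.substring(0, 9) + 'a' + leader.substring(10);
    }

    public static void main(String[] args) {
        String marc8Leader = "00000nam  2200000 a 4500"; // leader/09 is blank (MARC-8)
        System.out.println(withUnicodeFlag(marc8Leader));
    }
}
```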
> If I want to have a MarcXML document with a char encoding that is
> _neither_ Marc8 nor UTF8, but something else generally legal for XML --
> is this legal at all? And if so, what should it look like? What should
> be in the XML declaration? What should be in the MARC header embedded in
> the XML?
I'd claim this is legal, if it is legal XML. Set the encoding in the XML
declaration to anything your XML tools accept as valid.
As a Java programmer using Java XML tools, I treat the encoding as just a
hint to the tools. I end up with Unicode strings after the XML is read, so
I always ignore the encoding byte in the leader.
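Here is a rough sketch of what I mean, using the JDK's built-in DOM parser. The class, the helper names, and the sample record are all invented for the example. The same record is serialized once as ISO-8859-1 and once as UTF-8, with leader/09 left blank (MARC-8) in both cases, and the parser hands back the same Unicode string either way because it only looks at the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class MarcXmlEncodingDemo {

    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    // Builds a tiny sample record whose XML declaration names the given encoding.
    // Leader/09 is deliberately left blank (MARC-8) to show the parser never consults it.
    static String record(String encodingName) {
        return "<?xml version=\"1.0\" encoding=\"" + encodingName + "\"?>\n"
             + "<record xmlns=\"" + MARC_NS + "\">\n"
             + "  <leader>00000nam  2200000 a 4500</leader>\n"
             + "  <datafield tag=\"245\" ind1=\"0\" ind2=\"0\">\n"
             + "    <subfield code=\"a\">Caf\u00e9</subfield>\n"
             + "  </datafield>\n"
             + "</record>\n";
    }

    // Parses raw bytes and returns the text of the first MARCXML <subfield>.
    static String firstSubfield(byte[] xmlBytes) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // The parser reads the encoding from the XML declaration in the byte stream.
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getElementsByTagNameNS(MARC_NS, "subfield").item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample record", e);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = record("ISO-8859-1").getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = record("UTF-8").getBytes(StandardCharsets.UTF_8);
        // Different bytes on the wire, identical Java strings after parsing.
        System.out.println(firstSubfield(latin1).equals(firstSubfield(utf8)));
    }
}
```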
Following that logic: that byte is about encoding. It has meaning when
ISO 2709 is the transfer mechanism. But in this case XML is the transfer
mechanism, and its rules for identifying the encoding are what matter.
I'm proposing that the encoding byte in the leader is meaningless.
Ralph