On 4/17/2012 1:57 PM, Kyle Banerjee wrote:

> In some cases, invalid XML. In an ideal world, the encoding should be 
> included in the declaration. But I wouldn't trust it. kyle 

So would you use the MARC header payload instead?

Or are you just saying you wouldn't trust _any_ encoding declarations you 
find anywhere?

When writing a library to handle MARC, I think the baseline should be 
making it do the official, legal, standards-compliant right thing. Extra 
heuristics to deal with invalid data can be added on top.

But my trouble here is I can't even figure out what the official legal 
standards-compliant thing is.

Maybe that's because the MARCXML standard simply doesn't address it, and 
it's all implementation-dependent. Sigh.

The problem is how the XML document's own character encoding is supposed 
to interact with the MARC header; especially because there's no way to 
put MARC-8 in an XML encoding declaration (is there?); and whether 
encodings other than MARC-8 or UTF-8 are legal in MARCXML, even though 
they aren't in MARC ISO binary.
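
To make the conflict concrete, here is a rough Python sketch that pulls 
out both declarations from a MARCXML document and reports them side by 
side. It deliberately does not decide which one wins, since that is the 
open question. The function names and the sample record are made up for 
illustration. It assumes the standard MARCXML namespace, the MARC 21 
meaning of leader position 09 (blank for MARC-8, "a" for UCS/Unicode), 
and input bytes that start directly with the XML declaration (no BOM).

```python
import re
import xml.etree.ElementTree as ET

MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Leader position 09 values as defined for MARC 21.
LEADER_09 = {" ": "MARC-8", "a": "UCS/Unicode"}

# Hypothetical sample: the XML declaration says ISO-8859-1,
# while leader/09 is "a" (UCS/Unicode).
SAMPLE = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    b'<record xmlns="http://www.loc.gov/MARC21/slim">'
    b"<leader>00000nam a2200000 a 4500</leader>"
    b"</record>"
)


def xml_declared_encoding(raw):
    """Return the encoding named in the XML declaration, or None."""
    m = re.match(rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', raw)
    return m.group(1).decode("ascii") if m else None


def leader_declared_encoding(raw):
    """Return what leader/09 of the first record claims, or None."""
    root = ET.fromstring(raw)
    leader = root.find(f".//{MARC_NS}leader")
    if leader is None or leader.text is None or len(leader.text) < 10:
        return None
    return LEADER_09.get(leader.text[9], "unknown")


def report_encodings(raw):
    """Collect both declarations without deciding which one is right."""
    return {
        "xml": xml_declared_encoding(raw),
        "leader": leader_declared_encoding(raw),
    }
```

Calling report_encodings(SAMPLE) gives a library something to log or 
warn about when the two disagree; what to do next is exactly the part 
the standard doesn't seem to spell out.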

I think the answer might be "nobody knows, and there is no standard 
right way to do it." Which is unfortunate.