On 4/17/2012 1:57 PM, Kyle Banerjee wrote:
> In some cases, invalid XML. In an ideal world, the encoding should be
> included in the declaration. But I wouldn't trust it.
>
> kyle

So would you use the MARC header payload instead? Or are you just saying you wouldn't trust _any_ encoding declarations you find anywhere?

When writing a library to handle MARC, I think the baseline should be making it do the official, legal, standards-compliant right thing. Extra heuristics to deal with invalid data can be added on top.

But my trouble here is that I can't even figure out what the official, legal, standards-compliant thing is. Maybe that's because the MarcXML standard simply doesn't address it, and it's all implementation-dependent. Sigh.

The problem is twofold: how the XML document's own character encoding is supposed to interact with the MARC header, especially because there's no way to name Marc8 in an XML encoding declaration (is there?); and whether encodings other than Marc8 or UTF8 are legal in MarcXML, even though they aren't in MARC ISO binary.

I think the answer might be "nobody knows, and there is no standard right way to do it." Which is unfortunate.
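
To make the conflict concrete, here is a rough Python sketch that just pulls out the two competing declarations from a MarcXML document: the encoding named in the XML declaration, and position 9 of the MARC leader (where "a" means UCS/Unicode and blank means MARC-8). The function name `declared_encodings` is made up for illustration, and the sketch assumes the document uses the standard MARC21 slim namespace and starts directly with its XML declaration. It does not decide which declaration wins, since that is exactly the open question.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by the MARC21 "slim" MarcXML schema.
MARCXML_NS = "{http://www.loc.gov/MARC21/slim}"


def declared_encodings(raw: bytes):
    """Return (xml_declared_encoding, leader_position_9) for the first record.

    Hypothetical helper: it only reports what each layer claims and makes
    no attempt to reconcile the two.
    """
    # Encoding named in the XML declaration, if present. None means the
    # document did not declare one (XML then defaults to UTF-8/UTF-16).
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    xml_encoding = match.group(1).decode("ascii") if match else None

    # First <leader> element anywhere below the root, whether the root is
    # a <collection> or a single <record>.
    root = ET.fromstring(raw)
    leader = root.find(f".//{MARCXML_NS}leader")

    # Leader position 9 is the character coding scheme:
    # "a" = UCS/Unicode, blank = MARC-8.
    leader_flag = None
    if leader is not None and leader.text and len(leader.text) >= 10:
        leader_flag = leader.text[9]

    return xml_encoding, leader_flag
```

A document whose XML declaration says UTF-8 while its leader has a blank in position 9 is the kind of case where I can't tell what the "correct" behavior is supposed to be; a library could at least detect the disagreement this way and then apply whatever policy it picks.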