> So would you use the Marc header payload instead?
> Or you're just saying you wouldn't trust _any_ encoding declarations you
> find anywhere?
The short version is that too many vendors and systems just supply some
value without making sure that's what they're spitting out. I haven't had
to mess with this stuff for a few years, so I'm hoping Terry Reese weighs
in on this conversation -- he has a lot of experience dealing with encoding
headaches. However, the bottom line is that the most reliable method is to
use heuristics to detect what's going on. Yeah, that totally kills the
point of listing encodings in the first place, but just as is the case with any
unreliably used data point, it's all GIGO.
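To make "use heuristics" a bit more concrete, here is a minimal Python sketch of the kind of sanity check I mean. It relies only on the fact that leader position 09 is "a" for UCS/Unicode and blank for MARC-8. The function names are made up for illustration and do not come from any existing MARC library, and the check is deliberately crude: it tests whether the bytes are valid UTF-8, not whether they are valid MARC-8.

```python
def guess_encoding(raw: bytes) -> str:
    """Crude guess: 'ascii', 'utf-8', or 'unknown-8bit' (hypothetical labels)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: probably MARC-8 or some other legacy encoding.
        return "unknown-8bit"
    return "ascii" if text.isascii() else "utf-8"


def declared_encoding(raw: bytes) -> str:
    """Read leader position 09: 'a' means UCS/Unicode, blank means MARC-8."""
    return "utf-8" if raw[9:10] == b"a" else "marc-8"


def leader_is_trustworthy(raw: bytes) -> bool:
    """True when the leader's claim is at least consistent with the bytes."""
    guess = guess_encoding(raw)
    if guess == "ascii":
        # 7-bit data decodes cleanly under either declaration.
        # Assumption: MARC-8 escape sequences are not checked for here.
        return True
    return (guess == "utf-8") == (declared_encoding(raw) == "utf-8")
```

A record that says "a" in the leader but fails to decode as UTF-8, or one that says MARC-8 but contains well-formed multi-byte UTF-8, is a candidate for ignoring the declaration. Real-world detection would need more than this, for example looking for MARC-8 escape sequences or checking the result against expected scripts.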
> When writing a library to handle marc, I think the base line should be
> making it do the official legal standards-compliant right thing. Extra
> heuristics to deal with invalid data can be added on top.
I'm hoping things have improved, but if heuristics are more reliable than
reading the right areas of the record, you have to ignore what's there
(which makes even reading it pointless). I do think there is value in
encouraging vendors to actually pay attention to this stuff as such basic
screwups undermine both the credibility of the data source and the
service that depends on the data.
> But my trouble here is I can't even figure out what the official legal
> standards-compliant thing is.
> Maybe that's because the MarcXML standard simply doesn't address it, and
> it's all implementation dependent. sigh.
> The problem is how the XML document's own char encoding is supposed to
> interact with the MARC header; especially because there's no way to put
> Marc8 in an XML char encoding doctype (is there?); and whether encodings
> other than Marc8 or UTF8 are legal in MarcXML, even though they aren't in
> MARC ISO binary.
> I think the answer might be "nobody knows, and there is no standard right
> way to do it." Which is unfortunate.
A good summary of the situation as I understand it.
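For what it's worth, the two competing declarations are easy to pull out and compare, even if nothing says which one wins. The sketch below is only an illustration under my own assumptions: the function names are hypothetical, the regular expression only handles an XML declaration at the very start of an ASCII-compatible document, and a missing declaration is treated as UTF-8. The namespace is the usual MARCXML "slim" one.

```python
import re
import xml.etree.ElementTree as ET

MARC_NS = "{http://www.loc.gov/MARC21/slim}"


def xml_declared_encoding(doc: bytes) -> str:
    """Encoding named in the XML declaration, lowercased ('utf-8' if absent)."""
    m = re.match(rb"<\?xml[^>]*encoding=[\"']([A-Za-z0-9._-]+)[\"']", doc)
    return m.group(1).decode("ascii").lower() if m else "utf-8"


def leader_codes(doc: bytes):
    """Leader position 09 of every record: 'a' = UCS/Unicode, ' ' = MARC-8."""
    root = ET.fromstring(doc)
    return [(el.text or "")[9:10] for el in root.iter(MARC_NS + "leader")]


def conflicts(doc: bytes) -> bool:
    """True if the XML says UTF-8 but any leader does not say 'a'."""
    if xml_declared_encoding(doc) != "utf-8":
        return False  # other combinations are left undefined in this sketch
    return any(code != "a" for code in leader_codes(doc))
```

All this does is flag the disagreement. It does not decide which declaration to believe, which is exactly the part nobody seems to have specified.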
Digital Services Program Manager
Orbis Cascade Alliance
503.999.9787