...... Maybe we have different understandings of "valid".
>
> If leader bytes 20-23 are not "4500", I suggest that is _by definition_ not
> a "valid" Marc21 file. It violates the Marc21 specification.
>
> Now, they may still be _usable_, by software that ignores these bytes
> anyway or works around them. We definitely have a lot of software that does
> that.
>
> Which can end up causing problems that remind me of very analogous problems
> caused by the early days of web browsers that felt like being 'tolerant' of
> bad data. "My html works in every web browser BUT this one, why not? Oh,
> because that's the only one that actually followed the standard, oops."
>
There is some question as to what value there is in validating fields that
have no meaning by definition. What benefit does validating an undefined
value have, other than creating an opportunity to break things and slowing
the process down just a little? The entire concept of an invalid entry in an
undefined field (e.g., byte 23) is oxymoronic.

I'd go so far as to question the value of validating redundant data that
theoretically has meaning but is never supposed to vary. The 4 and
the 5 simply repeat what is already known about the structure of the MARC
record. Choking on stuff like this is like having a web browser ask you what
to do with a page because it lacks a document type declaration.

Garbage data is the reality, so having parsers stop when they encounter data
they don't actually need complicates things unnecessarily. That kind of
stuff should generate a warning at worst.
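
To make the "warning at worst" idea concrete, here is a rough Python sketch. It is not taken from any existing MARC library: the function name `check_entry_map` and the `strict` flag are made up for illustration, and it assumes the leader has already been decoded to a 24-character string.

```python
import warnings

# Leader bytes 20-23 as fixed by the MARC 21 specification.
EXPECTED_ENTRY_MAP = "4500"


def check_entry_map(leader: str, strict: bool = False) -> bool:
    """Return True if leader bytes 20-23 are "4500" (hypothetical helper).

    In lenient mode (the default) a mismatch only emits a warning, so the
    caller can carry on parsing. In strict mode it raises ValueError.
    """
    # A leader of the wrong length cannot be sliced reliably, so this is
    # treated as a hard error in both modes (an assumption of this sketch).
    if len(leader) != 24:
        raise ValueError(f"MARC leader must be 24 characters, got {len(leader)}")

    entry_map = leader[20:24]
    if entry_map == EXPECTED_ENTRY_MAP:
        return True

    message = f"leader bytes 20-23 are {entry_map!r}, expected {EXPECTED_ENTRY_MAP!r}"
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return False
```

A validator that cares about the spec could call it with `strict=True`; a parser that just wants the data could use the default and log the warning.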
kyle