> Actually, I believe I am suffering from a number of different types of
> errors in my MARC data: 1) encoding issues (MARC8 versus UTF-8), 2)
> syntactical errors (lack of periods, invalid choices of indicators, etc.),
> 3) incorrect data types (strings entered into fields denoted for integers,
> etc.) Just about the only thing I haven't encountered are structural errors
> such as invalid leader, and this doesn't even take into account possible
> data entry errors (author is Franklin when Twain was entered).

This MARC stuff is more confusing than it needs to be.

As far as the original question about the difference between USMARC and MARC21, there is none for all practical purposes. In the mid-1990s, the USMARC and CANMARC communities tried to eliminate differences between them to improve standardization. The outcome was called MARC21.

Structurally, it's all the same stuff. The differences they're talking about resolving between CANMARC and USMARC concern which MARC tags correspond to which data fields, not substantive differences in structure. The MARC format itself is just a container, and it does not require that the field tags be numeric -- that the title goes in 245 is simply a cataloging practice. Although catalogers always use numbers, the structure of the MARC format allows other characters to be used.

> Despite all of the library communities voiced obsession with doing things
> 'by the book' according to standards, anyone that's actually tried to work
> with an actually existing large corpus of MARC data.... finds that it is all
> over the place, and very non-compliant in many ways.

This sums up the problem nicely. For all their carping about detail, accuracy, and the like, catalogers are not consistent once you get beyond a few basic metadata fields. This is because catalogers like to believe they can exert far more bibliographic control than is realistically possible.
As a result, they have developed hopelessly complex procedures that would cause any Byzantine ruler to break down in tears. Have you ever seen the books catalogers use to do their jobs? There's not just AACR2, but also the Library of Congress Rule Interpretations, the Subject Cataloging Manual, LCCS, Cutter Tables, code lists for various fields, the CONSER manual, Romanization tables, Bib formats and standards, plus a zillion specialized resources. BTW, there is nothing unusual about using all the resources mentioned above to catalog a single piece.

If you mention inconsistency to a cataloger, you'll trigger a monologue on quality control and who isn't doing what properly. However, you know the system is poorly designed when people who've been cataloging for more than 10 years can't get it right.

In any case, the consistency is so bad that you're better off running heuristic procedures on data strings than trusting special purpose fields. Even fields as basic as encoding level, which all catalogers know, are not trustworthy enough to rely on.

Catalogers. Can't live with 'em. Can't shoot 'em....

kyle (ex-cataloger who created literally thousands of original records in OCLC during a former lifetime)

--
----------------------------------------------------------
Kyle Banerjee
Digital Services Program Manager
Orbis Cascade Alliance
541.359.9599
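
P.S. In case it helps, here is a rough sketch of two points from the message above: that the MARC structure is just a container (tags are three characters in a directory, not necessarily digits), and that looking at the data strings can tell you more than the special purpose fields do. It is plain Python with no MARC library. The function names (build_record, parse_fields, check_encoding) are made up for illustration, the sample record is invented test data, and the "does it decode as UTF-8" check is only a heuristic under those assumptions, not a real MARC-8 decoder.

```python
FIELD_END = b"\x1e"   # terminates the directory and every field
RECORD_END = b"\x1d"  # terminates the record


def build_record(fields, coding=b" "):
    """Assemble a minimal record from (tag, data) pairs. Sample-data helper only."""
    directory, body = b"", b""
    for tag, data in fields:
        data += FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += tag.encode("ascii") + b"%04d%05d" % (len(data), len(body))
        body += data
    directory += FIELD_END
    base = 24 + len(directory)
    total = base + len(body) + len(RECORD_END)
    # Leader/09 is the declared character coding: b"a" = UTF-8, blank = MARC-8.
    leader = b"%05dnam %b22%05d a 4500" % (total, coding, base)
    return leader + directory + body + RECORD_END


def parse_fields(record):
    """Return (leader, [(tag, data), ...]) using only the leader and directory."""
    leader = record[:24]
    base = int(leader[12:17])            # base address of data
    directory = record[24:base - 1]      # drop the directory's terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3].decode("ascii")  # any three characters, not just digits
        length = int(entry[3:7])
        start = int(entry[7:12])
        data = record[base + start:base + start + length - 1]  # strip terminator
        fields.append((tag, data))
    return leader, fields


def check_encoding(record):
    """Return (declared, observed) character coding for one record."""
    leader, fields = parse_fields(record)
    declared = "utf-8" if leader[9:10] == b"a" else "marc-8"
    observed = "ascii"                   # no high bytes: no evidence either way
    for _, data in fields:
        if all(byte < 0x80 for byte in data):
            continue
        try:
            data.decode("utf-8")
            observed = "utf-8"
        except UnicodeDecodeError:
            # Could be MARC-8, Latin-1, or plain garbage.
            return declared, "not utf-8"
    return declared, observed


if __name__ == "__main__":
    # Leader claims MARC-8 (blank), but the 245 actually holds UTF-8 bytes.
    sample = build_record([
        ("001", b"12345"),
        ("245", b"10\x1faJane Eyre /\x1fcCharlotte Bront\xc3\xab."),
        ("ABC", b"  \x1falocal note"),   # non-numeric tag, structurally fine
    ])
    print([tag for tag, _ in parse_fields(sample)[1]])
    print(check_encoding(sample))
```

When declared and observed disagree, I'd go with the observed value and flag the record for a closer look rather than trust the leader.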