There's also the fact that validation must be built into individual 
systems, since there is no schema or DTD against which everyone can
validate the MARC record. Thus, every system validates somewhat 
differently (and some not at all). Although MARC is a machine-readable 
record, we don't have a machine-actionable standard for our data. This, 
I must admit, just boggles the mind in this day and age.
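
To make the point concrete: below is the kind of ad hoc structural
check every system ends up hand-rolling for itself. This is a rough
Python sketch, and the particular checks are illustrative, not any
official profile:

    LEADER_LEN = 24
    RECORD_TERMINATOR = b'\x1d'

    def check_record(raw):
        """Sanity-check one raw ISO 2709/MARC record (bytes);
        return a list of problems found."""
        problems = []
        if len(raw) < LEADER_LEN:
            return ['record is shorter than the 24-byte leader']
        # Leader/00-04 holds the record length as five ASCII digits.
        declared = raw[0:5]
        if not declared.isdigit():
            problems.append('leader/00-04 not numeric: %r' % declared)
        elif int(declared) != len(raw):
            problems.append('declared length %s != actual %d'
                            % (declared.decode('ascii'), len(raw)))
        # Leader/10-11 (indicator count, subfield code count) are
        # fixed at '22' in MARC 21.
        if raw[10:12] != b'22':
            problems.append('leader/10-11 not "22": %r' % raw[10:12])
        if not raw.endswith(RECORD_TERMINATOR):
            problems.append('missing record terminator (0x1D)')
        return problems

Multiply that by every vendor and every shop, each with its own list
of checks, and it's no surprise that no two systems agree on what a
"valid" record is.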

kc

Kyle Banerjee wrote:
>> Actually, I believe I am suffering from a number of different types of
>> errors in my MARC data: 1) encoding issues (MARC-8 versus UTF-8), 2)
>> syntactical errors (lack of periods, invalid choices of indicators, etc.),
>> 3) incorrect data types (strings entered into fields denoted for integers,
>> etc.). Just about the only thing I haven't encountered is structural errors
>> such as an invalid leader, and this doesn't even take into account possible
>> data entry errors (the author is Franklin when Twain was entered).
>>     
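
The encoding problem in (1) is the one case where the record at least
tries to describe itself: Leader/09 is blank for MARC-8 and 'a' for
Unicode in MARC 21. Because that flag is so often wrong, the usual
move is to cross-check it against the actual bytes. A quick Python
sketch, and only a heuristic (pure-ASCII MARC-8 will pass the UTF-8
test too):

    def claimed_encoding(raw):
        # Leader/09: b' ' = MARC-8, b'a' = UCS/Unicode (MARC 21).
        flag = raw[9:10]
        return {b' ': 'MARC-8', b'a': 'UTF-8'}.get(flag, 'unknown')

    def decodes_as_utf8(raw):
        # Does the byte stream actually decode as UTF-8?
        try:
            raw.decode('utf-8')
            return True
        except UnicodeDecodeError:
            return False

A record that claims 'a' but fails the decode test (or the reverse)
is exactly the MARC-8-versus-UTF-8 mess described above.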
>
> This MARC stuff is more confusing than it needs to be. As far as the
> original question about the difference between USMARC and MARC21,
> there is none for all practical purposes. In the mid-1990s, the USMARC
> and CANMARC communities tried to eliminate differences between them to
> improve standardization. The outcome was called MARC21.
>
> Structurally, it's all the same stuff. The differences they're talking
> about resolving between CANMARC and USMARC concern which MARC tags
> correspond to which data fields rather than substantive differences
> in structure.
>
> The MARC format itself is just a container, and it does not require
> that the fields be numeric -- that the title goes in 245 is simply a
> cataloging practice. Although catalogers always use numbers, the
> structure of the MARC format allows other characters to be used.
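
You can see that container-ness directly in the directory: each entry
is just a three-byte tag plus a length and an offset, and nothing in
the structure insists the tag be digits. A short Python sketch of
pulling the directory apart, plain ISO 2709 with no library:

    def directory_entries(raw):
        # Leader/12-16 is the base address where the data fields start;
        # the directory runs from byte 24 to the terminator at base - 1.
        base = int(raw[12:17])
        directory = raw[24:base - 1]
        for i in range(0, len(directory), 12):
            entry = directory[i:i + 12]
            tag = entry[0:3].decode('ascii')  # '245' is convention, not structure
            length, start = int(entry[3:7]), int(entry[7:12])
            yield tag, length, start

Nothing in that code knows or cares that 245 means title.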
>
>   
>> Despite all of the library community's voiced obsession with doing things
>> 'by the book' according to standards, anyone who has actually tried to work
>> with an actual, existing large corpus of MARC data finds that it is all
>> over the place, and very non-compliant in many ways.
>>     
>
> This sums up the problem nicely. For all their carping about detail,
> accuracy, and the like, catalogers are not consistent once you get
> beyond a few basic metadata fields.
>
> This is because catalogers like to believe they can exert far more
> bibliographic control than is realistically possible. As a result,
> they have developed hopelessly complex procedures that would cause any
> Byzantine ruler to break down in tears.
>
> Have you ever seen the books catalogers use to do their jobs? There's
> not just AACR2, but also the Library of Congress Rule Interpretations,
> the Subject Cataloging Manual, LCCS, Cutter Tables, code lists for
> various fields, the CONSER manual, Romanization tables, OCLC's
> Bibliographic Formats and Standards, and a zillion other specialized
> resources. BTW, there
> is nothing unusual about using all the resources mentioned above to
> catalog a single piece.
>
> If you mention inconsistency to a cataloger, you'll trigger a
> monologue on quality control and who isn't doing what properly.
> However, you know the system is poorly designed when people who've
> been cataloging for more than 10 years can't get it right. In any
> case, the consistency is so bad that you're better off running
> heuristic procedures on data strings than trusting special-purpose
> fields. Even a field as basic as the encoding level, one that every
> cataloger knows, is not trustworthy enough to rely on.
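
In practice, "heuristic procedures on data strings" means things like
this: sniff the human-entered text first and fall back to the coded
value only when the string yields nothing. A hypothetical Python
sketch; the field choice and pattern are just for illustration:

    import re

    def pub_year(field_260c, field_008):
        # Prefer a plausible year found in 260 $c over 008/07-10,
        # because the transcribed string is so often more reliable
        # than the fixed field.
        m = re.search(r'\b(1[5-9]\d\d|20\d\d)\b', field_260c or '')
        if m:
            return int(m.group(1))
        coded = (field_008 or '')[7:11]
        return int(coded) if coded.isdigit() else None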
>
> Catalogers. Can't live with 'em. Can't shoot 'em....
>
> kyle (ex-cataloger who created literally thousands of original records
> in OCLC during a former lifetime)
>   


-- 
-----------------------------------
Karen Coyle / Digital Library Consultant
[log in to unmask] http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------