Although I agree with Roy's suggestion that librarians not gloat about our metadata, the notion that the value of a data element can be elicited from the frequency of its use in the overall domain of library materials is misleading and contrary to the report Roy cites.
The very useful and informative OCLC report that Roy cites is very good on this point. Section 2, "MARC Tag Usage in WorldCat" by Karen Smith-Yoshimura, clearly lays out the data in the context of WorldCat and the cataloging practice of the OCLC members.
Library holdings are dominated by texts, and in terms of titles cataloged, texts are dominated by books. This preponderance of books tilts the usage ratios for individual data elements. Many data elements pertain to a specific form of material (manuscripts, for instance). Others pertain to specific content (musical notation, for instance). Some pertain to both (manuscript scores, for instance). Within the total aggregate of library materials, data elements that are specific to a material or content type do not rise in usage to anything near 20% of all titles. Yet these elements are necessary or valuable to those wishing to discover and use the materials, and when one recalls that a 1% use rate in WorldCat equals about 1,000,000 titles, the usefulness of many MARC data elements can be seen as widespread.
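To make the point about aggregate versus per-material ratios concrete, here is a rough sketch in Python. The toy catalog and its counts are invented purely for illustration and are not taken from the OCLC report; the only real-world detail is that tag 254 (Musical presentation statement) is a field that applies to music materials. It compares how often a tag appears across all records with how often it appears within the one kind of material it pertains to.

```python
from collections import Counter

# Hypothetical toy catalog: each record is (material_type, set of MARC tags present).
# The counts are invented for illustration; they are NOT figures from the OCLC report.
records = (
    [("book", {"245", "260", "300"})] * 97
    + [("score", {"245", "260", "300", "254"})] * 2
    + [("score", {"245", "260", "300"})] * 1
)

def tag_usage(recs):
    """Return {tag: share of the given records that contain the tag}."""
    counts = Counter(tag for _, tags in recs for tag in tags)
    return {tag: n / len(recs) for tag, n in counts.items()}

overall = tag_usage(records)
scores_only = tag_usage([r for r in records if r[0] == "score"])

print(f"254 across all records: {overall['254']:.0%}")
print(f"254 within scores:      {scores_only['254']:.0%}")
```

In this made-up catalog the tag falls far below a 20% threshold overall, yet it appears in most of the records for the material it actually describes.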
According to the report, 69 MARC tags occur in more than 1% of the records in WorldCat. That is quite a few more than Roy's 11, but even accounting for Karen's data elements being equivalent to the number of MARC sub-fields, this is far fewer than the 1,000 data elements available to a cataloger in MARC.
Matthew Beacom
By the way, the descriptive fields used in more than 20% of the MARC records in WorldCat are:
245 Title statement 100%
260 Imprint statement 96%
300 Physical description 91%
100 Main entry - personal name 61%
650 Subject added entry - topical term 46%
500 General note 44%
700 Added entry - personal name 28%
They answer, more or less, a few basic questions a user might have about the material:
What is it called? Who made it? When was it made? How big is it? What is it about? Answers to the question "How can I get it?" are usually given in the associated MARC holdings record.
-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Roy Tennant
Sent: Monday, May 03, 2010 2:15 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] MODS and DCTERMS
I would even argue with the statement "very detailed, well over 1,000
different data elements, some well-coded data (not all)". There are only 11
(yes, eleven) MARC fields that appear in 20% or more of MARC records
currently in WorldCat[1], and at least three of those elements are control
numbers or other elements that contribute nothing to actual description. I
would say overall that we would do well to not gloat about our metadata
until we've reviewed the facts on the ground. Luckily, now we can.
Roy
[1] http://www.oclc.org/research/publications/library/2010/2010-06.pdf
On Mon, May 3, 2010 at 11:03 AM, Eric Lease Morgan <[log in to unmask]> wrote:
> On May 3, 2010, at 1:55 PM, Karen Coyle wrote:
>
> > 1. MARC the data format -- too rigid, needs to go away
> > 2. MARC21 bib data -- very detailed, well over 1,000 different data
> > elements, some well-coded data (not all); unfortunately trapped in #1
>
>
>
> The differences between the two points enumerated above, IMHO, seem to be
> at the heart of the never-ending debate between computer types and
> cataloger types when it comes to library metadata. The non-library computer
> types don't appreciate the value of human-aided systematic description. And
> the cataloger types don't understand why MARC is a really terrible bit
> bucket, especially considering the current environment. All too often the
> two "camps" don't know to what the other is speaking. "MARC must die. Long
> live MARC."
>
> --
> Eric Lease Morgan
>