Thanks, Matthew, for a much more nuanced and accurate depiction of the data.
I would encourage anyone interested in this topic to spend some time with
this report, which was one result of a great deal of work by many people in
research institutions around the world. The findings and recommendations are
well worth your time.
Roy
On Mon, May 3, 2010 at 11:55 AM, Beacom, Matthew <[log in to unmask]> wrote:
> Although I agree with Roy's suggestion that librarians not gloat about our
> metadata, the notion that the value of a data element can be elicited from
> the frequency of its use in the overall domain of library materials is
> misleading and contrary to the report Roy cites.
>
> The sub-section of the very useful and informative OCLC report that Roy
> cites is very good on this point. Section 2. MARC Tag Usage in WorldCat by
> Karen Smith-Yoshimura clearly lays out the data in the context of WorldCat
> and the cataloging practice of the OCLC members.
>
> Library holdings are dominated by texts, and in terms of titles cataloged,
> texts are dominated by books. This preponderance of books tilts the ratios
> of use for individual data elements. Many data elements pertain to a
> specific form of material, manuscripts, for instance. Others pertain to
> specific content, musical notation, for instance. Some pertain to both,
> manuscript scores, for instance. Within the total aggregate of library
> materials, data elements that are specific to a material or content type do
> not rise in usage rates to anything near 20% of the aggregate total of titles.
> Yet these elements are necessary or valuable to those wishing to discover
> and use the materials, and when one recalls that a 1% use rate in WorldCat
> equals about 1,000,000 titles, the usefulness of many MARC data elements can
> be seen as widespread.
>
> According to the report, 69 MARC tags occur in more than 1% of the records
> in WorldCat. That is quite a few more than Roy's 11, but even
> accounting for Karen's data elements being equivalent to the number of MARC
> sub-fields, this is far fewer than the 1,000 data elements available to a
> cataloger in MARC.
>
> Matthew Beacom
>
>
> By the way, the descriptive fields used in more than 20% of the MARC
> records in WorldCat are:
>
> 245  Title statement                      100%
> 260  Imprint statement                     96%
> 300  Physical description                  91%
> 100  Main entry - personal name            61%
> 650  Subject added entry - topical term    46%
> 500  General note                          44%
> 700  Added entry - personal name           28%
>
> They answer, more or less, a few basic questions a user might have about
> the material:
> What is it called? Who made it? When was it made? How big is it? What is it
> about? Answers to the question, How can I get it? are usually given in the
> associated MARC holdings record.
>
>
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Roy Tennant
> Sent: Monday, May 03, 2010 2:15 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] MODS and DCTERMS
>
> I would even argue with the statement "very detailed, well over 1,000
> different data elements, some well-coded data (not all)". There are only 11
> (yes, eleven) MARC fields that appear in 20% or more of MARC records
> currently in WorldCat[1], and at least three of those elements are control
> numbers or other elements that contribute nothing to actual description. I
> would say overall that we would do well to not gloat about our metadata
> until we've reviewed the facts on the ground. Luckily, now we can.
> Roy
>
> [1] http://www.oclc.org/research/publications/library/2010/2010-06.pdf
>
> On Mon, May 3, 2010 at 11:03 AM, Eric Lease Morgan <[log in to unmask]> wrote:
>
> > On May 3, 2010, at 1:55 PM, Karen Coyle wrote:
> >
> > > 1. MARC the data format -- too rigid, needs to go away
> > > 2. MARC21 bib data -- very detailed, well over 1,000 different data
> > > elements, some well-coded data (not all); unfortunately trapped in #1
> >
> >
> >
> > The differences between the two points enumerated above, IMHO, seem to be
> > at the heart of the never-ending debate between computer types and
> > cataloger types when it comes to library metadata. The non-library
> > computer types don't appreciate the value of human-aided systematic
> > description. And the cataloger types don't understand why MARC is a really
> > terrible bit bucket, especially considering the current environment. All
> > too often the two "camps" don't know what the other is speaking about.
> > "MARC must die. Long live MARC."
> >
> > --
> > Eric Lease Morgan
> >
>