Quoting "Beacom, Matthew" <[log in to unmask]>:
>
> According to the report, 69 MARC tags occur in more than 1% of the
> records in WorldCat. That is quite a few more than Roy's 11,
> but even accounting for Karen's data elements being equivalent to
> the number of MARC sub-fields this is far fewer than the 1,000 data
> elements available to a cataloger in MARC.
So much depends on how you count things, so at the
http://kcoyle.net/rda/ site I have put two MARC-related files. The
first is just a list of "elements" (variable subfields) in alpha order
with duplicates removed. Yes, I realize how imperfect this is, and
that we will need to look beyond names to *meaning* of elements to
determine what we really have. This file does not include indicators,
and sometimes indicators really do create a separate element, like
when person name becomes "Family" based on its indicator.
That file has over 560 entries.
The next file probably needs some more thought, but it is a list of
the variable field indicators and subfields, leaving in subfields that
are duplicated in different fields. I removed some of the numeric
subfields that didn't seem to result in actual elements (2, 3, 5,
6, 8), but I could be wrong about that. I also did not include
indicators that are = "Undefined". We can debate whether a personal
name in an added entry is the same element as a personal name in a
subject heading, and similarly for the various places where geographic
names are used, titles, etc. This is the analysis that is
needed to reduce MARC21 to a cleaner set of data elements.
That file has 1421 entries.
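
To make the difference between the two counts concrete, here is a minimal
Python sketch of both counting rules. It is an illustration only: the
column names (tag, kind, code, name) and the sample rows are assumptions
made up for this example, not the layout or content of the actual files,
and it does not reproduce the 560 or 1421 figures.

```python
import csv
import io

# Hypothetical sample rows for illustration; the real CSV layout may differ.
SAMPLE = """tag,kind,code,name
100,indicator1,0,Forename
100,indicator1,3,Family name
100,subfield,a,Personal name
100,subfield,6,Linkage
700,indicator2,#,Undefined
700,subfield,a,Personal name
650,subfield,a,Topical term
650,subfield,2,Source of heading or term
"""

# Numeric subfields left out of the second list, as described above.
SKIPPED_SUBFIELDS = {"2", "3", "5", "6", "8"}


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def unique_subfield_names(rows):
    """First count: subfield names only, duplicates removed, alpha order."""
    return sorted({r["name"] for r in rows if r["kind"] == "subfield"})


def field_level_entries(rows):
    """Second count: indicators and subfields, kept separately per field."""
    kept = []
    for r in rows:
        # Drop the numeric subfields that do not seem to be real elements.
        if r["kind"] == "subfield" and r["code"] in SKIPPED_SUBFIELDS:
            continue
        # Drop indicators whose value is "Undefined".
        if r["kind"].startswith("indicator") and r["name"] == "Undefined":
            continue
        kept.append((r["tag"], r["kind"], r["code"], r["name"]))
    return kept


rows = load_rows(SAMPLE)
names = unique_subfield_names(rows)
entries = field_level_entries(rows)
print(len(names), "unique subfield names")
print(len(entries), "field-level entries")
```

In the sample, "Personal name" counts once under the first rule but twice
under the second (once in 100, once in 700), which is exactly the
added-entry versus main-entry question raised above.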
Neither of these contains any of the fixed field elements (many of
which, IMO, should replace textual elements now carried in MARC21).
When I looked at the fixed fields (and this is reported at
http://futurelib.pbworks.com/Data+and+Studies), I came up with this
count of *unique* fixed field elements (each with multiple values):
008 - 58
007 - 55
Each one of these should become a controlled value list in a SemWeb
implementation of MARC. RDA appears to have a total of 68 defined
value lists, but I don't believe that those include ones defined
elsewhere, such as languages, country codes, etc.
kc
p.s. linked from that same page is the file I am using for this
analysis, in CSV format, if anyone else wants to play with it. I have
tried to keep it up to date with MARBI proposals.
>
> Matthew Beacom
>
>
> By the way, the descriptive fields used in more than 20% of the MARC
> records in WorldCat are:
>
> 245 Title statement 100%
> 260 Imprint statement 96%
> 300 Physical description 91%
> 100 Main entry - personal name 61%
> 650 Subject added entry - topical term 46%
> 500 General note 44%
> 700 Added entry - personal name 28%
>
> They answer, more or less, a few basic questions a user might have
> about the material:
> What is it called? Who made it? When was it made? How big is it?
> What is it about? Answers to the question, How can I get it? are
> usually given in the associated MARC holdings record.
>
>
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf
> Of Roy Tennant
> Sent: Monday, May 03, 2010 2:15 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] MODS and DCTERMS
>
> I would even argue with the statement "very detailed, well over 1,000
> different data elements, some well-coded data (not all)". There are only 11
> (yes, eleven) MARC fields that appear in 20% or more of MARC records
> currently in WorldCat[1], and at least three of those elements are control
> numbers or other elements that contribute nothing to actual description. I
> would say overall that we would do well to not gloat about our metadata
> until we've reviewed the facts on the ground. Luckily, now we can.
> Roy
>
> [1] http://www.oclc.org/research/publications/library/2010/2010-06.pdf
>
> On Mon, May 3, 2010 at 11:03 AM, Eric Lease Morgan <[log in to unmask]> wrote:
>
>> On May 3, 2010, at 1:55 PM, Karen Coyle wrote:
>>
>> > 1. MARC the data format -- too rigid, needs to go away
>> > 2. MARC21 bib data -- very detailed, well over 1,000 different data
>> > elements, some well-coded data (not all); unfortunately trapped in #1
>>
>>
>>
>> The differences between the two points enumerated above, IMHO, seem to be
>> at the heart of the never-ending debate between computer types and
>> cataloger types when it comes to library metadata. The non-library computer
>> types don't appreciate the value of human-aided systematic description. And
>> the cataloger types don't understand why MARC is a really terrible bit
>> bucket, especially considering the current environment. All too often the
>> two "camps" don't know what the other is talking about. "MARC must die. Long
>> live MARC."
>>
>> --
>> Eric Lease Morgan
>>
>
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet