
Quoting "Beacom, Matthew" <[log in to unmask]>:

>
> According to the report, 69 MARC tags occur in more than 1% of the
> records in WorldCat. That is quite a few more than Roy's 11,
> but even accounting for Karen's data elements being equivalent to
> the number of MARC subfields, this is far fewer than the 1,000 data
> elements available to a cataloger in MARC.

So much depends on how you count things, so at the  
http://kcoyle.net/rda/ site I have put two MARC-related files. The  
first is just a list of "elements" (variable subfields) in alpha order  
with duplicates removed. Yes, I realize how imperfect this is, and  
that we will need to look beyond names to *meaning* of elements to  
determine what we really have. This file does not include indicators,
and sometimes indicators really do create a separate element, as
when a personal name becomes "Family" based on its indicator.

That file has over 560 entries.

The next file probably needs some more thought, but it is a list of
the variable field indicators and subfields, leaving in subfields that
are duplicated in different fields. I removed some of the numeric
subfields that didn't seem to result in actual elements (2, 3, 5,
6, 8), but could be wrong about that. I also did not include
indicators that are = "Undefined". We can debate whether a personal
name in an added entry is the same element as a personal name in a
subject heading, and similarly for the various places where geographic
names, titles, etc. are used. This is the analysis that is
needed to reduce MARC21 to a cleaner set of data elements.

That file has 1421 entries.
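
For anyone who wants to reproduce the two ways of counting described above, here is a minimal sketch in Python. It is not the script behind the files, and everything about the input layout is an assumption: the column names (tag, kind, code, name) and the sample rows are hypothetical and are not taken from the actual file. The first count treats subfield names as elements and removes duplicates; the second keeps every field/indicator/subfield combination, skipping subfields 2, 3, 5, 6, 8 and "Undefined" indicators.

```python
import csv
import io

# Hypothetical sample rows; the real file's columns and contents may differ.
SAMPLE = """tag,kind,code,name
100,subfield,a,Personal name
100,subfield,d,Dates associated with a name
100,indicator1,3,Family name
100,indicator2,#,Undefined
700,subfield,a,Personal name
700,subfield,d,Dates associated with a name
650,subfield,a,Topical term or geographic name entry element
650,subfield,2,Source of heading or term
"""

# Numeric subfields left out of the second count, as described above.
SKIPPED_SUBFIELDS = {"2", "3", "5", "6", "8"}


def count_elements(rows):
    """Return (unique subfield names, field-level entries) for the rows."""
    unique_names = set()   # count 1: subfield names, duplicates removed
    per_field = set()      # count 2: one entry per tag + indicator/subfield
    for row in rows:
        name = row["name"].strip().lower()
        is_subfield = row["kind"] == "subfield"
        if is_subfield:
            # Assumption: the first count includes every subfield name.
            unique_names.add(name)
        if is_subfield and row["code"] in SKIPPED_SUBFIELDS:
            continue
        if not is_subfield and name == "undefined":
            continue
        per_field.add((row["tag"], row["kind"], row["code"]))
    return len(unique_names), len(per_field)


if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE))
    names, entries = count_elements(reader)
    print(f"unique subfield names: {names}")
    print(f"field-level entries:   {entries}")
```

The gap between the two numbers comes entirely from how duplicates are treated: "Personal name" counts once in the first tally but once per field (100, 700) in the second, which is exactly the question of whether those are the same element.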

Neither of these contains any of the fixed field elements (many of  
which, IMO, should replace textual elements now carried in MARC21).  
When I looked at the fixed fields (and this is reported at  
http://futurelib.pbworks.com/Data+and+Studies), I came up with this  
count of *unique* fixed field elements (each with multiple values):

008 - 58
007 - 55

Each one of these should become a controlled value list in a SemWeb  
implementation of MARC. RDA appears to have a total of 68 defined  
value lists, but I don't believe that those include ones defined  
elsewhere, such as languages, country codes, etc.

kc

p.s. linked from that same page is the file I am using for this  
analysis, in CSV format, if anyone else wants to play with it. I have  
tried to keep it up to date with MARBI proposals.

>
> Matthew Beacom
>
>
> By the way, the descriptive fields used in more than 20% of the MARC  
>  records in WorldCat are:
>
> 245 Title statement 100%
> 260 Imprint statement 96%
> 300 Physical description 91%
> 100 Main entry - personal name 61%
> 650 Subject added entry - topical term 46%
> 500 General note 44%
> 700 Added entry - personal name 28%
>
> They answer, more or less, a few basic questions a user might have   
> about the material:
> What is it called? Who made it? When was it made? How big is it?   
> What is it about? Answers to the question, How can I get it? are   
> usually given in the associated MARC holdings record.
>
>
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf  
>  Of Roy Tennant
> Sent: Monday, May 03, 2010 2:15 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] MODS and DCTERMS
>
> I would even argue with the statement "very detailed, well over 1,000
> different data elements, some well-coded data (not all)". There are only 11
> (yes, eleven) MARC fields that appear in 20% or more of MARC records
> currently in WorldCat[1], and at least three of those elements are control
> numbers or other elements that contribute nothing to actual description. I
> would say overall that we would do well to not gloat about our metadata
> until we've reviewed the facts on the ground. Luckily, now we can.
> Roy
>
> [1] http://www.oclc.org/research/publications/library/2010/2010-06.pdf
>
> On Mon, May 3, 2010 at 11:03 AM, Eric Lease Morgan <[log in to unmask]> wrote:
>
>> On May 3, 2010, at 1:55 PM, Karen Coyle wrote:
>>
>> > 1. MARC the data format -- too rigid, needs to go away
>> > 2. MARC21 bib data -- very detailed, well over 1,000 different data
>> > elements, some well-coded data (not all); unfortunately trapped in #1
>>
>>
>>
>> The differences between the two points enumerated above, IMHO, seem to be
>> at the heart of the never-ending debate between computer types and
>> cataloger types when it comes to library metadata. The non-library computer
>> types don't appreciate the value of human-aided systematic description. And
>> the cataloger types don't understand why MARC is a really terrible bit
>> bucket, especially considering the current environment. All too often the
>> two "camps" don't know what the other is talking about. "MARC must die. Long
>> live MARC."
>>
>> --
>> Eric Lease Morgan
>>
>



-- 
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet