I am pretty sure that the marc4j standard reader ignores them; the tolerant
reader definitely does. Otherwise JHU might have about two parseable records
based on the mangled leaders that J-Rock gets stuck with :-)
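
To make the strict versus tolerant distinction concrete, here is a minimal Python sketch. It is not how marc4j does it; `leader_problems` and `read_leader` are names made up for illustration, and the only checks are the five-digit length and the trailing "4500" discussed in this thread.

```python
LEADER_LENGTH = 24  # a MARC 21 leader is a fixed 24 bytes


def leader_problems(record_start: bytes) -> list:
    """Return a list of complaints about the leader; empty means no complaints."""
    leader = record_start[:LEADER_LENGTH].decode("ascii", errors="replace")
    problems = []
    if len(leader) < LEADER_LENGTH:
        problems.append("leader is shorter than 24 bytes")
    if not leader[:5].isdigit():
        problems.append("record length (positions 0-4) is not five digits")
    if leader[20:24] != "4500":
        problems.append(f"positions 20-23 are {leader[20:24]!r}, not '4500'")
    return problems


def read_leader(record_start: bytes, tolerant: bool = True) -> str:
    """Return the leader text; raise on problems only when tolerant is False."""
    problems = leader_problems(record_start)
    if problems and not tolerant:
        raise ValueError("; ".join(problems))
    return record_start[:LEADER_LENGTH].decode("ascii", errors="replace")
```

With the Binghamton samples quoted below, `read_leader(b"01812cas  2200457   45x00")` hands back the leader as is, while the same call with `tolerant=False` raises.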

An analysis of the ~7M LC bib records from the scriblio.net data files
(~Dec 2006) indicated that the leader has less than 8 bits of information in
it (Shannon-Weaver definition). This excludes the initial length value, which
is redundant given the end of record marker.
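
For anyone who wants to repeat that kind of measurement, a rough Python sketch follows. It is an assumption about the method, not the original analysis: it treats everything after the length field (positions 5-23) as a single symbol and computes the Shannon entropy of the observed distribution. `leader_entropy_bits` is a hypothetical name.

```python
import math
from collections import Counter


def leader_entropy_bits(leaders):
    """Shannon entropy, in bits, of leader values with the length field dropped."""
    # Positions 0-4 hold the record length, which is excluded here.
    counts = Counter(leader[5:24] for leader in leaders)
    total = sum(counts.values())
    # H = sum over distinct values of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

Feed it an iterable of 24-character leader strings. A file in which every record carries the same leader (apart from the length) scores zero bits.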


The LC V'GER adds a pseudo tag 000 to its HTML view of the MARC leader.
The final characters of the leader are "450".

Also, I object to the phrase "decent MARC tool".  Any tool capable of
dealing with MARC as it exists cannot afford the luxury of decency :-)

[ HA: "A clear conscience?"
 BW: "Yes, Sir Humphrey."
 HA: "When did you acquire this taste for luxuries?"]

Simon

On Fri, Apr 1, 2011 at 5:16 AM, Owen Stephens <[log in to unmask]> wrote:

> "I'm sure any decent MARC tool can deal with them, since decent MARC tools
> are certainly going to be forgiving enough to deal with four characters
> that apparently don't even really matter."
>
> You say that, but I'm pretty sure Marc4J throws errors on MARC records where
> these characters are incorrect
>
> Owen
>
> On Fri, Apr 1, 2011 at 3:51 AM, William Denton <[log in to unmask]> wrote:
>
> > On 28 March 2011, Ford, Kevin wrote:
> >
> >> I couldn't get Simon's MARC 21 Magic file to work.  Among other issues, I
> >> received "line too long" errors.  But, since I've been curious about this
> >> for some time, I figured I'd take a whack at it myself.  Try this:
> >>
> >
> > This is very nice!  Thanks.  I tried it on a bunch of MARC files I have,
> > and it recognized almost all of them.  A few it didn't, so I had a closer
> > look, and they're invalid.
> >
> > For example, the Internet Archive's Binghamton catalogue dump:
> >
> > http://ia600307.us.archive.org/6/items/marc_binghamton_univ/
> >
> > $ file -m marc.magic bgm*mrc
> > bgm_openlib_final_0-5.mrc:         data
> > bgm_openlib_final_10-15.mrc:       MARC Bibliographic
> > bgm_openlib_final_15-18.mrc:       data
> > bgm_openlib_final_5-10.mrc:        MARC Bibliographic
> >
> > But why?  Aha:
> >
> > $ head -c 25 bgm_openlib_final_*mrc
> > ==> bgm_openlib_final_0-5.mrc <==
> > 01812cas  2200457   45x00
> > ==> bgm_openlib_final_10-15.mrc <==
> > 01008nam  2200289ua 45000
> > ==> bgm_openlib_final_15-18.mrc <==
> > 01614cam    00385   45  0
> > ==> bgm_openlib_final_5-10.mrc <==
> > 00887nam  2200265v  45000
> >
> > As you say, the leader should end with 4500 (as defined at
> > http://www.loc.gov/marc/authority/adleader.html) but two of those files
> > don't.  So they're not valid MARC.  I'm sure any decent MARC tool can deal
> > with them, since decent MARC tools are certainly going to be forgiving
> > enough to deal with four characters that apparently don't even really
> > matter.
> >
> > So on the one hand they're usable MARC but file wouldn't say so, and on the
> > other that's a good indication that the files have failed a basic validity
> > test.  I wonder if there are similar situations for JPEGs or MP3s.
> >
> > I think you should definitely submit this for inclusion in the magic file.
> > It would be very useful for us all!
> >
> > Bill
> >
> > P.S. I'd never used head -c (to show a fixed number of bytes) before.
> > Always nice to find a new useful option to an old command.
> >
> >
> >> #--------------------------------------------
> >> # MARC 21 Magic  (Second cut)
> >>
> >> # Set at position 0
> >> 0       short   >0x0000
> >>
> >> # leader ends with 4500
> >> >20     string  4500
> >>
> >> # leader starts with 5 digits, followed by codes specific to MARC format
> >> >>0     regex/1 (^[0-9]{5})[acdnp][^bhlnqsu-z]  MARC Bibliographic
> >> >>0     regex/1 (^[0-9]{5})[acdnosx][z] MARC Authority
> >> >>0     regex/1 (^[0-9]{5})[cdn][uvxy]  MARC Holdings
> >> >>0     regex/1 (^[0-9]{5})[acdn][w]    MARC Classification
> >> >>0     regex/1 (^[0-9]{5})[cdn][q]     MARC Community
> >>
> >
> > --
> > William Denton, Toronto : miskatonic.org www.frbr.org openfrbr.org
> >
>
>
>
> --
> Owen Stephens
> Owen Stephens Consulting
> Web: http://www.ostephens.com
> Email: [log in to unmask]
>
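
For anyone who wants the same test outside of file(1), the magic rules quoted above can be approximated in Python. This is a loose translation under stated assumptions, not a replacement for the magic file: `guess_marc_type` is a made-up name, and the initial "short > 0" test is left out because five leading ASCII digits already imply it.

```python
import re

# (record status, type of record) patterns for leader positions 5-6,
# copied from the quoted magic rules.
RECORD_TYPES = [
    (re.compile(r"[acdnp][^bhlnqsu-z]"), "MARC Bibliographic"),
    (re.compile(r"[acdnosx]z"), "MARC Authority"),
    (re.compile(r"[cdn][uvxy]"), "MARC Holdings"),
    (re.compile(r"[acdn]w"), "MARC Classification"),
    (re.compile(r"[cdn]q"), "MARC Community"),
]


def guess_marc_type(data: bytes):
    """Return a label for the first record in data, or None if no rule matches."""
    head = data[:24].decode("ascii", errors="replace")
    # The leader must be complete and end with 4500.
    if len(head) < 24 or head[20:24] != "4500":
        return None
    # The leader must start with a five-digit record length.
    if not re.fullmatch(r"[0-9]{5}", head[:5]):
        return None
    for pattern, label in RECORD_TYPES:
        if pattern.fullmatch(head[5:7]):
            return label
    return None
```

Called on the first 25 bytes of each Binghamton file shown above, it labels the two files ending in 4500 as bibliographic and returns None for the other two, in line with what file reported.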