On Fri, Jun 29, 2012 at 9:51 AM, Sullivan, Mark V <[log in to unmask]> wrote:
> I received a question regarding a software library I have created and
> released as open source. The record length in the leader ( positions 0-4 )
> was not being calculated correctly when writing as MarcXML. However, this
> raises a more philosophical and larger question. What is the point of the
> first five digits of the leader, outside of an ISO2709 / MARC21 encoded
> record? Should I calculate the record length AS IF it would be encoded in
> ISO2709? This would be computationally non-trivial and would likely double
> the time necessary for my software to write a MarcXML file. Should I just
> make the first five digits of the leader '00000', since it means nothing in
> the context of a MarcXML file?
>
All of the essential data in a MARC record is converted and expressed in
XML. MARC structural elements, such as the length of field and starting
position of field data in directory entries are not needed in the XML
record. *Leader data positions not needed in the XML environment are
retained as place holders or carried as blanks*.
http://www.loc.gov/standards/marcxml/marcxml-design.html
In the XSD, the record length and the base address of the data must each
match the regex "[\d ]{5}".
When generating a MARCXML record, if there is no length to be preserved, or
if you don't feel like preserving it, you should use a sequence of blanks
for the length and for the offset.
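
As a rough sketch of what that could look like (Python; the function name
blank_length_and_offset and the sample leader are made up for illustration,
not taken from any particular library), assuming a 24-character leader with
the record length in positions 0-4 and the base address in positions 12-16:

```python
import re

# Pattern quoted above for the record length and the base address of data.
FIVE_DIGITS_OR_BLANKS = re.compile(r"[\d ]{5}")


def blank_length_and_offset(leader: str) -> str:
    """Return the leader with positions 0-4 and 12-16 replaced by blanks.

    Hypothetical helper: assumes a 24-character MARC leader.
    """
    if len(leader) != 24:
        raise ValueError("a MARC leader must be exactly 24 characters")
    return " " * 5 + leader[5:12] + " " * 5 + leader[17:]


# Illustrative leader, not taken from a real record.
sample = "01234nam a2200289 a 4500"
blanked = blank_length_and_offset(sample)

# Both blanked fields still satisfy the five-character pattern.
assert FIVE_DIGITS_OR_BLANKS.fullmatch(blanked[0:5])
assert FIVE_DIGITS_OR_BLANKS.fullmatch(blanked[12:17])
```

This avoids computing the ISO2709 length at all; the other leader positions
are passed through unchanged.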
The purpose of the length and offset fields in the marcxml leader is to
take up 10 bytes of space, since otherwise the format would be far too
compact.
[Fun fact: for a sample of ~7 million LC bibliographic records, the 19
bytes of the marc leader excluding the length field had an information
content of about 7 bits. ]
Simon