Thanks Michael. One weird thing is that at least some of those 
characters "specifically designated as control characters" aren't 
what everyone else ordinarily considers "control characters".  To me, 
"control character" means ASCII less than 20(hex), which the last four 
aren't. So now it's unclear what the "prohibited" (by not being 
mentioned) control characters are, since I don't know what MARC 
considers a 'control character' exactly.

But I'm really just picking nits to demonstrate the impenetrability of 
MARC specs.  I believe you all (especially Terry) that CR and LF aren't 
allowed.

Second point, Michael: are you the Doran behind this page? 
http://rocky.uta.edu/doran/charsets/marc8default.html

You might want to remove CR, LF, and the other disallowed control 
characters from your own published list of MARC8 characters!
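
For anyone who wants to check their own data, here is a minimal Python sketch of one way to flag such characters in a field or subfield value. It rests on my reading of the list quoted below, which is an assumption and not something the spec states outright: of the codes below 20(hex), only 1B, 1D, 1E and 1F are designated, and every other one, CR and LF included, is treated as disallowed. The function names and the example URL are made up for illustration, and the sketch does not look at 7F or the 80-9F range.

```python
# Assumption: the only codes below 0x20 that the quoted MARC 21 list
# designates are escape (1B), record terminator (1D), field terminator (1E)
# and subfield delimiter (1F). Everything else below 0x20 is treated here
# as disallowed, including CR (0D) and LF (0A).
MARC_C0_CONTROLS = {0x1B, 0x1D, 0x1E, 0x1F}


def find_disallowed_controls(value):
    """Return (offset, code point) pairs for C0 controls not in the MARC list."""
    return [
        (offset, ord(ch))
        for offset, ch in enumerate(value)
        if ord(ch) < 0x20 and ord(ch) not in MARC_C0_CONTROLS
    ]


def strip_disallowed_controls(value):
    """Drop the disallowed C0 controls and keep everything else unchanged."""
    return "".join(
        ch for ch in value
        if ord(ch) >= 0x20 or ord(ch) in MARC_C0_CONTROLS
    )


# Hypothetical subfield value with a CR LF pair embedded in the URL.
url = "http://www.example.org/some\r\npath"
for offset, code in find_disallowed_controls(url):
    print("offset %d: %02X(hex)" % (offset, code))
print(strip_disallowed_controls(url))
```

Whether silently stripping is the right repair, as opposed to rejecting the record and going back to whoever supplied it, is a local policy decision; the sketch only shows the mechanics.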

On 5/19/2011 3:16 PM, Doran, Michael D wrote:
>> Is it really true that newline characters are not allowed in a marc
>> value?
> Yes.
>
>    CONTROL FUNCTION CODES [1]
>
>    Eight characters are specifically designated as control characters for MARC 21 use:
>
>    - escape character, 1B(hex) in MARC-8 and Unicode encoding
>    - subfield delimiter, 1F(hex) in MARC-8 and Unicode encoding
>    - field terminator, 1E(hex) in MARC-8 and Unicode encoding
>    - record terminator, 1D(hex) in MARC-8 and Unicode encoding
>    - non-sorting character(s) begin, 88(hex) in MARC-8 and 98(hex) in Unicode encoding
>    - non-sorting character(s) end, 89(hex) in MARC-8 and 9C(hex) in Unicode encoding
>    - joiner, 8D(hex) in MARC-8 and 200D (hex) in Unicode encoding
>    - nonjoiner, 8E(hex) in MARC-8 and 200C (hex) in Unicode encoding.
>
> [1] http://www.loc.gov/marc/specifications/specchargeneral.html#controlfunction
>
> -- Michael
>
> # Michael Doran, Systems Librarian
> # University of Texas at Arlington
> # 817-272-5326 office
> # 817-688-1926 mobile
> # [log in to unmask]
> # http://rocky.uta.edu/doran/
>
>> -----Original Message-----
>> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
>> Jonathan Rochkind
>> Sent: Thursday, May 19, 2011 1:27 PM
>> To: [log in to unmask]
>> Subject: Re: [CODE4LIB] is this valid marc ?
>>
>> Is it really true that newline characters are not allowed in a marc
>> value?  I thought they were, not with any special meaning, just as
>> ordinary data.  If they're not, that's useful to know, so I don't put
>> any there!
>>
>> I'd ask for a reference to the standard that says this, but I suspect
>> it's going to be some impenetrable implication of a side effect of a
>> subtle adjective either way.
>>
>> On 5/19/2011 2:19 PM, Karen Coyle wrote:
>>> Quoting Andreas Orphanides<[log in to unmask]>:
>>>
>>>> Anyway, I think having these two parts of the same URL data on
>>>> separate lines is definitely Not Right, but I am not sure if it adds
>>>> up to invalid MARC.
>>> Exactly. The CR and LF characters are NOT defined as valid in the MARC
>>> character set and should not be used. In fact, in MARC there is no
>>> concept of "lines", only variable length strings (usually up to 9999
>>> char).
>>>
>>> kc
>>>
>>>> -dre.
>>>>
>>>> [1] http://www.loc.gov/marc/bibliographic/bd856.html
>>>> [2] I am not a cataloger. Don't hurt me.
>>>> [3] I am not an expert on MARC ingest or on ruby-marc. I could be wrong.
>>>>
>>>> On 5/19/2011 12:37 PM, James Lecard wrote:
>>>>> I'm using ruby-marc ruby parser (v.0.4.2) to parse some marc files I get
>>>>> from a partner.
>>>>>
>>>>> The 856 field is split over 2 lines, causing the ruby library to ignore
>>>>> it (I've patched it to overcome this issue) but I want to know if this kind
>>>>> of marc is valid ?
>>>>>
>>>>> =LDR  00638nam  2200181uu 4500
>>>>> =001  cla-MldNA01
>>>>> =008  080101s2008\\\\\\\|||||||||||||||||fre||
>>>>> =040  \\$aMy Provider
>>>>> =041  0\$afre
>>>>> =245  10$aThis Subject
>>>>> =260  \\$aParis$bJ. Doe$c2008
>>>>> =490  \\$aSome topic
>>>>> =650  1\$aNarratif, Autre forme
>>>>> =655  \7$abook$2lcsh
>>>>> =752  \\$aA Place on earth
>>>>> =776  \\$dParis: John Doe and Cie, 1973
>>>>> =856  \2$qtext/html
>>>>> =856
>>>>> \\$uhttp://www.this-link-will-not-be-retrieved-by-ruby-marc-library
>>>>>
>>>>> Thanks,
>>>>>
>>>>> James L.
>>>
>>>