I believe that in the ruby-marc API, record['856'] returns only the
first 856 when there is more than one. You have to use a different API
(I forget which offhand) to get all of them; ['856'] is just a shortcut
for when you will only have one or only care about the first one.
So I don't think there's any bug in ruby-marc.
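
To illustrate the difference, here is a minimal sketch using stand-in classes. It is not ruby-marc's own code: SketchField, SketchRecord, and the all method are names made up for this example, and the URL is a placeholder. It only shows why a "first match" lookup gives a value for $q but nothing for $u when the two subfields sit in separate 856 fields.

```ruby
# Hypothetical stand-ins for a parsed record; NOT ruby-marc's classes.
class SketchField
  attr_reader :tag, :subfields

  # subfields is an array of [code, value] pairs, in field order
  def initialize(tag, subfields)
    @tag = tag
    @subfields = subfields
  end

  # field['u'] returns the value of the first subfield with that code, or nil
  def [](code)
    pair = @subfields.find { |c, _| c == code }
    pair && pair[1]
  end
end

class SketchRecord
  include Enumerable

  def initialize(fields)
    @fields = fields
  end

  def each(&block)
    @fields.each(&block)
  end

  # record['856'] style shortcut: only the FIRST field with that tag
  def [](tag)
    find { |f| f.tag == tag }
  end

  # every field with that tag, in record order
  def all(tag)
    select { |f| f.tag == tag }
  end
end

record = SketchRecord.new([
  SketchField.new('245', [['a', 'This Subject']]),
  SketchField.new('856', [['q', 'text/html']]),
  SketchField.new('856', [['u', 'http://example.org/item']])
])

first_q = record['856']['q']   # => "text/html"
first_u = record['856']['u']   # => nil, the first 856 has no $u
all_urls = record.all('856').map { |f| f['u'] }.compact
# all_urls => ["http://example.org/item"]
```

In ruby-marc itself, I believe MARC::Record is Enumerable over its fields, so something like record.find_all { |f| f.tag == '856' } should play the role of all above, but check the docs for the version you are running.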
Your data example is _odd_ though; it's not usual to record 856's like
that, and it probably shouldn't be recorded like that. Multiple 856's
can exist where there are in fact multiple URLs recorded.
On 5/19/2011 1:16 PM, James Lecard wrote:
> I'll dig into this one, thanks for this input Jonathan... I'm not so
> familiar with the library yet, so I'll do some more debugging, but in fact
> what is happening is that I get no value from an access such as
> record['856']['u'], while I get one for record['856']['q'].
> And the marc you are seeing is copy/pasted from a marc editor GUI; it's not
> the actual marc record. I edited it so that its data is not recognisable
> (for confidentiality).
> 2011/5/19 Jonathan Rochkind:
>> Now whether it _means_ what you want it to mean is another question, yeah.
>> As Andreas said, I don't think that particular example _ought_ to have two
>> 856's. But it ought to be perfectly parseable marc.
>> If your 'patch' is to make ruby-marc combine those multiple 856's into one
>> -- that is not right, two separate 856's are two separate 856's, same as any
>> other marc field. Applying that patch would mess up ruby-marc, not fix it.
>> You need to be more specific about what you're doing and what you mean
>> exactly by 'causing the ruby library to ignore it'. I wonder if you are
>> just using a method in ruby-marc which only returns the first field
>> matching a given tag when there is more than one.
>> On 5/19/2011 12:51 PM, Andreas Orphanides wrote:
>>> From the MARC documentation :
>>> "Field 856 is repeated when the location data elements vary (the URL in
>>> subfield $u or subfields $a, $b, $d, when used). It is also repeated when
>>> more than one access method is used, different portions of the item are
>>> available electronically, mirror sites are recorded, different
>>> formats/resolutions with different URLs are indicated, and related items are
>>> recorded."
>>> So it looks like however the URL is handled, a single 856 field should be
>>> used to indicate the location. I am not familiar enough with MARC to say
>>> how it "should" have been done, but it looks like $q and $u would probably
>>> be sufficient (if they're in the same line).
>>> However, since the field is repeatable, the parser shouldn't be choking on
>>> it, unless it's choking on it for a sophisticated reason (e.g., "These
>>> aren't the subfield tags I expect to be seeing"). It also looks like if $u
>>> is provided, the first indicator should give the access method (in this case
>>> "4" for HTTP). Maybe that's what's causing the problem? 
>>> Anyway, I think having these two parts of the same URL data on separate
>>> lines is definitely Not Right, but I am not sure if it adds up to invalid
>>> MARC.
>>>  http://www.loc.gov/marc/bibliographic/bd856.html
>>>  I am not a cataloger. Don't hurt me.
>>>  I am not an expert on MARC ingest or on ruby-marc. I could be wrong.
>>> On 5/19/2011 12:37 PM, James Lecard wrote:
>>>> I'm using ruby-marc ruby parser (v.0.4.2) to parse some marc files I get
>>>> from a partner.
>>>> The 856 field is split over 2 lines, causing the ruby library to ignore
>>>> it (I've patched it to overcome this issue), but I want to know if this kind
>>>> of marc is valid?
>>>> =LDR 00638nam 2200181uu 4500
>>>> =001 cla-MldNA01
>>>> =008 080101s2008\\\\\\\|||||||||||||||||fre||
>>>> =040 \\$aMy Provider
>>>> =041 0\$afre
>>>> =245 10$aThis Subject
>>>> =260 \\$aParis$bJ. Doe$c2008
>>>> =490 \\$aSome topic
>>>> =650 1\$aNarratif, Autre forme
>>>> =655 \7$abook$2lcsh
>>>> =752 \\$aA Place on earth
>>>> =776 \\$dParis: John Doe and Cie, 1973
>>>> =856 \2$qtext/html
>>>> =856 \\$uhttp://www.this-link-will-not-be-retrieved-by-ruby-marc-library
>>>> James L.
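
For reference, the two 856 lines in the sample above can be read as two independent fields, each with its own indicators and subfields. The sketch below is plain Ruby, not ruby-marc: parse_data_field is a hypothetical helper written for this example, and it assumes the mnemonic layout shown in the sample (a backslash for a blank indicator, "$" before each subfield code, and no "$" inside a value).

```ruby
# The two 856 lines from the sample record, in mnemonic notation.
# A backslash stands for a blank indicator; "$" introduces a subfield code.
SAMPLE = <<~'MRK'
  =856 \2$qtext/html
  =856 \\$uhttp://www.this-link-will-not-be-retrieved-by-ruby-marc-library
MRK

# Hypothetical helper: split one data-field line into tag, indicators,
# and [code, value] subfield pairs. Returns nil for lines that do not match.
def parse_data_field(line)
  m = line.strip.match(/\A=(\d{3})\s+(.)(.)(.*)\z/)
  return nil unless m
  subfields = m[4].split('$').reject(&:empty?).map { |s| [s[0], s[1..-1]] }
  { tag: m[1], ind1: m[2], ind2: m[3], subfields: subfields }
end

fields = SAMPLE.lines.map { |l| parse_data_field(l) }.compact
fields_856 = fields.select { |f| f[:tag] == '856' }

# Collect every $u across ALL 856 fields, not only the first one.
urls = fields_856.flat_map do |f|
  f[:subfields].select { |code, _| code == 'u' }.map(&:last)
end
```

Here fields_856 holds two entries: the first carries only $q, the second only $u, which matches the behaviour reported earlier in the thread when only the first 856 is consulted.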