LISTSERV 16.5 - CODE4LIB Archives

ruby-marc has no capacity to convert MARC-8 under any Ruby
interpreter: MRI, JRuby, Rubinius, whatever.

Given the amount of work required to include this (unless Mark
Matienzo feels like hacking into ruby-marc what he did for pymarc), I
think I'd need to see a really compelling need (that can't be solved
by one of the options that Ed mentioned) before making this much of a
priority.

-Ross.

On Mon, Nov 2, 2009 at 11:42 AM, Jonathan Rochkind <[log in to unmask]> wrote:
> I thought that marc-ruby did MARC8 already!  Wait, does it just do it in
> 'native' ruby interpreter, but not in jruby?
>
> I'm dealing with records in MARC8 now I think with marc-ruby, and it looked
> like the non-roman characters were coming across okay! I might need to go
> investigate my setup further now....
>
> Jonathan
>
> Ed Summers wrote:
>>
>> Hi Brendan:
>>
>> Ahh the lovely MARC-8 :-)
>>
>> It's a fair bit of effort I think. One approach could be to port
>> the MARC8->Unicode functionality from pymarc [1,2]. It's only one-way,
>> but that's normally what most sane people want to do anyhow.
>>
>> Another approach would be to look into wrapping yaz-iconv [3] from
>> IndexData which provides much more (and faster) MARC related character
>> mapping facilities.
>>
>> If you just want to get something done without extending ruby-marc you
>> can pre-process your data with yaz-marcdump and then throw it at
>> ruby-marc. Or perhaps if you are in jruby-land you could use marc4j [4]
>> which has MARC-8 support.
>>
>> I've cc'ed code4lib since someone else might have some better ideas.
>> Thanks for writing.
>>
>> //Ed
>>
>> [1]
>> http://bazaar.launchpad.net/~ehs-pobox/pymarc/dev/annotate/head%3A/pymarc/marc8.py
>> [2]
>> http://bazaar.launchpad.net/~ehs-pobox/pymarc/dev/annotate/head%3A/pymarc/marc8_mapping.py
>> [3] http://www.indexdata.com/yaz/doc/yaz-iconv.html
>> [4] http://marc4j.tigris.org/
>>
>> On Fri, Oct 30, 2009 at 3:22 AM, Brendan Boesen <[log in to unmask]>
>> wrote:
>>
>>>
>>> Hi Guys,
>>>
>>> I guess this is the 'bug the authors if you need it' email.
>>>
>>> I'm trying to parse a MARC record and it contains Chinese characters.
>>> From the leader:
>>>       01051cam  2200265 a 4504
>>> it looks like the record uses MARC8 encoding.
>>>
>>> I'm investigating a way to get a Unicode encoded one but that may not
>>> work out.  What sort of effort do you think is involved in adding MARC8
>>> support into marc-ruby? (And is there anything I could do to help with
>>> that?)
>>>
>>> Regards,
>>>
>>> Brendan Boesen
>>> National Library of Australia
>>>
>>>
>>>
>>
>>
>
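
Two points in the thread can be sketched in plain Ruby: Brendan's reading
of the leader, and Ed's suggestion to pre-process with yaz-marcdump. This
is a rough illustration, not part of ruby-marc. In MARC 21, leader
position 09 (counting from zero) is the character coding scheme: a blank
means MARC-8 and "a" means UCS/Unicode. The method names
marc_encoding_from_leader and yaz_marcdump_args are hypothetical, and the
yaz-marcdump options shown (-f, -t, -o, -l 9=97) are assumed from the YAZ
manual, so check them against your installed version before relying on them.

```ruby
# Hypothetical helper: report the character coding scheme declared in a
# 24-character MARC 21 leader. Position 09 (zero-based) is blank for
# MARC-8 and "a" for UCS/Unicode.
def marc_encoding_from_leader(leader)
  raise ArgumentError, "leader must be 24 characters" unless leader.length == 24

  case leader[9]
  when " " then :marc8
  when "a" then :unicode
  else :unknown
  end
end

# Hypothetical helper: build the argument list for a yaz-marcdump run that
# converts MARC-8 to UTF-8, writes binary MARC, and sets leader position 09
# to "a" (character code 97). The flags are an assumption based on the YAZ
# documentation; this method only builds the list and does not run anything.
def yaz_marcdump_args(input_path)
  ["yaz-marcdump", "-f", "MARC-8", "-t", "UTF-8",
   "-o", "marc", "-l", "9=97", input_path]
end

# The leader quoted in Brendan's message.
leader = "01051cam  2200265 a 4504"
puts marc_encoding_from_leader(leader)
```

With yaz-marcdump installed, the argument list could be passed to
system(*yaz_marcdump_args("records.mrc"), out: "records.utf8.mrc") and the
resulting file handed to ruby-marc; "records.mrc" is a placeholder file name.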