Ahh the lovely MARC-8 :-)
It's a fair bit of effort, I think. One approach could be to port
the MARC8->Unicode functionality from pymarc [1,2]. It's only one-way,
but that's normally what most sane people want to do anyhow.
Another approach would be to look into wrapping yaz-iconv from
IndexData, which provides much more (and faster) MARC-related character
conversion.
If you just want to get something done without extending ruby-marc, you
can pre-process your data with yaz-marcdump (converting it to UTF-8) and
then throw it at ruby-marc. Or, if you are in JRuby-land, you could use
marc4j, which has MARC-8 support.
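A rough sketch of the pre-processing route, in case it helps. The
yaz-marcdump flags are the ones from the YAZ manual's MARC-8 to UTF-8
example; the method names and file names are made up for illustration,
and I haven't run this against your data:

```ruby
# Hypothetical helper: in MARC 21, leader position 9 is the character
# coding scheme. Blank means MARC-8, "a" means UCS/Unicode.
def marc8_leader?(leader)
  leader[9] == " "
end

# Hypothetical helper: argument list for yaz-marcdump. Converts MARC-8 to
# UTF-8, writes ISO 2709 ("marc") output, and sets leader position 9 to
# "a" (character code 97) so the record declares itself as Unicode.
def marc8_to_utf8_command(input_path)
  ["yaz-marcdump", "-f", "MARC-8", "-t", "UTF-8", "-o", "marc", "-l", "9=97", input_path]
end

# Hypothetical helper: runs the conversion. yaz-marcdump writes the
# converted records to stdout, so stdout is redirected to output_path.
# Assumes yaz-marcdump is installed and on the PATH.
def convert_marc8_file(input_path, output_path)
  ok = system(*marc8_to_utf8_command(input_path), out: output_path)
  raise "yaz-marcdump failed for #{input_path}" unless ok
  output_path
end
```

After something like convert_marc8_file("records.mrc", "records-utf8.mrc"),
you should be able to hand the converted file to MARC::Reader.new as usual.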
I've cc'ed code4lib since someone else might have some better ideas.
Thanks for writing.
On Fri, Oct 30, 2009 at 3:22 AM, Brendan Boesen wrote:
> Hi Guys,
> I guess this is the 'bug the authors if you need it' email.
> I'm trying to parse a MARC record and it contains Chinese characters. From
> the leader:
> 01051cam 2200265 a 4504
> it looks like the record uses MARC8 encoding.
> I'm investigating a way to get a Unicode encoded one but that may not work
> out. What sort of effort do you think is involved in adding MARC8 support
> into marc-ruby? (And is there anything I could do to help with that?)
> Brendan Boesen
> National Library of Australia