On 10/24/2011 2:52 PM, Ross Singer wrote:
> On Mon, Oct 24, 2011 at 7:39 PM, Eric Lease Morgan<[log in to unmask]>  wrote:
>
>> Okay. How do I go about converting MARC-8 encoded records into UTF-8? I know yaz-marcdump changes the encoding bit in MARC leaders. Does it also convert MARC-8 characters to UTF-8? (I guess I could simply try it and see what happens.)
>>
> Yes, it does.  It uses yaz-iconv.  Theoretically, you could wrap some
> Perl module around that.  I've contemplated it for ruby-marc, but then
> it always seems a lot easier to ignore it and delete any emails that
> request it.

Or use JRuby, where you can use Marc4J.  Or actually port either the
Java or (apparently?) Perl version to Ruby. Okay, that one is not
"easier" than anything in the short term, but in the long term I'd
rather have pure Ruby than something that relies on an external shell
call or a C extension. In my experience, those latter two invariably
turn into annoying and confusing maintenance down the line.

But I'm not doing any of these things anytime soon either. So far, all
my Ruby code that deals with MARC has something else convert it first.
(In my largest case, Java Marc4J converts it before it goes into a
stored field in a Solr index, and my Ruby only ever reads it from that
stored field, already converted.)
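
For what it's worth, the "external call" route discussed above could
look roughly like the sketch below. It is not something I run in
production, and the method names (yaz_marcdump_args,
marc8_file_to_utf8) are made up for illustration. It assumes
yaz-marcdump is installed and on the PATH; the flags are the ones from
the YAZ documentation for converting MARC-8 to UTF-8 and setting leader
position 9 to "a".

```ruby
require "open3"

# Build the argument list for yaz-marcdump (hypothetical helper name).
# Passing an array rather than one string avoids going through a shell,
# so file names with spaces or quotes need no escaping.
def yaz_marcdump_args(input_path)
  [
    "yaz-marcdump",
    "-f", "MARC-8",  # source character encoding
    "-t", "UTF-8",   # target character encoding
    "-o", "marc",    # write ISO 2709 (binary MARC) output
    "-l", "9=97",    # set leader position 9 to "a" (ASCII 97), i.e. Unicode
    input_path
  ]
end

# Convert a file of MARC-8 records and return the converted bytes
# (hypothetical helper name). Raises if yaz-marcdump exits with an error.
def marc8_file_to_utf8(input_path)
  stdout, stderr, status =
    Open3.capture3(*yaz_marcdump_args(input_path), binmode: true)
  raise "yaz-marcdump failed: #{stderr}" unless status.success?
  stdout
end
```

You would then hand the returned bytes (or a file written from them) to
whatever MARC reader you use. Note that this is exactly the kind of
external dependency I was grumbling about: it breaks the moment
yaz-marcdump is missing or behaves differently on some machine.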