Hi Jonathan,

> I tried to figure out how to custom add a new encoding to ruby 1.9 with
> the idea of adding Marc8 as an actual ruby 1.9 character encoding
> supported same as any other built in char encoding

Not a trivial undertaking. Remember that the MARC-8 environment allows alternate character sets to be invoked within a MARC record using two different "escape" methods [1]. That's just one of the reasons why you're not finding a bunch of these MARC-8 conversion modules, one for every language. ;-)

-- Michael

[1] Technique 1 is unique to MARC-8 and provides access to a small number of Greek symbols, subscripts, and superscripts. Technique 2 is based on the ANSI X3.41 (ISO 2022) "Code Extension Techniques for Use with 7-bit and 8-bit Character Sets" standard. See the MARC 21 Specification for details on accessing alternate graphic character sets (http://www.loc.gov/marc/specifications/speccharmarc8.html#alternative).

> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Jonathan Rochkind
> Sent: Monday, October 24, 2011 2:01 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] marc-8
>
> What _ought_ to be easiest of all is getting our ILS's to NEVER export
> Marc8 _ever_ again. UTF8 only.
>
> Sadly, that only ought to be easiest.
>
> But IMO there's no reason any of us should be dealing with Marc8 ever
> again. The only thing that should deal in Marc8 is an ILS, and should
> only input it, NEVER output it, UTF8 only, please!
>
> But this is not the world we live in.
>
> I tried to figure out how to custom add a new encoding to ruby 1.9 with
> the idea of adding Marc8 as an actual ruby 1.9 character encoding
> supported same as any other built in char encoding, but I couldn't
> figure out if that was possible or how to do it. If it was possible to
> do at that low level in ruby 1.9, it might justify the time to do it.
>
> On 10/24/2011 2:55 PM, Doran, Michael D wrote:
> > Eric,
> >
> > Sometimes for grandpa Perl stuff -- especially as concerns charsets and/or
> > internationalization -- it's worth pinging these lists:
> >
> > [log in to unmask] (yes, still alive and kicking)
> >
> > [log in to unmask] (very low traffic list, but some knowledgeable
> > subscribers)
> >
> > -- Michael
> >
> >> -----Original Message-----
> >> From: Doran, Michael D
> >> Sent: Monday, October 24, 2011 1:48 PM
> >> To: 'Code for Libraries'
> >> Subject: RE: [CODE4LIB] marc-8
> >>
> >>> Okay. How do I go about converting MARC-8 encoded records into UTF-8?
> >> In Perl... using the handy MARC::Charset module (tip o' the hat to Ed
> >> Summers, and now maintained by Galen Charlton).
> >>
> >> -- Michael
> >>
> >>> -----Original Message-----
> >>> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Eric
> >>> Lease Morgan
> >>> Sent: Monday, October 24, 2011 1:39 PM
> >>> To: [log in to unmask]
> >>> Subject: Re: [CODE4LIB] marc-8
> >>>
> >>> On Oct 24, 2011, at 2:34 PM, Doran, Michael D wrote:
> >>>
> >>>>> In Perl, how do I specify MARC-8 when reading (decoding) and writing
> >>>>> (encoding) data?
> >>>> You can't. MARC-8 is a character set that is unknown to the operating
> >>>> system. Your best bet is to convert MARC-8-encoded records into UTF-8.
> >>>
> >>> /me throws his hands up in the air and screams!
> >>>
> >>> Okay. How do I go about converting MARC-8 encoded records into UTF-8? I know
> >>> yaz-marcdump changes the encoding bit in MARC leaders. Does it also convert
> >>> MARC-8 characters to UTF-8? (I guess I could simply try it and see what
> >>> happens.)
> >>>
> >>> --
> >>> Eric Morgan
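
As a rough illustration of why the two escape techniques in footnote [1] make MARC-8 awkward to treat as an ordinary encoding, here is a minimal Python sketch that only locates escape sequences in a byte string. It converts nothing. The names `find_escapes`, `TECHNIQUE_1`, and `TECHNIQUE_2_INTERMEDIATES` are hypothetical, and the byte values are my reading of the MARC 21 character set specification linked above, so check them against that document before relying on them.

```python
ESC = 0x1B  # the escape byte that introduces a character set switch

# Technique 1: ESC plus a single final byte (assumed from the MARC 21 spec).
TECHNIQUE_1 = {
    ord("g"): "Greek symbols",
    ord("b"): "subscripts",
    ord("p"): "superscripts",
    ord("s"): "ASCII (default)",
}

# Technique 2 (ISO 2022 style): ESC plus an intermediate byte, then a final
# byte naming the set. Only the intermediate is inspected in this sketch.
TECHNIQUE_2_INTERMEDIATES = {ord("("), ord(","), ord(")"), ord("-"), ord("$")}


def find_escapes(data: bytes):
    """Return (offset, technique, detail) for each escape sequence found."""
    found = []
    for i, byte in enumerate(data):
        # Skip anything that is not ESC, or an ESC with nothing after it.
        if byte != ESC or i + 1 >= len(data):
            continue
        nxt = data[i + 1]
        if nxt in TECHNIQUE_1:
            found.append((i, "technique 1", TECHNIQUE_1[nxt]))
        elif nxt in TECHNIQUE_2_INTERMEDIATES:
            found.append((i, "technique 2", chr(nxt)))
    return found
```

For example, `find_escapes(b"H\x1bb2\x1bsO")` reports a switch to subscripts at offset 1 and a return to ASCII at offset 4. The point is that the meaning of any given byte depends on every escape that came before it, which is why a real converter such as MARC::Charset has to track state rather than map bytes one-for-one.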