Hi Eric,

> In Perl, how do I specify MARC-8 when reading (decoding) and writing
> (encoding) data?

You can't. MARC-8 is a character set that is unknown to the operating system. Your best bet is to convert MARC-8-encoded records into UTF-8.

> ...it is converted to Perl's
> internal encoding (UTF-8)

As an FYI, UTF-8 is *not* Perl's internal encoding.

-- Michael

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# http://rocky.uta.edu/doran/

> -----Original Message-----
> From: Code for Libraries On Behalf Of Eric Lease Morgan
> Sent: Monday, October 24, 2011 1:18 PM
> Subject: [CODE4LIB] marc-8
>
> In Perl, how do I specify MARC-8 when reading (decoding) and writing
> (encoding) data?
>
> Character encoding is the bane of my existence. I have learned that when
> reading from a file I ought to specify the type of encoding the file is in
> and decode accordingly, or else. Once read, it is converted to Perl's
> internal encoding (UTF-8) and can be manipulated. Similarly, when writing I
> am expected to specify the encoding. Both the reading (decoding) and the
> writing (encoding) can be done with the Encode module. Here is some code
> illustrating what I'm trying to do with MARC records which are apparently in
> MARC-8:
>
>   # require
>   use Encode qw( encode decode );
>
>   # initialize
>   my $batch = MARC::Batch->new( 'USMARC', './records.mrc' );
>   open OUT, ' > updated.mrc';
>
>   # process each record
>   while ( my $marc = $batch->next ) {
>
>     # get the title
>     my $_245 = decode( 'FOO', $marc->title );
>
>     # do cool stuff with the title here
>
>     # output the cool stuff
>     print OUT encode( 'FOO', $_245 );
>
>   }
>
>   # done
>   close OUT;
>   exit;
>
>
> My problem is, I don't know what to put in place of FOO. What is the official
> name of MARC-8's encoding scheme?
>
> --
> Eric "The Ugly American" Morgan
> University of Notre Dame
>
> (574) 631-8604
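
A minimal sketch of the "convert MARC-8 to UTF-8" suggestion above, for illustration only. It assumes the MARC::Batch and MARC::Charset modules from CPAN are installed and uses MARC::Charset's marc8_to_utf8 and utf8_to_marc8 functions in place of the Encode calls with 'FOO'. The file names come from the original example. The leader check and the trailing newline on output are assumptions added here, not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use MARC::Batch;
use MARC::Charset qw( marc8_to_utf8 utf8_to_marc8 );

# initialize; file names are taken from the original example
my $batch = MARC::Batch->new( 'USMARC', './records.mrc' );

# write raw bytes, since the output of utf8_to_marc8 is MARC-8, not UTF-8
open my $out, '>:raw', 'updated.mrc'
    or die "Cannot open updated.mrc: $!";

# process each record
while ( my $marc = $batch->next ) {

    # assumption: leader position 9 is 'a' for Unicode records and blank
    # for MARC-8 records; skip anything that is already Unicode
    next if substr( $marc->leader, 9, 1 ) eq 'a';

    # convert the MARC-8 title so Perl can work with it as text
    my $title = marc8_to_utf8( $marc->title );

    # do cool stuff with the title here

    # convert back to MARC-8 before writing (newline added for readability)
    print {$out} utf8_to_marc8( $title ), "\n";
}

# done
close $out or die "Cannot close updated.mrc: $!";
```

If the output does not need to stay in MARC-8, the utf8_to_marc8 call could be dropped and the file opened with '>:encoding(UTF-8)' instead, which keeps everything downstream in UTF-8.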