> As an FTY,

Oops, in a hurry.  s/FTY/FYI/

> -----Original Message-----
> From: Doran, Michael D
> Sent: Monday, October 24, 2011 1:35 PM
> To: 'Code for Libraries'
> Subject: RE: marc-8
>
> Hi Eric,
>
> > In Perl, how do I specify MARC-8 when reading (decoding) and writing
> > (encoding) data?
>
> You can't.  MARC-8 is a character set that is unknown to the operating
> system.  Your best bet is to convert MARC-8-encoded records into UTF-8.
>
> > ...it is converted to Perl's
> > internal encoding (UTF-8)
>
> As an FTY, UTF-8 is *not* Perl's internal encoding.
>
> -- Michael
>
> # Michael Doran, Systems Librarian
> # University of Texas at Arlington
> # 817-272-5326 office
> # 817-688-1926 mobile
> # [log in to unmask]
> # http://rocky.uta.edu/doran/
>
>
> > -----Original Message-----
> > From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Eric
> > Lease Morgan
> > Sent: Monday, October 24, 2011 1:18 PM
> > To: [log in to unmask]
> > Subject: [CODE4LIB] marc-8
> >
> > In Perl, how do I specify MARC-8 when reading (decoding) and writing
> > (encoding) data?
> >
> > Character encoding is the bane of my existence. I have learned that when
> > reading from a file I ought to specify the type of encoding the file is in
> > and decode accordingly, or else. Once read, it is converted to Perl's
> > internal encoding (UTF-8) and can be manipulated. Similarly, when writing I
> > am expected to specify the encoding. Both the reading (decoding) and the
> > writing (encoding) can be done with the Encode module.
> > Here is some code illustrating what I'm trying to do with MARC records
> > which are apparently in MARC-8:
> >
> >   # require
> >   use MARC::Batch;
> >   use Encode qw( encode decode );
> >
> >   # initialize
> >   my $batch = MARC::Batch->new( 'USMARC', './records.mrc' );
> >   open OUT, ' > updated.mrc';
> >
> >   # process each record
> >   while ( my $marc = $batch->next ) {
> >
> >     # get the title
> >     my $_245 = decode( 'FOO', $marc->title );
> >
> >     # do cool stuff with the title here
> >
> >     # output the cool stuff
> >     print OUT encode( 'FOO', $_245 );
> >
> >   }
> >
> >   # done
> >   close OUT;
> >   exit;
> >
> >
> > My problem is, I don't know what to put in place of FOO. What is the
> > official name of MARC-8's encoding scheme?
> >
> > --
> > Eric "The Ugly American" Morgan
> > University of Notre Dame
> >
> > (574) 631-8604
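
The point made in the reply above, that there is nothing to put in place of FOO, can be checked directly. What follows is a minimal sketch, not part of the original thread. It assumes only the core Encode module, whose find_encoding() returns undef for names it does not recognize. The helper names encode_knows and marc8_to_utf8_if_available are hypothetical, made up for this illustration. The second helper assumes the CPAN module MARC::Charset, which provides a marc8_to_utf8() function; if that module is not installed the helper simply returns undef.

```perl
use strict;
use warnings;
use Encode qw( find_encoding );

# Hypothetical helper: report whether Encode recognizes an encoding name.
# find_encoding() returns an encoding object, or undef for unknown names.
sub encode_knows {
    my ($name) = @_;
    return defined find_encoding($name) ? 1 : 0;
}

# Hypothetical helper: convert a MARC-8 string using MARC::Charset, but only
# if that CPAN module is installed (assumption: it may not be). Returns undef
# when the module is missing so the caller can decide what to do.
sub marc8_to_utf8_if_available {
    my ($marc8) = @_;
    return unless eval { require MARC::Charset; 1 };
    return MARC::Charset::marc8_to_utf8($marc8);
}

print "UTF-8 known to Encode:  ", encode_knows('UTF-8'),  "\n";
print "MARC-8 known to Encode: ", encode_knows('MARC-8'), "\n";

# Plain ASCII is valid MARC-8, so it serves as a harmless sample input.
my $title = marc8_to_utf8_if_available('Moby Dick');
print defined $title
    ? "converted: $title\n"
    : "MARC::Charset not installed\n";
```

In the loop from the original message, the result of a conversion like this would stand in for the decode( 'FOO', ... ) call, and the output would then be written as UTF-8 rather than re-encoded as MARC-8.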