In Perl, how do I specify MARC-8 when reading (decoding) and writing (encoding) data?
Character encoding is the bane of my existence. I have learned that when reading from a file I ought to specify the encoding the file is in and decode accordingly, or else. Once read, the data is converted to Perl's internal encoding (UTF-8) and can be manipulated. Similarly, when writing I am expected to specify the encoding. Both the reading (decoding) and the writing (encoding) can be done with the Encode module. Here is some code illustrating what I'm trying to do with MARC records which are apparently in MARC-8:
# require
use strict;
use warnings;
use Encode qw( encode decode );
use MARC::Batch;
# initialize
my $batch = MARC::Batch->new( 'USMARC', './records.mrc' );
open my $out, '>', 'updated.mrc' or die "Can't open updated.mrc: $!";
# process each record
while ( my $marc = $batch->next ) {
    # get the title
    my $_245 = decode( 'FOO', $marc->title );
    # do cool stuff with the title here
    # output the cool stuff
    print {$out} encode( 'FOO', $_245 );
}
# done
close $out or die "Can't close updated.mrc: $!";
exit;
My problem is, I don't know what to put in place of FOO. What is the official name of MARC-8's encoding scheme?
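For comparison, here is a minimal sketch of the same decode/encode round trip using an encoding name that Encode does accept (ISO-8859-1 in, UTF-8 out), plus a check with find_encoding, which returns undef for names Encode does not recognize. The sample bytes and the candidate name 'MARC-8' are assumptions for illustration only; I have not confirmed what Encode reports for it.

```perl
use strict;
use warnings;
use Encode qw( encode decode find_encoding );

# raw bytes as they might arrive from a file: "café" in ISO-8859-1 (made-up sample)
my $bytes = "caf\xE9";

# decode: bytes in a known encoding -> Perl character string
my $text = decode( 'iso-8859-1', $bytes );

# encode: Perl character string -> UTF-8 bytes, ready to print
my $out = encode( 'UTF-8', $text );

printf "%d characters decoded, %d bytes encoded\n", length $text, length $out;

# ask Encode whether it knows a given name; 'MARC-8' is only a guess at a name
for my $name ( 'iso-8859-1', 'MARC-8' ) {
    my $enc = find_encoding( $name );
    print "$name: ", ( defined $enc ? 'known as ' . $enc->name : 'not known to Encode' ), "\n";
}
```

If find_encoding comes back undefined for every plausible spelling, then FOO has no valid value and the conversion would have to happen outside Encode.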
--
Eric "The Ugly American" Morgan
University of Notre Dame
(574) 631-8604