In Perl, how do I specify MARC-8 when reading (decoding) and writing (encoding) data?

Character encoding is the bane of my existence. I have learned that when reading from a file I ought to specify the encoding the file is in and decode accordingly, or else. Once read, the data is converted to Perl's internal encoding (UTF-8) and can be manipulated. Similarly, when writing I am expected to specify the encoding. Both the reading (decoding) and the writing (encoding) can be done with the Encode module. Here is some code illustrating what I'm trying to do with MARC records, which are apparently in MARC-8:

  # require
  use strict;
  use warnings;
  use MARC::Batch;
  use Encode qw( encode decode );
  
  # initialize
  my $batch = MARC::Batch->new( 'USMARC', './records.mrc' );
  open my $out, '>', 'updated.mrc' or die "Can't open updated.mrc: $!";
  
  # process each record
  while ( my $marc = $batch->next ) {
  
    # get the title
    my $_245 = decode( 'FOO', $marc->title );
    
    # do cool stuff with the title here
    
    # output the cool stuff
    print {$out} encode( 'FOO', $_245 );
  
  }
  
  # done
  close $out;
  exit;


My problem is, I don't know what to put in place of FOO. What is the official name of MARC-8's encoding scheme?
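
For comparison, here is a minimal sketch of the same round trip with an encoding name that Encode does recognize. ISO-8859-1 is only a stand-in chosen for illustration, and the sample bytes are made up; it says nothing about what the MARC-8 name should be.

```perl
use strict;
use warnings;
use Encode qw( encode decode );

# sample bytes, assumed to be ISO-8859-1: "caf" followed by e-acute (0xE9)
my $bytes = "caf\xE9";

# decode the bytes into a Perl character string
my $title = decode( 'ISO-8859-1', $bytes );

# do cool stuff with the characters (here, just uppercase them)
my $updated = uc $title;

# encode the characters back into bytes before writing them out
my $out = encode( 'ISO-8859-1', $updated );

printf "%d characters in, %d bytes out\n", length $title, length $out;
```

This is the pattern I would like to use for MARC-8, with the right name in place of the stand-in.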

-- 
Eric "The Ugly American" Morgan
University of Notre Dame

(574) 631-8604