Here's another simple Perl script that works for me:

#!/usr/bin/perl

# msplit: split a file of ISO 2709 MARC records into chunks.
$file = shift or die "\nUsage: msplit filename [num of records] [new file name]\n\n";
$s    = shift || 1000;
$of   = shift || $file;

$/ = chr(29);    # 0x1D, the ISO 2709 record terminator
$i = 0;
$tot = 0;

open IN, $file or die "Can't open input file '$file'!\n";
while (<IN>) {
    $i++;
    if ($i == $s || $tot == 0) {
        $i = 0;
        $out++;
        $out =~ s/^(\d)$/0$1/;    # zero-pad single-digit file numbers
        $fout = "$of$out";
        unlink $fout;
        open OUT, ">>$fout";
    }
    $tot++;
    print OUT;
}
print "\n$tot MARC records written to $out files\n\n";

--
Charles Ledvina
infosoup.org

On Tue, 26 Jan 2010 08:16:56 -0600, Tod Olson <[log in to unmask]> wrote:

> The yaz-marcdump utility[1], included in the YAZ toolkit[2], should work,
> and I've found it to be blindingly fast.
>
> -Tod
>
> [1] http://www.indexdata.com/yaz/doc/yaz-marcdump.html
> [2] http://www.indexdata.com/yaz
>
> Tod Olson <[log in to unmask]>
> Systems Librarian
> University of Chicago Library
>
> On Jan 26, 2010, at 2:34 AM, Marc Chantreux wrote:
>
>> On Mon, Jan 25, 2010 at 11:48:47PM +0530, Saiful Amin wrote:
>>> I also recommend using MARC::Batch. Attached is a simple script I
>>> wrote for myself.
>>
>> I think MARC::Batch would be very slow for splitting a lot of records.
>> As 0x1d is your record separator, a Perl one-liner can do the job:
>>
>> http://www.tinybox.net/2009/10/12/perl-onliners-vim-and-iso2709/
>>
>> perl -0x1d -wnE '
>>     # new file every 1000 records
>>     $. == 1 || ! ($. % 1000)
>>     # with the record number zero-padded to 5 digits
>>     and open F, sprintf ">records_%.5d.mrc", $.;
>>     # actually print the record
>>     print F
>> ' bigfile.mrc
>>
>> If your file is UTF-8 encoded, use the -CSD flags.
>>
>> Hope it helps.
>> Regards,
>>
>> --
>> Marc Chantreux
>> BibLibre, expert en logiciels libres pour l'info-doc
>> http://biblibre.com
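
For anyone reaching for yaz-marcdump as Tod suggests, a minimal invocation
for the same split job looks like the sketch below. The -s (split output
file prefix), -C (records per chunk), and -n (suppress record output to
stdout) options are as I recall them from the yaz-marcdump man page, so
check the documentation linked above for your YAZ version; the "part"
prefix here is just a placeholder.

  # split bigfile.mrc into chunks of 1000 records, written to a
  # series of numbered files starting with the prefix "part"
  yaz-marcdump -n -s part -C 1000 bigfile.mrc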
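
One caveat with the quoted one-liner: because it opens a new file both at
$. == 1 and whenever $. % 1000 is zero, the first chunk gets only 999
records (the second file already starts at record 1000). A sketch of a
variant that yields exactly 1000 records per chunk, under the same
assumptions (0x1d-terminated records; the file names are placeholders):

  perl -0x1d -wnE '
      # new file at records 1, 1001, 2001, ...
      ($. - 1) % 1000 or open F, sprintf ">records_%.5d.mrc", $.;
      print F
  ' bigfile.mrc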