Hi,
Have a look at the Catmandu framework for Perl.
Install Catmandu (https://metacpan.org/pod/Catmandu) and Catmandu::OAI
(https://metacpan.org/pod/Catmandu::OAI).
# in the Perl script:
use Catmandu::Importer::OAI;

my $importer = Catmandu::Importer::OAI->new(
    url            => "...",
    metadataPrefix => "...",
);

$importer->each(sub {
    my $hashref = $_[0];
    # do something with $hashref...
});

or directly from the command line:
$ catmandu convert OAI --url http://pub.uni-bielefeld.de/oai to JSON
(The arXiv OAI interface seems to be very slow.)
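
To make the script variant a bit more concrete, here is a minimal sketch. It reuses the Bielefeld URL from the command-line example and the oai_dc prefix from your question. The helper name summarize_record is hypothetical, and I am assuming that each record carries its OAI identifier under _id and, for oai_dc, its titles as a list under title; please dump one record first to confirm the structure. The take(10) is only there to keep a first test run short.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Turn one harvested record (a hash reference) into a one-line summary.
# Assumption: the identifier lives under '_id' and, for oai_dc, 'title'
# holds a list of strings. Check a real record before relying on this.
sub summarize_record {
    my ($record) = @_;
    my $id    = $record->{_id} // '(no id)';
    my $title = $record->{title};
    $title = $title->[0] if ref($title) eq 'ARRAY';
    $title //= '(no title)';
    return "$id\t$title";
}

# Harvest only when a base URL is given on the command line.
# The importer is loaded lazily so the helper above can be tried offline.
if (my $url = shift @ARGV) {
    require Catmandu::Importer::OAI;

    my $importer = Catmandu::Importer::OAI->new(
        url            => $url,
        metadataPrefix => 'oai_dc',
    );

    # Stop after ten records; a full harvest can take a long time.
    $importer->take(10)->each(sub {
        my $hashref = $_[0];
        print summarize_record($hashref), "\n";
    });
}
```

Run it as: perl harvest.pl http://pub.uni-bielefeld.de/oai (harvest.pl being whatever you name the file). Without a URL argument nothing is harvested.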
There's also an importer for arxiv.org: Catmandu::ArXiv
(https://metacpan.org/pod/Catmandu::ArXiv).
Everything is also on GitHub: https://github.com/LibreCat
Cheers,
Vitali
On 14.01.2014 21:01, Eka Grguric wrote:
> Hi,
>
> I am a complete newbie to Perl (and to Code4Lib) and am trying to set up a harvester to get complete metadata records from oai-pmh repositories. My current approach is to use things already built as much as possible - specifically the Net::Oai::Harvester (http://search.cpan.org/~esummers/OAI-Harvester-1.0/lib/Net/OAI/Harvester.pm). The code I'm using is located in the synopsis and specific parts of it seem to work with some samples I've tried. For example, if I submit a request for a list of sets to the oai url for arXiv.org (http://arXiv.org/oai2) I get the correct list.
>
> The error I run into reads "can't call listRecords() on an undefined value in *filename* line *#*". listRecords() seems to have been an issue in past iterations but I'm not sure how to get around it.
>
> At the moment it looks like this:
> ## list all the records in a repository
> my $list = $harvester->listRecords(
> metadataPrefix = 'oai_dc'
> );
>
> Any help (or Perl resources) would be appreciated!
>
> Thanks,
>
> Eka
> MLIS Candidate, UBC iSchool
--
Vitali Peil
Fachreferent
PUB <pub.uni-bielefeld.de>
Raum E1-144, Tel. 0521 106 6125
Universitätsbibliothek Bielefeld