On 1/14/14 10:45 PM, Edward Summers wrote:
> Just out of curiosity, does it work for a little bit then stop working? I know arXiv throttles crawlers, and am not sure if they throttle OAI-PMH clients. Simeon Warner, who helps run arXiv, has been known to post to code4lib, so maybe this will cross his radar.
The arXiv OAI endpoint at http://export.arxiv.org/oai2 uses 503
responses [1] to control request frequency. I think most harvester
libraries support this OK.
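For anyone handling this by hand rather than through a harvester library, here is a minimal sketch of the client side of that flow control, in Python. It assumes the server sends a Retry-After header giving a number of seconds, as the guidelines in [1] describe. The function names, the 30-second fallback, and the cap of five attempts are illustrative assumptions, not arXiv policy or any library's API.

```python
import time
import urllib.error
import urllib.request


def parse_retry_after(value, default=30):
    """Return the wait in seconds from a Retry-After header value.

    Only the integer-seconds form is handled; a missing or non-numeric
    value (for example an HTTP date) falls back to `default`, which is
    an assumed value.
    """
    try:
        return max(0, int(value))
    except (TypeError, ValueError):
        return default


def fetch_with_flow_control(url, max_attempts=5,
                            opener=urllib.request.urlopen,
                            sleep=time.sleep):
    """Fetch `url`, waiting and retrying whenever the server answers 503."""
    for _ in range(max_attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code != 503:
                raise  # any other HTTP error is not flow control
            # Wait as long as the server asked before trying again.
            sleep(parse_retry_after(err.headers.get("Retry-After")))
    raise RuntimeError(f"still receiving 503 after {max_attempts} attempts")
```

Calling fetch_with_flow_control("http://export.arxiv.org/oai2?verb=Identify") would return the raw response bytes once the server stops asking the client to wait. The opener and sleep parameters exist only so the retry logic can be exercised without touching the network.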
I'm sad to say that our export.arxiv.org server is a bit overloaded at
the moment (mainly arXiv API and RSS load) and sometimes this affects
the OAI-PMH performance. We are working on improving performance to
handle the ever-increasing load...
Cheers,
Simeon
[1] http://www.openarchives.org/OAI/2.0/guidelines-repository.htm#FlowControl
> In the meantime, could you share your harvesting script on gist.github.com or somewhere similar for us to take a look?
>
> //Ed
>
>
> On Jan 14, 2014, at 4:46 PM, Eka Grguric wrote:
>
>> Thanks for responding!
>>
>> I initialized it as follows (following the code from the synopsis on the site).
>>
>> my $harvester = Net::OAI::Harvester->new(
>> baseURL => 'http://contentpro.lib.bcit.ca/iii/oairep/OAIRepository'
>> );