On 1/14/14 10:45 PM, Edward Summers wrote:
> Just out of curiosity, does it work for a little bit then stop working? I know arXiv throttles crawlers, and am not sure if they throttle OAI-PMH clients. Simeon Warner, who helps run arXiv, has been known to post to code4lib, so maybe this will cross his radar.

The arXiv OAI endpoint at http://export.arxiv.org/oai2 uses 503 
responses [1] to control request frequency. I think most harvester 
libraries support this OK.
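
For a client that does not already handle this, here is a minimal sketch 
of the harvester side of that flow control, written in Python rather than 
the Perl used elsewhere in this thread. It is not the code of 
Net::OAI::Harvester or of any other library. It assumes the 503 response 
carries a Retry-After header giving a delay in seconds, as described in 
[1]. The names fetch_with_flow_control and parse_retry_after, the 
30-second fallback, and the `fetch` callable returning (status, headers, 
body) are all hypothetical choices made for illustration.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def parse_retry_after(headers: Mapping[str, str], default: float = 30.0) -> float:
    """Return the Retry-After delay in seconds, or `default` if absent or unparsable."""
    # Header names are case-insensitive, so normalize before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered.get("retry-after", "")))
    except ValueError:
        # Covers a missing header and the HTTP-date form, which this sketch does not parse.
        return default


def fetch_with_flow_control(
    fetch: Callable[[str], Response],
    url: str,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call `fetch(url)`, waiting and retrying whenever the server answers 503."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = fetch(url)
        if status != 503:
            return status, body
        if attempt < max_attempts:
            # Wait as long as the server asked before trying again.
            sleep(parse_retry_after(headers))
    raise RuntimeError(f"still throttled after {max_attempts} attempts: {url}")
```

Passing `fetch` and `sleep` in as arguments keeps the retry logic 
independent of any particular HTTP library and lets it be exercised 
without touching the network.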

I'm sad to say that our export.arxiv.org server is a bit overloaded at 
the moment (mainly arXiv API and RSS load) and sometimes this affects 
OAI-PMH performance. We are working on improving performance to 
handle the ever-increasing load...

Cheers,
Simeon

[1] http://www.openarchives.org/OAI/2.0/guidelines-repository.htm#FlowControl

> In the meantime, could you share your harvesting script on gist.github.com or somewhere similar for us to take a look?
>
> //Ed
>
>
> On Jan 14, 2014, at 4:46 PM, Eka Grguric <[log in to unmask]> wrote:
>
>> Thanks for responding!
>>
>> I initialized it as follows (following the code from the synopsis on the site).
>>
>> my $harvester = Net::OAI::Harvester->new(
>> 		baseURL => 'http://contentpro.lib.bcit.ca/iii/oairep/OAIRepository'
>> 	 );