Also, if your script can handle a redirect, you can use
our locator to find each item, e.g.

http://www.archive.org/download/librariesreaders00fostuoft/
http://www.archive.org/download/developmentofchi00tancuoft/
http://www.archive.org/download/rulesregulations00brituoft/

as the data does migrate occasionally for maintenance.
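
A rough sketch of how a script might use the locator, in case it helps. It builds the locator URL from an item identifier, following the pattern of the three examples above, and then asks curl to follow the redirect and report where the item currently lives. The function names (locator_url, resolve_item, write_url_list) are made up for illustration, and I am assuming curl is installed and that the servers answer HEAD requests on these paths; treat it as a starting point rather than a tested recipe.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any archive.org tool.

# Print the locator URL for one item identifier
# (URL pattern copied from the examples in this message).
locator_url() {
    printf 'http://www.archive.org/download/%s/\n' "$1"
}

# Follow the locator's redirect(s) with HEAD requests and print the final URL.
# Assumes curl is available; %{url_effective} is the last URL curl fetched.
resolve_item() {
    curl -sIL -o /dev/null -w '%{url_effective}\n' "$(locator_url "$1")"
}

# Print one resolved URL per line for every identifier given as an argument.
write_url_list() {
    for id in "$@"; do
        resolve_item "$id"
    done
}
```

After sourcing the file, running write_url_list librariesreaders00fostuoft developmentofchi00tancuoft rulesregulations00brituoft > libraries.urls right before ./mirror.sh libraries.urls would regenerate the list each time, so a stale storage-node address never gets hard-coded.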





On 5/19/09 10:51 AM, raj kumar wrote:
> On May 19, 2009, at 10:40 AM, Eric Lease Morgan wrote:
> 
>> On May 19, 2009, at 1:24 PM, Eric Lease Morgan wrote:
>>
>>> I applaud the Internet Archive and the Open Content Alliance's
>>> efforts.  archive.org++
>>
>> Try this hack with Google Books, not.
>>
>> $ echo http://ia300206.us.archive.org/3/items/librariesreaders00fostuoft/ > libraries.urls
>>
>> $ echo http://ia310827.us.archive.org/0/items/developmentofchi00tancuoft/ >> libraries.urls
>>
>> $ echo http://ia310832.us.archive.org/2/items/rulesregulations00brituoft/ >> libraries.urls
>>
>> $ echo 'wget -erobots=off --wait 1 -np -m -nd -A _djvu.txt,.pdf,.gif,_marc.xml -R _bw.pdf -i $1' > mirror.sh
>>
>> $ chmod +x mirror.sh
>>
>> $ ./mirror.sh libraries.urls
> 
> Here is a script that will let you download all the books from archive.org:
> 
> http://blog.openlibrary.org/2008/11/24/bulk-access-to-ocr-for-1-million-books/ 
> 
> 
> You'll have to slightly modify it to download the format you want...
> 
> -raj