On May 19, 2009, at 10:40 AM, Eric Lease Morgan wrote:

> On May 19, 2009, at 1:24 PM, Eric Lease Morgan wrote:
>
>> I applaud the Internet Archive and the Open Content Alliance's
>> efforts.  archive.org++
>
> Try this hack with Google Books, not.
>
> $ echo http://ia300206.us.archive.org/3/items/librariesreaders00fostuoft/ > libraries.urls
>
> $ echo http://ia310827.us.archive.org/0/items/developmentofchi00tancuoft/ >> libraries.urls
>
> $ echo http://ia310832.us.archive.org/2/items/rulesregulations00brituoft/ >> libraries.urls
>
> $ echo 'wget -erobots=off --wait 1 -np -m -nd -A _djvu.txt,.pdf,.gif,_marc.xml -R _bw.pdf -i $1' > mirror.sh
>
> $ chmod +x mirror.sh
>
> $ ./mirror.sh libraries.urls

Here is a script that will let you download all the books from  
archive.org:

http://blog.openlibrary.org/2008/11/24/bulk-access-to-ocr-for-1-million-books/

You'll have to slightly modify it to download the format you want...
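
For a short list of items like the one quoted above, picking the format comes down to the wget accept list. Here is a rough sketch of that idea; it is not the script from the blog post, just the quoted wget command wrapped in a function. The names mirror, ACCEPT, REJECT and DRY_RUN are made up for illustration; the wget flags and default suffixes are the ones from the quoted message.

```shell
#!/bin/sh
# Sketch only: wraps the wget command quoted earlier in this thread.
# ACCEPT, REJECT, DRY_RUN and mirror are hypothetical names, not part
# of any archive.org or Open Library tooling.

# Comma-separated filename suffixes to keep and to skip.
ACCEPT="${ACCEPT:-_djvu.txt,.pdf,.gif,_marc.xml}"
REJECT="${REJECT:-_bw.pdf}"

mirror() {
    # $1 is a file with one item URL per line (e.g. libraries.urls).
    set -- wget -erobots=off --wait 1 -np -m -nd \
        -A "$ACCEPT" -R "$REJECT" -i "$1"
    if [ -n "$DRY_RUN" ]; then
        # Show the command without touching the network.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

To fetch only the OCR text, something like this should do, assuming the function has been sourced into the current shell:

  $ ACCEPT=_djvu.txt mirror libraries.urls

Set DRY_RUN=1 first to see the command that would be run.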

-raj