Hi Eric,

We also have a series of scripts that we use with the Internet Archive API:

https://github.com/digitalutsc/internetarchive_scripts

We use them to watch an IA collection, download new items, and process them based on a table of contents.

Kim Pham
Digital Projects & Technologies Librarian | Liaison Librarian, Physical & Environmental Sciences (Physics)
UNIVERSITY OF TORONTO SCARBOROUGH
AC 270 | 1265 Military Trail, Toronto, Ontario, M1C 1A4
https://utsc.library.utoronto.ca/

________________________________________
From: Code for Libraries on behalf of Eric Lease Morgan
Sent: September-18-17 3:37 PM
Subject: [CODE4LIB] internet archive api

Is there an Internet Archive API that will allow me to get the contents of a collection as a stream of data and not as a stream of HTML?

A cool collection of early English print materials is available at the following URL:

https://archive.org/details/bplsceep

Each item is associated with an Internet Archive identifier. If I were able to easily extract these identifiers, then I would be more easily able to provide services based on the collection. But I’m lazy. I don’t want to read the HTML and scrape it accordingly. Ick! I’d rather be given the list of bibliographics in a more computer-friendly way.

Again, can I programmatically read the contents of an Internet Archive collection?

--
Eric Morgan
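
For the question above, here is a minimal sketch of one way to list a collection's identifiers without scraping HTML. It assumes archive.org's public advanced search endpoint (advancedsearch.php) with JSON output, and that the response keeps matching records under response.docs. The function names are hypothetical and are not taken from the scripts linked in the reply.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: archive.org's public advanced search endpoint, which can return JSON.
SEARCH_URL = "https://archive.org/advancedsearch.php"


def build_search_url(collection, rows=100, page=1):
    """Build a query URL that asks only for the identifiers in one collection."""
    params = [
        ("q", f"collection:{collection}"),
        ("fl[]", "identifier"),  # return just the identifier field
        ("rows", rows),
        ("page", page),
        ("output", "json"),
    ]
    return f"{SEARCH_URL}?{urlencode(params)}"


def extract_identifiers(payload):
    """Pull the identifier strings out of a decoded JSON response."""
    docs = payload.get("response", {}).get("docs", [])
    return [doc["identifier"] for doc in docs if "identifier" in doc]


def fetch_identifiers(collection, rows=100):
    """Yield identifiers page by page until a page comes back empty (needs network)."""
    page = 1
    while True:
        with urlopen(build_search_url(collection, rows, page)) as response:
            payload = json.load(response)
        identifiers = extract_identifiers(payload)
        if not identifiers:
            break
        yield from identifiers
        page += 1


# Example use (performs live HTTP requests):
#     for identifier in fetch_identifiers("bplsceep"):
#         print(identifier)
```

The URL building and response parsing are kept separate from the network call so they can be checked offline. Very large collections may run into paging limits on this endpoint, in which case a different access method would be needed. The `internetarchive` Python package offers a `search_items` function that may be a more convenient route to the same list.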