Is there an Internet Archive API that will allow me to get the contents of a collection as a stream of data rather than as a stream of HTML?

A cool collection of early English print materials is available at the following URL:

  https://archive.org/details/bplsceep

Each item is associated with an Internet Archive identifier. If I could easily extract these identifiers, I could more easily provide services based on the collection. But I’m lazy. I don’t want to read the HTML and scrape it accordingly. Ick! I’d rather be given the list of bibliographic records in a more computer-friendly way.

Again, can I programmatically read the contents of an Internet Archive collection?

—
Eric Morgan
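
One possible approach, offered as a rough sketch rather than a definitive answer: the Internet Archive's advanced search endpoint (advancedsearch.php) can return JSON when given output=json, and a query of the form collection:bplsceep restricted to the identifier field should list the items in the collection. The function names below (build_query_url, extract_identifiers, list_identifiers) are made up for illustration, and the page size of 100 is an arbitrary assumption. Very large collections may run into paging limits that this sketch does not handle.

```python
import json
import urllib.parse
import urllib.request

# Advanced search endpoint; output=json asks for JSON rather than HTML.
SEARCH_URL = "https://archive.org/advancedsearch.php"


def build_query_url(collection, rows=100, page=1):
    """Build a search URL that asks only for item identifiers."""
    params = [
        ("q", f"collection:{collection}"),
        ("fl[]", "identifier"),  # return just the identifier field
        ("rows", str(rows)),     # page size (arbitrary choice)
        ("page", str(page)),
        ("output", "json"),
    ]
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def extract_identifiers(payload):
    """Pull identifiers out of a decoded JSON search response."""
    docs = payload.get("response", {}).get("docs", [])
    return [doc["identifier"] for doc in docs if "identifier" in doc]


def list_identifiers(collection, rows=100):
    """Yield identifiers page by page until a page comes back empty."""
    page = 1
    while True:
        url = build_query_url(collection, rows, page)
        with urllib.request.urlopen(url) as resp:
            payload = json.load(resp)
        identifiers = extract_identifiers(payload)
        if not identifiers:
            break
        yield from identifiers
        page += 1


if __name__ == "__main__":
    # Prints one identifier per line for the collection named above.
    for identifier in list_identifiers("bplsceep"):
        print(identifier)
```

Run as a script, this would print one identifier per line, which could then be fed to whatever service is built on the collection. Only the standard library is used, so nothing needs to be installed.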