Hi Brewster. Below (quoted) is the URL you emailed me during the
code4lib conference last February as a way to get XML search responses
from the IA.

I am now getting around to implementing my functionality that will use
this... and it looks like this is no longer available? I guess it's good
it didn't become unavailable only AFTER I implemented my code!  But is
there any alternative you know of?  Is there anyone else at the IA that
would be better for me to contact regarding this matter?

The OpenLibrary API is not an alternative, because I want to search the
entire corpus of Internet Archive held digitized texts, and probably
audio books as well. Alexis tells me that so far OL includes just a
small fraction of all IA texts, and that this won't be changing any time
soon.

The software I am working on is an availability/discovery search where,
starting from a known item, which may or may not be in our catalog, my
software tells the user about various versions and copies available for
free on the internet. The Internet Archive is of course one major source
of this material, so it would be important to include it. But I need
some way to machine-search IA.  Short of trying to harvest everything
through OAI-PMH and building my own index, do you have machine interfaces
I could use?

I suppose I could 'screen scrape' the HTML delivered by the IA HTML
search. Perhaps that's what I'll have to do?
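
For concreteness, here is a minimal sketch of how a client might build
the kind of query shown in the URL quoted below. The parameter names
(query, limit, submit) and the base path are copied from that URL; the
function name build_search_url is hypothetical, and nothing here assumes
the service still responds or what its XML looks like.

```python
from urllib.parse import urlencode

# Base path copied from the quoted URL; the service may no longer respond.
SEARCH_BASE = "http://www.archive.org/services/search.php"


def build_search_url(contributor, mediatype="texts", limit=1000):
    """Build a search URL using the same parameters as the quoted example.

    Hypothetical helper: only the parameter names and query syntax
    visible in the quoted URL are used.
    """
    query = f"contributor:{contributor} AND mediatype:{mediatype}"
    params = {"query": query, "limit": limit, "submit": "submit"}
    # urlencode escapes ':' as %3A and spaces as '+', as in the quoted URL.
    return SEARCH_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("hopkins"))
```

This only constructs the request URL; fetching and parsing the response
would depend on whatever interface turns out to be available.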

Thanks for any advice,
Jonathan

Brewster Kahle wrote:
>
> http://www.archive.org/services/search.php?query=contributor%3Ahopkins+AND+mediatype%3Atexts&limit=1000&submit=submit
>
>
> -brewster

--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu