I still haven't managed to get info from Proquest support, but thanks to 
off-list hints from another coder, I have discovered the Proquest SRU 
endpoint, which I think is the thing they call the "XML gateway".

Here's an example query:

http://fedsearch.proquest.com/search/sru/pqdtft?operation=searchRetrieve&version=1.2&maximumRecords=30&startRecord=1&query=title%3D%22global%20warming%22%20AND%20author%3DCastet

For me, coming from an IP address recognized as 'on campus' for our 
general Proquest access, no additional authentication is required to use 
this API. I'm not sure if we at some point prior had them activate the 
"XML Gateway" for us, likely for a federated search product, or if it's 
just this way for everyone.

The path component after "/sru", namely "pqdtft", is the database code for 
Proquest Dissertations and Theses. I'm not sure where you find a list of 
these database codes in general; if you've made a successful API request 
to that endpoint, there will be a <diagnosticMessage> element near the 
end of the response listing all database codes you have access to (but 
without corresponding full English names, so you kind of have to guess).
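
Here's a rough Python sketch for pulling that <diagnosticMessage> text out 
of a response. I don't know the exact layout of the message, so this just 
returns the raw text and matches the element by local name, ignoring 
whatever namespace it turns out to be in. The SAMPLE document is invented 
for illustration and is NOT a real Proquest response.

```python
import xml.etree.ElementTree as ET


def diagnostic_messages(sru_xml):
    """Return the text of every <diagnosticMessage> element in an SRU response.

    Elements are matched by local name only, since the namespace used
    in the real response is an unknown here.
    """
    root = ET.fromstring(sru_xml)
    return [
        (el.text or "").strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "diagnosticMessage"
    ]


# Invented sample, only to show the call; a real response will differ.
SAMPLE = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <numberOfRecords>0</numberOfRecords>
  <diagnostics>
    <diagnosticMessage>placeholder where the database codes would be listed</diagnosticMessage>
  </diagnostics>
</searchRetrieveResponse>"""

for message in diagnostic_messages(SAMPLE):
    print(message)
```

You'd still have to eyeball the returned text to pick out the codes.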

The value of the 'query' parameter is a valid CQL query, as usual for 
SRU. Unfortunately, there seems to be no SRU "explain" response to tell 
you what fields/operators are available. But guessing often works: 
"title", "author", and "date" are all available. I'm not sure exactly 
how 'date' works; I need to experiment more. The CQL query param above, 
un-escaped, is:

title="global warming" AND author=Castet
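
To show how that CQL string turns into the escaped URL, here's a minimal 
Python sketch using only the standard library. The helper name 
build_sru_url and its defaults are my own; the endpoint, database code 
and parameters are the ones from the example query in this message.

```python
from urllib.parse import quote, urlencode

# Endpoint as observed in the example query; not from any official docs.
BASE = "http://fedsearch.proquest.com/search/sru/"


def build_sru_url(database, cql, start=1, maximum=30):
    """Build an SRU searchRetrieve URL for a database code and a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "maximumRecords": maximum,
        "startRecord": start,
        "query": cql,
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return BASE + database + "?" + urlencode(params, quote_via=quote)


url = build_sru_url("pqdtft", 'title="global warming" AND author=Castet')
print(url)
```

That should reproduce the example URL exactly, so you can swap in other 
guessed field names without escaping by hand.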

Responses seem to be in MARCXML, and that seems to be the only option.

It looks like you can tell if a full text is available (on the Proquest 
platform) for a given item, based on whether there's an 856 field with 
its second indicator set to "0" -- that will be a URL to full text. I 
think. It looks like. Did I mention that if there are docs for any of 
this, I haven't found them?
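
Here's a sketch of that 856 check in Python, under a couple of 
assumptions: that the records use the standard MARCXML namespace, and 
that the URL sits in subfield "u" (which is where MARC 856 normally puts 
it). The SAMPLE_RECORD and its example.org URLs are invented, not real 
Proquest output.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace; assumed to be what the responses use.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def full_text_urls(record):
    """Return subfield u values of 856 fields whose second indicator is "0"."""
    urls = []
    for field in record.findall("marc:datafield[@tag='856']", MARC_NS):
        if field.get("ind2") != "0":
            continue  # per the guess above, only ind2="0" means full text
        for sub in field.findall("marc:subfield[@code='u']", MARC_NS):
            if sub.text:
                urls.append(sub.text.strip())
    return urls


# Invented record, only to exercise the function.
SAMPLE_RECORD = ET.fromstring("""
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2="0">
    <subfield code="u">http://example.org/fulltext/1</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2="2">
    <subfield code="u">http://example.org/abstract/1</subfield>
  </datafield>
</record>
""")

print(full_text_urls(SAMPLE_RECORD))
```

An empty list would mean no full text link, if my guess about the second 
indicator is right.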

So, there you go, a Proquest search API!

Jonathan



On 2/12/14 3:44 PM, Jonathan Rochkind wrote:
> Aha, thinking to google search for "proquest z3950" actually got me some
> additional clues!
>
> "Sites that are currently using Z39.50 to search ProQuest are advised to
> consider moving to the XML gateway."
>
> in Google snippets for:
>
> http://www.proquest.com/assets/downloads/products/techrequirements_np.pdf
>
> Also "If you are using the previous XML
> gateway for access other than with a federated search vendor, please
> contact our support center at
> www.proquest.com/go/migrate and we can get you the new XML gateway
> implementation documentation."
>
> Okay, so now I at least know that something called the "XML Gateway"
> exists, and that's what I want info on or ask about!  (Why are our
> vendors so reluctant to put info on their services online?)
>
> I am not a huge fan of z3950, and am not ordinarily optimistic about
> its ability to actually do what I need, but I'd use it if it was all
> that was available; in this case, it seems like Proquest is recommending
> you do NOT use it, but use this mysterious 'XML gateway'.
>
>
>
> On 2/12/14 3:29 PM, Eric Lease Morgan wrote:
>> On Feb 12, 2014, at 3:22 PM, Jonathan Rochkind <[log in to unmask]> wrote:
>>
>>> I feel like at some point I heard there was a search API for the
>>> Proquest content/database platform.
>>
>>
>> While it may not be the coolest, I’d be willing to bet Proquest
>> supports Z39.50. I used it lately to do some interesting queries
>> against the New York Times Historical Newspapers Database (index). [1]
>> Okay. I know. Z39.50 and their Reverse Polish Notation query language.
>> Yuck. Moreover, the bibliographic data is probably downloadable as
>> MARC records, but hey.
>>
>> [1] Z39.50 hack - http://blogs.nd.edu/emorgan/2013/11/fun/
>>
>> —
>> Eric Lease Morgan
>>
>>
>
>