The document you want to request from ProQuest support was called Federated-Search.docx when they sent it to me. It should address many of your documentation needs.
ProQuest used to offer an Excel spreadsheet listing the product codes for all of its databases, downloadable from http://support.proquest.com/kb/article?ArticleId=3698&source=article&c=12&cid=26, but it no longer appears to be available there. ProQuest support should be able to tell you where it went when you request the federated search document.
You may receive multiple 856 fields for Citation/Abstract, Full Text, and Scanned PDF:
=856 41$3Citation/Abstract$uhttp://search.proquest.com/docview/...
=856 40$3Full Text$uhttp://search.proquest.com/docview/...
=856 40$3Scanned PDF$uhttp://search.proquest.com/docview/...
Rather than relying on the second indicator, I would suggest parsing subfield 3 to find the format you prefer. ProQuest's MARC records for holdings also contain multiple 856 fields, since that is how ProQuest handles coverage gaps in titles, so if you have ever processed ProQuest MARC records before, you should already be prepared for this.
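To illustrate the subfield 3 approach, here is a minimal Python sketch using only the standard library. It assumes the response records are MARCXML with datafield/subfield elements (it matches on local element names, so the namespace does not matter). The function names, the preference order, and the example.org URLs in the sample record are all hypothetical and not taken from any ProQuest documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical preference order; adjust to taste.
PREFERRED = ("Full Text", "Scanned PDF", "Citation/Abstract")


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def links_by_format(record_xml):
    """Map each 856 $3 label to its $u URL, keeping the first URL seen per label."""
    root = ET.fromstring(record_xml)
    links = {}
    for field in root.iter():
        if local(field.tag) != "datafield" or field.get("tag") != "856":
            continue
        subs = {}
        for sf in field:
            if local(sf.tag) == "subfield":
                subs.setdefault(sf.get("code"), (sf.text or "").strip())
        if "3" in subs and "u" in subs:
            links.setdefault(subs["3"], subs["u"])
    return links


def preferred_link(links, order=PREFERRED):
    """Return (label, url) for the first preferred label present, else None."""
    for label in order:
        if label in links:
            return label, links[label]
    return None


# Made-up sample record with placeholder URLs, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2="1">
    <subfield code="3">Citation/Abstract</subfield>
    <subfield code="u">http://example.org/docview/1/abstract</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2="0">
    <subfield code="3">Scanned PDF</subfield>
    <subfield code="u">http://example.org/docview/1/pdf</subfield>
  </datafield>
</record>"""

if __name__ == "__main__":
    print(preferred_link(links_by_format(SAMPLE)))
```

Because the lookup is keyed on the subfield 3 label, it does not matter which second indicator a given 856 field carries, and a record with no matching label simply yields None.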
--
Andrew Anderson, Director of Development, Library and Information Resources Network, Inc.
http://www.lirn.net/ | http://www.twitter.com/LIRNnotes | http://www.facebook.com/LIRNnotes
On Feb 17, 2014, at 10:28, Jonathan Rochkind wrote:
> I still haven't managed to get info from Proquest support, but thanks to off list hints from another coder, I have discovered the Proquest SRU endpoint, which I think is the thing they call the "XML gateway".
>
> Here's an example query:
>
> http://fedsearch.proquest.com/search/sru/pqdtft?operation=searchRetrieve&version=1.2&maximumRecords=30&startRecord=1&query=title%3D%22global%20warming%22%20AND%20author%3DCastet
>
> For me, coming from an IP address recognized as 'on campus' for our general Proquest access, no additional authentication is required to use this API. I'm not sure if we at some point prior had them activate the "XML Gateway" for us, likely for a federated search product, or if it's just this way for everyone.
>
> The path component after "/sru", "pqdtft", is the database code for Proquest Dissertations and Theses. I'm not sure where you find a list of these database codes in general; if you've made a successful API request to that endpoint, there will be a <diagnosticMessage> element near the end of the response listing all database codes you have access to (but without corresponding full English names, so you kind of have to guess).
>
> The value of the 'query' parameter is a valid CQL query, as usual for SRU. Unfortunately, there seems to be no SRU "explain" response to tell you what fields/operators are available. But guessing often works: "title", "author", and "date" are all available. I'm not sure exactly how 'date' works; I need to experiment more. The CQL query param above, un-escaped, is:
>
> title="global warming" AND author=Castet
>
> Responses seem to be in MARCXML, and that seems to be the only option.
>
> It looks like you can tell if a full text is available (on Proquest platform) for a given item, based on whether there's an 856 field with second indicator set to "0" -- that will be a URL to full text. I think. It looks like. Did I mention if there are docs for any of this, I haven't found them?
>
> So, there you go, a Proquest search API!
>
> Jonathan
>
>
>
> On 2/12/14 3:44 PM, Jonathan Rochkind wrote:
>> Aha, thinking to google search for "proquest z3950" actually got me some
>> additional clues!
>>
>> "Sites that are currently using Z39.50 to search ProQuest are advised to
>> consider moving to the XML gateway."
>>
>> in Google snippets for:
>>
>> http://www.proquest.com/assets/downloads/products/techrequirements_np.pdf
>>
>> Also "If you are using the previous XML
>> gateway for access other than with a federated search vendor, please
>> contact our support center at
>> www.proquest.com/go/migrate and we can get you the new XML gateway
>> implementation documentation."
>>
>> Okay, so now I at least know that something called the "XML Gateway"
>> exists, and that's what I want info on or ask about! (Why are our
>> vendors so reluctant to put info on their services online?)
>>
>> I am not a huge fan of z3950, and am not ordinarily optimistic about
>> its ability to actually do what I need, but I'd use it if it was all
>> that was available; in this case, it seems like Proquest is recommending
>> you do NOT use it, but use this mysterious 'XML gateway'.
>>
>>
>>
>> On 2/12/14 3:29 PM, Eric Lease Morgan wrote:
>>> On Feb 12, 2014, at 3:22 PM, Jonathan Rochkind wrote:
>>>
>>>> I feel like at some point I heard there was a search API for the
>>>> Proquest content/database platform.
>>>
>>>
>>> While it may not be the coolest, I’d be willing to bet Proquest
>>> supports Z39.50. I used it lately to do some interesting queries
>>> against the New York Times Historical Newspapers Database (index). [1]
>>> Okay. I know. Z39.50 and their Reverse Polish Notation query language.
>>> Yuck. Moreover, the bibliographic data is probably downloadable as
>>> MARC records, but hey.
>>>
>>> [1] Z39.50 hack - http://blogs.nd.edu/emorgan/2013/11/fun/
>>>
>>> —
>>> Eric Lease Morgan
>>>
>>>
>>
>>