Arash,

Yes, we have made WorldCat available to researchers under a special license agreement. I suggest contacting Thom Hickey <[log in to unmask]> about such an arrangement.

Thanks,
Roy

On Fri, May 18, 2012 at 3:46 AM, Arash.Joorabchi <[log in to unmask]> wrote:
> Dear Karen,
>
> I am conducting a research experiment on automatic text classification and I am trying to retrieve the top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment is to process the full dump of the WorldCat database directly rather than sending a limited number of queries via the API.
>
> I read here:
> http://dltj.org/article/worldcat-lld-may-become-available-under-odc-by/
> that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement.
>
> Regards,
> Arash
>
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Karen Coombs
> Sent: 17 May 2012 08:37
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
>
> I forwarded this thread to the Product Manager for the WorldCat Search
> API. She responded back that unfortunately this query is not possible
> using the API at this time.
>
> FYI, the SRU interface to WorldCat Search API doesn't currently
> support any scan type searches either.
>
> Is there a particular use case you're trying to support? Knowing that
> would help us document this as a possible enhancement.
>
> Karen
>
> Karen Coombs
> Senior Product Analyst
> Web Services
> OCLC
> [log in to unmask]
>
> On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi <[log in to unmask]> wrote:
>> Hi Andy,
>>
>> I am an SRU newbie myself, so I don't know how this could be achieved
>> using scan operations and could not find much info on the SRU website
>> (http://www.loc.gov/standards/sru/).
>>
>> As for the wildcards, according to this guide:
>> http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf
>> the symbols should be preceded by at least 3 characters, and therefore
>> clauses like:
>>
>> ... AND srw.dd=*
>> ... AND srw.dd=?.*
>> ... AND srw/dd=###.*
>> ... AND srw/dd=?3.*
>>
>> do not work and result in the following error:
>>
>> Diagnostics
>> Identifier: info:srw/diagnostic/1/9
>> Meaning:
>> Details:
>> Message: Not enough chars in truncated term:Truncated words too short(9)
>>
>> Thanks,
>> Arash
>>
>> ________________________________
>>
>> From: Houghton,Andrew [mailto:[log in to unmask]]
>> Sent: 16 May 2012 11:58
>> To: Arash.Joorabchi
>> Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records
>> without a DDC no from the result set
>>
>> I'm not an SRU guru, but is it possible to do a scan and look for
>> postings of zero?
>>
>> Andy.
>>
>> On May 16, 2012, at 6:39, "Arash.Joorabchi" <[log in to unmask]>
>> wrote:
>>
>> Hi mark,
>>
>> Srw.dd=* does not work either:
>>
>> Identifier: info:srw/diagnostic/1/27
>> Meaning:
>> Details: srw.dd
>> Message: The index [srw.dd] did not include a searchable value
>>
>> I suppose the only option left is to retrieve everything and
>> filter the results on the client side.
>>
>> Thanks for your quick reply.
>> Arash
>>
>>
>> -----Original Message-----
>> From: Code for Libraries [mailto:[log in to unmask]] On
>> Behalf Of Mike Taylor
>> Sent: 16 May 2012 10:43
>> To: [log in to unmask]
>> Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of
>> records without a DDC no from the result set
>>
>> There is no standard way in CQL to express "field X is not empty".
>> Depending on implementations, NOT srw.dd="" might work (but evidently
>> doesn't in this case). Another possibility is srw.dd=*, but again
>> that may or may not work, and might be appallingly inefficient if it
>> does. NOT srw.dd=null will definitely not work: "null" is not a
>> special word in CQL.
>>
>> -- Mike.
>>
>>
>> On 16 May 2012 10:32, Arash.Joorabchi <[log in to unmask]> wrote:
>> > Hi all,
>> >
>> > I am sending SRU queries to WorldCat in the following form:
>> >
>> > String host = "http://worldcat.org/webservices/catalog/search/";
>> > String query = "sru?query=srw.kw=\"" + keyword + "\""
>> >         + " AND srw.ln exact \"eng\""
>> >         + " AND srw.mt all \"bks\""
>> >         + " AND srw.nt=\"" + keyword + "\""
>> >         + "&servicelevel=full"
>> >         + "&maximumRecords=100"
>> >         + "&sortKeys=relevance,,0"
>> >         + "&wskey=[wskey]";
>> >
>> > And it is working fine; however, I'd like to limit the results to those
>> > records that have a DDC number assigned to them, but I don't know what's
>> > the right way to specify this limit in the query.
>> >
>> > NOT srw.dd=""
>> > NOT srw.dd=null
>> >
>> > Neither of the above works.
>> >
>> > Thanks,
>> > Arash
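
For anyone landing on this thread later: since the API could not express "has a DDC number" at the time, the fallback mentioned above (retrieve everything and filter on the client side) might look roughly like the sketch below. It is only an illustration under some assumptions: that the response carries MARCXML records in the standard MARC 21 slim namespace, and that a record counts as having a DDC number when it has an 082 field with a non-empty subfield a. The class and method names (DdcFilter, buildUrl, firstDdc, ddcNumbers) are made up for this example. It also URL-encodes the CQL query, which the snippet quoted in the thread does not do.

```java
import java.io.ByteArrayInputStream;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class DdcFilter {
    // Assumption: records come back as MARCXML in the MARC 21 slim namespace.
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    /** Builds the request URL used in the thread, with the CQL query URL-encoded. */
    static String buildUrl(String keyword, String wskey) {
        String cql = "srw.kw=\"" + keyword + "\""
                + " AND srw.ln exact \"eng\""
                + " AND srw.mt all \"bks\""
                + " AND srw.nt=\"" + keyword + "\"";
        try {
            return "http://worldcat.org/webservices/catalog/search/sru"
                    + "?query=" + URLEncoder.encode(cql, "UTF-8")
                    + "&servicelevel=full"
                    + "&maximumRecords=100"
                    + "&sortKeys=relevance,,0"
                    + "&wskey=" + URLEncoder.encode(wskey, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Returns the first non-empty 082 $a value of a record, or null if there is none. */
    static String firstDdc(Element record) {
        NodeList fields = record.getElementsByTagNameNS(MARC_NS, "datafield");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if (!"082".equals(field.getAttribute("tag"))) {
                continue;
            }
            NodeList subfields = field.getElementsByTagNameNS(MARC_NS, "subfield");
            for (int j = 0; j < subfields.getLength(); j++) {
                Element subfield = (Element) subfields.item(j);
                String value = subfield.getTextContent().trim();
                if ("a".equals(subfield.getAttribute("code")) && !value.isEmpty()) {
                    return value;
                }
            }
        }
        return null;
    }

    /** Parses a response and returns one DDC number per record that has one. */
    static List<String> ddcNumbers(String responseXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(responseXml.getBytes(StandardCharsets.UTF_8)));
            List<String> result = new ArrayList<>();
            NodeList records = doc.getElementsByTagNameNS(MARC_NS, "record");
            for (int i = 0; i < records.getLength(); i++) {
                String ddc = firstDdc((Element) records.item(i));
                if (ddc != null) {
                    result.add(ddc); // records without an 082 $a are skipped
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse response", e);
        }
    }

    public static void main(String[] args) {
        // "[wskey]" is the same placeholder used in the thread, not a real key.
        System.out.println(buildUrl("text mining", "[wskey]"));
    }
}
```

Because records without a DDC number are discarded after retrieval, a page of 100 results may yield far fewer usable records, so several pages may need to be fetched per keyphrase. For large experiments the full dump mentioned at the top of the thread is likely the better route.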