On May 13, 2004, at 10:23 AM, Walter Lewis wrote:

> For how many useful targets would it be possible to define a consistent
> intermediate layer structure that would:
>
> - handle a SRU/SRW search
> - transform it into an "native" database search
> - transform the results into an SRU/SRW friendly result set
>
> and still return them in a reasonable time?
>
> I'm not (necessarily) suggesting a centralized service that would do
> this (a la OCLC) but rather a set of protocols that I could drop into a
> locally managed site for targets that we choose to address in this
> fashion. Can the problem be abstracted sufficiently? Can we build in
> alerts to trigger actions when the structure of a given result doesn't
> match the pattern we've been expecting (i.e. site change alert)?

I am not able to answer the question of how many, but the algorithm Walter outlines is exactly what SRW/U are designed to address.

As many of you may or may not know, I've been playing with SRU lately, and I've written the following text briefly describing it:


SRW and SRU in Five Hundred Words or Less

Introduction

Search and Retrieve Web Service (SRW) and Search and Retrieve URL Service (SRU) are Web Services-based protocols for querying databases and returning search results. SRW and SRU requests and results are very similar. The difference between them lies in the ways the queries and results are encapsulated and transmitted between client and server applications. The canonical URL for SRW and SRU is:

http://www.loc.gov/z3950/agency/zing/srw/

Basic "operations"

Both protocols define three and only three basic "operations": explain, scan, and searchRetrieve:

* explain - Explain operations are requests sent by clients as a way of learning about the server's database. At a minimum, responses to explain operations return the location of the database, a description of what the database contains, and what features of the protocol the server supports.
* scan - Scan operations are processes for enumerating the terms found in the remote database's index. Clients send scan requests and servers return lists of terms. The process is akin to browsing a back-of-the-book index, where a person looks up a term and "scans" the entries surrounding it.

* searchRetrieve - SearchRetrieve operations are the heart of the matter. They provide the means to query the remote database and return search results. Queries must be articulated using the Common Query Language (CQL). CQL queries range from simple freetext searches to complex Boolean operations with nested queries and proximity qualifications. Servers do not have to implement every aspect of CQL, but they have to know how to return diagnostic messages when something is requested but not supported. The results of searchRetrieve operations can be returned in any number of formats, as specified via explain operations. Examples might include structured but plain text streams or data marked up in XML vocabularies such as Dublin Core, RDF, MARCXML, etc.

Differences in operation

The differences between SRW and SRU lie in the way operations are encapsulated and transmitted between client and server as well as how results are returned.

SRW is essentially a SOAP-ful Web service. Operations are encapsulated by clients as SOAP requests and sent to the server. Likewise, responses by servers are encapsulated using SOAP and returned to clients. Since SOAP is used in SRW, HTTP is not a required transport protocol.

On the other hand, SRU is essentially a REST-ful Web service. Operations are encoded as name/value pairs in the query string of a URL. As such, operations sent by SRU clients can only be transmitted via HTTP GET requests. The results of SRU requests are XML streams, the same streams returned via SRW requests sans the SOAP envelope.
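To make the "name/value pairs in the query string" idea concrete, here is a minimal Python sketch that builds one URL for each of the three operations. It is only an illustration under stated assumptions: the endpoint http://example.org/sru is hypothetical, the helper sru_url is made up for this example, and the parameter names (operation, version, scanClause, query, maximumRecords) and the dc.title index are taken from my reading of the SRU 1.1 specification, not from the text above. Nothing is sent over the network; the code only constructs the URLs a client would GET.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; substitute the base URL of a real SRU server.
BASE_URL = "http://example.org/sru"


def sru_url(operation, base_url=BASE_URL, version="1.1", **params):
    """Encode one SRU operation as name/value pairs in a URL query string.

    The parameter names are assumed from the SRU 1.1 specification.
    """
    pairs = {"operation": operation, "version": version}
    pairs.update(params)
    return base_url + "?" + urlencode(pairs)


# explain: ask the server to describe its database and capabilities
explain_url = sru_url("explain")

# scan: browse index terms near a given term (the clause is illustrative)
scan_url = sru_url("scan", scanClause='dc.title = "computers"')

# searchRetrieve: send a CQL query and cap the number of records returned
search_url = sru_url(
    "searchRetrieve",
    query='dc.title = "computers in libraries"',
    maximumRecords=10,
)

for url in (explain_url, scan_url, search_url):
    print(url)

# Decoding the query string recovers the original CQL query unchanged.
decoded = parse_qs(urlsplit(search_url).query)
print(decoded["query"][0])
```

An SRW client would carry the same operation and parameters, but wrapped in a SOAP envelope rather than spelled out in the URL.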
Summary

SRW and SRU are "brother and sister" standardized protocols for accomplishing the task of querying databases and returning search results. If index providers were to expose their services via SRW and/or SRU, then access to these services would become far more widespread.


I have also taken a stab at creating an SRU interface to a union list of serials. It sports some fun features such as a Did You Mean? function a la Google, as well as a suggestion function offering alternative searches to try if you get too many hits. The underlying indexer is swish-e. The interface to the index is written in Perl. Try searching for 'computers in librariez' without the quotes:

http://dewey.library.nd.edu/morgan/sru/search.cgi

P.S. You will need a very modern browser to get human-readable output from the interface since the raw XML sent to the user agent is expected to be transformed with XSLT for display.

-- 
Eric Lease Morgan
University Libraries of Notre Dame