Eric Lease Morgan wrote:

> [snip] Screen scraping is for the birds!! If we, libraries, are so
> much about standards, then why do we tolerate interfaces to indexes
> that force us to do screen scraping? As soon as they change their
> interface you have all sorts of new work to do. "Just give me the
> data."

I would be the last one to disagree with the intent of Eric's statement. Consistent, standards-driven interfaces make content more accessible for all. Part of the challenge of the "dark web" is that it is full of unique sets of data, both large and small, that have significance to specific users in specific contexts, each with a relatively custom interface to suit it (in the best of worlds, built with a standard set of HTML widgets).

In courses on cataloguing many, many years ago, I was drawn to the insight that publishers don't give a damn about librarians and consistent title pages (and versos). Designers, especially, don't give a damn. I'm not sure why I'd ever hoped that the web would be better.

We do screen scraping not from choice, but from necessity. In my case, I do as little as possible, framing the target so that users can carry on in the native environment. All I really need are:

a) the success/failure indicators (a result count when possible)
b) a quick tweak to supply a base href
c) the ability to pass on permissions (session ids, cookies, etc.)

In short, if you can't get just the data (Eric's point, and the moral high ground), touch the wrapper as little as possible and move on. Every screen-scraping exercise has to find its own stopping point in manipulating the content of the result set/HTML page. The less you touch, the further it scales.

Perhaps that's the essential difference between a "federated" search and one where you attempt to "unify" the result set. The approach I have taken does not attempt to dedupe, re-sort, or merge the disparate results in any useful way.

Walter Lewis
Halton Hills
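
P.S. For anyone who wants a concrete picture, the three wrapper tweaks above can be sketched in a few lines of Python. This is my own illustration, not production code: the result-count regex, the example URL, and the function names are hypothetical stand-ins for whatever a given target index actually emits.

```python
import re
import urllib.request

def tweak_wrapper(html, base_url):
    """Minimally touch a scraped result page:
    (a) pull out a success/failure indicator (a result count, if present),
    (b) inject a <base href> so the page's relative links keep working
        when framed in our environment.
    """
    # (a) look for something like "42 results"; every target differs,
    # so this pattern is a hypothetical placeholder
    m = re.search(r"(\d+)\s+results?", html, re.IGNORECASE)
    count = int(m.group(1)) if m else None

    # (b) quick tweak: supply a base href right after <head>
    base_tag = '<base href="%s">' % base_url
    html = re.sub(r"(<head[^>]*>)", r"\1" + base_tag, html,
                  count=1, flags=re.IGNORECASE)
    return count, html

def fetch_framed(url, cookie=None):
    """Fetch the target page, (c) passing on permissions
    (here, a session cookie) untouched."""
    req = urllib.request.Request(url)
    if cookie:
        req.add_header("Cookie", cookie)
    with urllib.request.urlopen(req) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return tweak_wrapper(html, url)
```

Everything else in the page is left alone, which is the point: the less you touch, the less breaks when the vendor redecorates.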