> When I tackled Ebsco, I ran into issues of site authentication via
> cookies that were passed to the search gateway but not on to the client
> browser. Peter Binkley, at the University of Alberta, recommended a proxy
> configuration to work around this issue.  Essentially those connections
> would have to continue to operate inside a search gateway proxied session.

yeah...haven't really tackled passing the searches thru...that's why i
just tried to give the stable urls from the result lists if i could get
them....something to consider tho
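
for what it's worth, a gateway-side session with plain LWP would look
roughly like the sketch below (untested, and the urls are made up); the
cookie jar lives on the server, so result links would have to be
rewritten to come back through the script rather than going straight to
the browser:

    #!/usr/bin/perl
    use strict;
    use LWP::UserAgent;
    use HTTP::Cookies;

    # the vendor's session cookies stay in a jar on the gateway box,
    # since they never make it out to the patron's browser
    my $ua = LWP::UserAgent->new;
    $ua->cookie_jar(HTTP::Cookies->new(
        file     => '/tmp/gateway_session.cookies',
        autosave => 1,
    ));

    # first request establishes the authenticated session (made-up url)
    $ua->get('http://search.example.com/login?profile=web');

    # later requests reuse the stored cookies, so result-list links
    # would have to point back through this script, not at the vendor
    my $res = $ua->get('http://search.example.com/results?query=xml');
    print $res->content if $res->is_success;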

> I don't know how the perl tools stack up in terms of parallel search
> streams.  The php/curl combination is purely serial and the last targets
> will time out if there is a tardy responder in the middle of the serial
> queue.

yeah....another reason for some of my decisions, esp. the iframe stuff;
not terribly pleased with it, but avoided the concurrency issue...

there is an extension to perl's LWP that allows parallel searching:

    http://www2.inf.ethz.ch/~langhein/ParallelUA/

haven't looked very closely, but would probably be another solution....
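
the synopsis for that module (LWP::Parallel::UserAgent) boils down to
something like this; untested, and the target urls are placeholders:

    #!/usr/bin/perl
    use strict;
    use LWP::Parallel::UserAgent;
    use HTTP::Request;

    my @urls = (
        'http://search.example.com/one',
        'http://search.example.com/two',
    );

    my $pua = LWP::Parallel::UserAgent->new;
    $pua->in_order(0);      # handle responses as they arrive
    $pua->duplicates(0);    # ignore duplicate requests
    $pua->timeout(10);      # per-connection timeout, in seconds
    $pua->redirect(1);      # follow redirects

    # register() queues each request; it only returns a response
    # object if the registration itself fails
    foreach my $url (@urls) {
        if (my $err = $pua->register(HTTP::Request->new(GET => $url))) {
            print STDERR $err->error_as_HTML;
        }
    }

    # wait() blocks until everything answers or the overall timeout
    # hits; the targets are fetched in parallel, so one tardy
    # responder doesn't hold up the rest of the queue
    my $entries = $pua->wait(30);
    foreach my $key (keys %$entries) {
        my $res = $entries->{$key}->response;
        print $res->request->url, " => ", $res->code, "\n";
    }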

my main problem right now is the parsing....ugh, ugh...ugh....
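
something like HTML::TreeBuilder might take some of the pain out of the
scraping. a rough sketch (untested; the url and the href pattern are
just placeholders, and every vendor's markup is different):

    #!/usr/bin/perl
    use strict;
    use LWP::Simple qw(get);
    use HTML::TreeBuilder;

    # grab one result page (placeholder url)
    my $html = get('http://search.example.com/results?query=metadata')
        or die "no result page\n";

    my $tree = HTML::TreeBuilder->new_from_content($html);

    # pull out anchors whose href looks like a persistent link;
    # the pattern is a guess and changes from vendor to vendor
    my @links = $tree->look_down(
        _tag => 'a',
        sub { ($_[0]->attr('href') || '') =~ /stable|persistent|plink/i }
    );

    print $_->attr('href'), "\n" for @links;
    $tree->delete;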

> Art Rhyno, at the University of Windsor, suggested a parallel
> approach might be possible in a Cocoon environment.  This has the
> advantage of passing all the inbound HTML pages through JTidy and giving
> you the XHTML/XML compliant input stream you wanted (in most cases, even
> when the output from the target was some distance from compliance).

another possible solution....art's done some pretty cool things with
cocoon, but i haven't tried that kool-aid yet ;)