LISTSERV 16.5 - CODE4LIB Archives

On Mon, 20 Dec 2004 12:42:42 -0500, Eric Lease Morgan wrote:
> On Dec 17, 2004, at 12:50 PM, Clay Redding wrote:
>
> > What you describe is very close to what I've done with my Postgres
> > solution to search some EAD docs using a Perl/CGI.  The XML starts on
> > the filesystem.  I then index it with swish-e and insert the XML blob
> > into Postgres since swish-e isn't entirely XML aware.  In case I need
> > extra ability to deliver XML text fragments to enrichen the output of
> > my HTML in the CGI, I use the Postgres/Throwingbeans XPath functionality
> > with a simple select SQL.  The database really does very little in my
> > app (it's only one table, actually) -- it's swish-e that drives it, and
> > it's really fast.
>
> This is interesting, very.
>
> Yes, I intend to index entire works with swish-e. Searches against
> swish-e indexes return pointers to entire documents or keys to
> databases. Consequently, unless I index bunches o' paragraphs as
> individual documents, it will be difficult to use swish-e as my indexer
> as well as return paragraphs/lines from my texts. The idea of using
> XPATH queries to extract particular paragraphs from texts is
> intriguing. 'Food for thought. Thank you.

One could think of the XPath expressions pointing to retrievable
chunks of XML as analogous to database keys.  That's how I was viewing
them in my hypothetical (Lucene && (eXist || ThrowingBeans)) solution.
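
Here is a rough sketch of that idea in Python. It is not the setup
either of you described: the standard library's ElementTree stands in
for eXist or ThrowingBeans, a plain dict stands in for the XML store,
and the document id, sample text, and fetch_chunk() helper are all made
up for illustration. The point is only that an index hit can carry a
(document id, XPath) pair and the pair can be resolved like a key.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the XML store (eXist, or a table of XML blobs).
DOCS = {
    "walden": (
        "<text><chapter n='1'>"
        "<p>Economy begins here.</p>"
        "<p>I went to the woods.</p>"
        "</chapter></text>"
    ),
}


def fetch_chunk(key):
    """Resolve a (doc_id, xpath) key to the text of the chunk it points to."""
    doc_id, xpath = key
    root = ET.fromstring(DOCS[doc_id])
    # ElementTree supports only a subset of XPath; paths are relative to the root.
    node = root.find(xpath)
    return None if node is None else "".join(node.itertext())


# What a full-text index hit might carry instead of a row id.
hit = ("walden", "./chapter[@n='1']/p[2]")
chunk = fetch_chunk(hit)
```

The indexer would have to emit one such pair per paragraph at index
time, which is the same "paragraphs as individual documents" cost Eric
mentions, just paid in keys rather than in separate files.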

Chuck