What you describe is very close to what I've done with my Postgres
solution for searching some EAD docs via a Perl CGI script. The XML
starts on the filesystem. I then index it with swish-e and insert the
XML blob into Postgres, since swish-e isn't entirely XML-aware. When I
need to deliver XML text fragments to enrich the HTML output of the
CGI, I use the Postgres/Throwingbeans XPath functionality with a
simple SQL SELECT. The database really does very little in my app
(it's only one table, actually) -- it's swish-e that drives it, and
it's really fast.
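For the curious, that "simple SQL SELECT" might look something like the
sketch below. This is an assumption on my part: it uses xpath_string()
from the pgxml/xml2 contrib work descended from the Throwingbeans code,
and the table name (ead_docs), column names (id, xml_blob), and key
value are all hypothetical.

```sql
-- Hypothetical table: one row per EAD finding aid, keyed by id,
-- with the whole XML document stored as a text blob.
-- xpath_string() pulls a single text fragment out of the stored XML.
SELECT xpath_string(xml_blob,
                    '/ead/eadheader/filedesc/titlestmt/titleproper')
FROM ead_docs
WHERE id = 42;
```

The nice part of this pattern is that the database never has to model
the EAD structure; the XML stays opaque until you ask for a fragment.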
Eric Lease Morgan wrote:
> Yep, this is my thought as well. My plan will be to write easily
> indexable content to the file system and/or create a dump from my
> database, and then feed this output to an indexer. My current favorite
> is swish-e. Swish-e will provide all the searching functionality I
> could desire. Results returned from swish-e will be either URLs
> pointing to documents on the file system or keys into the
> database. Documents on the filesystem would most likely be highly
> structured HTML documents. If keys are returned, then I would use the
> key to extract the necessary information from the database, build some
> sort of document on the fly, and return the results to the browser.
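The dispatch logic Eric describes -- serve a static document when
swish-e returns a URL, otherwise treat the hit as a database key and
build a page on the fly -- can be sketched roughly as below. This is a
hypothetical illustration, not his code: the key prefix, the hit
format, and the fetch function are all assumptions.

```python
def dispatch(hit):
    """Decide how to serve one swish-e search result.

    A hit that looks like a URL points at a pre-built document on the
    filesystem and is served as-is; anything else is assumed to be a
    database key, from which a document is built on the fly.
    """
    if hit.startswith(("http://", "https://", "file://")):
        return ("serve_static", hit)
    return ("build_from_db", hit)

# A hit list mixing a filesystem URL with a database key:
hits = ["http://example.org/ead/finding-aid-001.html", "1542"]
plans = [dispatch(h) for h in hits]
```

For the "build_from_db" case, the key would feed the kind of SELECT
shown earlier, and the CGI would wrap the returned fragments in HTML
before sending them to the browser.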
> I like this approach better than querying the database directly,
> because direct queries force me to create all sorts of ugly SQL
> statements defining specific fields to query. Ick, again.
> Eric Morgan