LISTSERV 16.5 - CODE4LIB Archives

On Dec 15, 2004, at 6:11 PM, Chuck Bearden wrote:

> In my mind, searching large swaths of full-text is rather different
> than searching structured metadata, especially on controlled
> vocabulary index points.  One idea that comes to mind is using either
> the Throwing Beans/PostgreSQL or eXist solutions for storage and
> XPath-based access, and creating full-text indices with Lucene[3]....

Yep, this is my thought as well. My plan will be to write easily
indexable content to the file system and/or create a dump from my
database, and then feed this output to an indexer. My current favorite
is swish-e. Swish-e will provide all the searching functionality I
could desire. Results returned from swish-e will be either URLs
pointing to documents on the file system or keys in the
database. Documents on the file system would most likely be highly
structured HTML documents. If keys are returned, then I would use the
key to extract the necessary information from the database, build some
sort of document on the fly, and return the results to the browser.
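
A rough sketch of that last step, in Python and under several assumptions:
the indexer's hits arrive as plain strings, a hit that looks like a URL or
path points at a file, and anything numeric is a database key. The table
name (records), its columns, and the function names are all hypothetical,
and an in-memory SQLite database stands in for the real one. Nothing here
calls swish-e itself.

```python
import sqlite3
from html import escape


def make_demo_db():
    """Create a throwaway in-memory database with one hypothetical record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.execute(
        "INSERT INTO records (id, title, body) VALUES (?, ?, ?)",
        (1, "Sample record", "Body text pulled from the database."),
    )
    conn.commit()
    return conn


def build_document(conn, key):
    """Fetch one record by key and render it as a small HTML page.

    Returns None when no record has that key.
    """
    row = conn.execute(
        "SELECT title, body FROM records WHERE id = ?", (key,)
    ).fetchone()
    if row is None:
        return None
    title, body = row
    return (
        f"<html><head><title>{escape(title)}</title></head>"
        f"<body><h1>{escape(title)}</h1><p>{escape(body)}</p></body></html>"
    )


def resolve_hit(conn, hit):
    """Turn one search hit into something the browser can use.

    URL-like hits are passed through as links; numeric hits are treated
    as database keys and expanded into a page built on the fly.
    """
    if hit.startswith(("http://", "https://", "/")):
        return ("link", hit)
    if hit.isdigit():
        return ("page", build_document(conn, int(hit)))
    raise ValueError(f"unrecognized hit: {hit!r}")


if __name__ == "__main__":
    conn = make_demo_db()
    for hit in ["http://example.org/docs/a.html", "1"]:
        print(resolve_hit(conn, hit))
```

The only SQL left is a single lookup by key; all of the field-level
searching stays inside the index.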

I prefer this sort of approach to querying the database directly,
because direct queries force me to create all sorts of ugly SQL
statements defining specific fields to query. Ick, again.

--
Eric Morgan