> I don't care a whole lot whether I use this indexer, that indexer, or
> the other indexer as long as I can make sure I have an SRU, OpenURL,
> Z39.50, etc. interface to the index. This will always allow me to
> swap out an older indexer for a new one as they become available.

I am so behind on e-mail that I might be treading on well-worn ground
here, but I would add to Eric's list that I don't care which indexer it
is as long as:

* the indexer has an open and configurable relevancy weighting algorithm
* the indexer allows control of how the data is normalized
* the indexer uses pluggable parsers
* the indexer supports very fast retrieval

then, on the preferred side:

* the indexer allows the index process to effectively leverage commodity
hardware
* the indexer creates an index that can be combined with others
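
To make the last point concrete, here is a toy sketch of what "an index
that can be combined with others" means in practice. It is plain Python
and not Lucene's API; the function names, the document ids, and the
assumption that ids are unique across sources are all hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to the sorted list of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def merge_indexes(*indexes):
    """Combine separately built indexes into one.

    Assumes doc ids are unique across all sources (hypothetical scheme:
    a source prefix such as "local:" or "vendor:").
    """
    merged = defaultdict(set)
    for index in indexes:
        for term, ids in index.items():
            merged[term].update(ids)
    return {term: sorted(ids) for term, ids in merged.items()}


# One index built locally, one (hypothetically) supplied for licensed content.
local = build_index({"local:1": "library catalog records"})
licensed = build_index({"vendor:7": "licensed journal records"})
combined = merge_indexes(local, licensed)
print(combined["records"])
```

The point of the sketch is only that each party can build its index
independently and the results can still be searched as one.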

It is on this last point that I think Lucene is so compelling, though it
fares well on all of these. One of the most common comments in surveys of
our user community is "don't show me what you can't deliver NOW". A
world-class indexer opens the door for scoping at the collection level.
There doesn't have to be one solution for IR, and it would be a very
unhealthy ecosystem without variance, but I suspect it would be easier to
convince a company like Elsevier that I want a Lucene index for licensed
content than almost any other technology offering. So a definite "yes" to
SRU, OpenURL, Z39.50, and the rest, but I wonder if sustaining a Lucene
index is a good idea regardless of what the main building blocks for a
library's preferred IR layer turn out to be. Library standards don't tend
to delve into the architecture of indexing anyway, but this is really
where a lot of what can be delivered gets defined.
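
As a rough illustration of the "standard interface in front, swappable
indexer behind" idea, here is a minimal Python sketch. The class names
are hypothetical, and SearchService only stands in for whatever layer
would actually speak SRU or Z39.50; no real protocol handling is shown.

```python
from typing import Protocol


class Indexer(Protocol):
    """Anything with a search() method can sit behind the service layer."""

    def search(self, query: str) -> list[str]: ...


class ScanIndexer:
    """Older backend (hypothetical): linear scan over the documents."""

    def __init__(self, docs: dict[str, str]) -> None:
        self.docs = docs

    def search(self, query: str) -> list[str]:
        term = query.lower()
        return sorted(d for d, t in self.docs.items() if term in t.lower().split())


class InvertedIndexer:
    """Newer backend (hypothetical): term -> doc ids built up front."""

    def __init__(self, docs: dict[str, str]) -> None:
        self.postings: dict[str, set[str]] = {}
        for doc_id, text in docs.items():
            for term in text.lower().split():
                self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query: str) -> list[str]:
        return sorted(self.postings.get(query.lower(), set()))


class SearchService:
    """Stand-in for the protocol-facing layer; it never looks inside the indexer."""

    def __init__(self, indexer: Indexer) -> None:
        self.indexer = indexer

    def handle(self, query: str) -> dict:
        return {"query": query, "hits": self.indexer.search(query)}


docs = {"a": "Open source indexing", "b": "Library indexing standards"}
old_service = SearchService(ScanIndexer(docs))
new_service = SearchService(InvertedIndexer(docs))
print(old_service.handle("indexing") == new_service.handle("indexing"))
```

Clients of the service see the same response either way, which is what
makes swapping the indexer cheap; what the backend can deliver (speed,
weighting, mergeable indexes) is still decided below that line.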

art