On May 30, 2006, at 6:51 PM, Jonathan Rochkind wrote:
> But that's not the only model. An 'indexing' thesaurus or controlled
> vocabulary can be applied at the time of indexing. It can be applied by
> a machine algorithm, in which case it would certainly be part of the
> indexer/searcher. (There are various (somewhat experimental at this
> point) machine classification or clustering algorithms that an indexer
> could support, and that are generally beyond the reach of an interface
> to provide without indexer support.) Or, more traditionally, the indexing
> controlled vocabulary can be applied by a human---but even in this case
> you want the assigned terms to be captured in a _structured_ way so
> searches can be done on the controlled vocabulary itself. Not mixing
> together controlled terms and uncontrolled terms in one big keyword
> (which is in fact what some search products do, including some rather
> popular ones). For that matter, a field-specific storage of terms is
> necessary for any kind of faceted browsing or field-specific search,
Good comment! Thank you.
Noted. The addition of thesaurus terms, controlled vocabularies, etc.
can (and maybe should) be applied during the indexing process. This
requires the person (or the program) doing the indexing to understand
the characteristics of the data and supplement the data with the
necessary terms. In short, it means enhancing the data as it gets
indexed. Enhancing the data by hand or programmatically is not a
function of the indexer itself but of the indexing process. I could do
such things with any of the indexers I know of. None of them, out of
the box, automatically supports thesauri, etc.
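
To make "enhancing the data as it gets indexed" a bit more concrete,
here is a minimal sketch in Python. Everything in it is made up for
illustration: the tiny thesaurus, the field names ("text", "subject"),
and the enhance() function are assumptions, not part of any particular
indexer. The point is only that the controlled terms land in their own
field, separate from the free text, before the record is handed to
whatever indexer you use.

```python
# Hypothetical thesaurus: maps free-text variants to a preferred
# controlled term. A real one would be far larger and loaded from a file.
THESAURUS = {
    "cats": "Felis catus",
    "kittens": "Felis catus",
    "dogs": "Canis familiaris",
}


def enhance(record, thesaurus=THESAURUS):
    """Return a copy of record with a separate 'subject' field.

    The original 'text' field is left untouched, so controlled and
    uncontrolled terms are never mixed together in one field.
    """
    # Naive tokenization: lowercase, split on whitespace, trim punctuation.
    words = (w.strip(".,;:!?") for w in record.get("text", "").lower().split())
    # Collect the distinct controlled terms, sorted for stable output.
    subjects = sorted({thesaurus[w] for w in words if w in thesaurus})
    enhanced = dict(record)
    enhanced["subject"] = subjects
    return enhanced


doc = enhance({"id": "1", "text": "Kittens and dogs at play."})
# doc["subject"] now holds the controlled terms; doc["text"] is unchanged.
```

The enhanced record could then be fed to an indexer with "subject"
configured as its own searchable (or browsable) field.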
This points to ways librarians can be directly involved in the
indexing process; it just goes to show there are lots o'
opportunities in Library Land.
Eric Lease Morgan
I'm hiring a Senior Programmer Analyst. See http://