The minimum word length and stop word list are configurable through
server options, with no recompiling needed. The exclusion of words that
are in more than 50% of the corpus is a compile-time issue (or simply
use boolean mode, which does not apply that threshold). Here are the
settings to be aware of:

ft_min_word_len=3
ft_stopword_file=/dev/null
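
As a rough sketch of where these would go, assuming a MySQL server with
MyISAM FULLTEXT indexes and an option file such as my.cnf (the file name
and location are assumptions and vary by install):

```ini
# my.cnf (option file; actual path depends on the installation)
[mysqld]
# Index words of 3 or more characters (the default minimum is 4)
ft_min_word_len=3
# Point the stop word list at an empty file so no stop words are filtered
ft_stopword_file=/dev/null
```

Both options are read when the server starts, so a restart is needed,
and existing FULLTEXT indexes have to be rebuilt before the new values
apply (for example with REPAIR TABLE tbl_name QUICK, where tbl_name is a
placeholder for your own table).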

--Casey

http://about.scriblio.net/
http://maisonbisson.com/


On Jun 1, 2009, at 11:13 AM, Mike Taylor wrote:

> However, all of these oddities -- over eager stop-list, ignoring short
> words, not counting words in more than half the rows -- can be sorted
> out by configuration options.