On Wed, 24 Nov 2004 09:21:31 -0500, Ross Singer
<[log in to unmask]> wrote:
> What do you think is more appropriate (and intuitive) for a search
> engine if the user gives no boolean, "and" or "or"?
>
> I guess my question is, assuming it's a keyword search, and the user
> types in "institute paper science", would it be more appropriate to
> default to "institute AND paper AND science" or "institute OR paper OR
> science".

IMHO, the logical thing would be to OR the terms together, then count
the keyword matches for each item and use that count as the first
component in the sort.  Of course, that's assuming that the search
itself wasn't quoted.  If part of the string was quoted, then the terms
inside the quotes should be ANDed.  Here is some (inefficient) SQL to
show what I mean:

Search string: paper science
SQL: SELECT recordid, text, ( CASE WHEN POSITION('paper' IN LOWER(text))
> 0 THEN 1 ELSE 0 END + CASE WHEN POSITION('science' IN LOWER(text)) > 0
THEN 1 ELSE 0 END ) AS rank FROM keyword_table WHERE LOWER(text) LIKE
'%paper%' OR LOWER(text) LIKE '%science%' ORDER BY 3 DESC, 2 ASC;

Search string: institute "paper science"
SQL: SELECT recordid, text, ( CASE WHEN POSITION('paper' IN LOWER(text))
> 0 THEN 1 ELSE 0 END + CASE WHEN POSITION('science' IN LOWER(text)) > 0
THEN 1 ELSE 0 END + CASE WHEN POSITION('institute' IN LOWER(text)) > 0
THEN 1 ELSE 0 END ) AS rank FROM keyword_table WHERE (LOWER(text) LIKE
'%paper%' AND LOWER(text) LIKE '%science%') OR LOWER(text) LIKE
'%institute%' ORDER BY 3 DESC, 2 ASC;


Now, that only counts each matching word once per searched string, but
you get the idea.
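
For anyone who would rather see the same idea outside of SQL, here is a
rough Python sketch of it.  The function names (parse_query, search) and
the (recordid, text) tuple layout are made up for illustration and are
not from any real system.  It assumes plain substring matching, like the
LIKE '%word%' tests above: bare words are ORed, words inside double
quotes are ANDed, and the sort is by match count, then by text.

```python
import re


def parse_query(query):
    """Split a search string into groups of lowercase words.

    A bare word becomes a one-word group; a double-quoted phrase becomes
    a single group whose words must all be present (ANDed).
    """
    groups = []
    for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', query):
        words = (quoted or bare).lower().split()
        if words:
            groups.append(words)
    return groups


def search(records, query):
    """Return (recordid, text, rank) tuples, best matches first.

    records is an iterable of (recordid, text) pairs (a hypothetical
    stand-in for keyword_table).
    """
    groups = parse_query(query)
    words = {word for group in groups for word in group}
    hits = []
    for recordid, text in records:
        lowered = text.lower()
        # WHERE clause: groups are ORed, words within a group are ANDed.
        if not any(all(w in lowered for w in group) for group in groups):
            continue
        # Rank: each distinct search word counts once per record.
        rank = sum(1 for w in words if w in lowered)
        hits.append((recordid, text, rank))
    # ORDER BY rank DESC, text ASC
    hits.sort(key=lambda hit: (-hit[2], hit[1]))
    return hits
```

Calling search(records, 'institute "paper science"') keeps a record if
it contains "institute", or both "paper" and "science", and puts the
records matching the most words first.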

>
> I'm just sort of curious what other people's take on this might be.

I am too.  This is just my take on it, and I'm a programmer, not a
librarian, so perhaps I'm not the best person to answer the question
;)

>
> Thanks,
> -Ross.
>


--
Mike Rylander
[log in to unmask]
GPLS -- PINES Development
Database Developer