On Apr 11, 2005 2:38 PM, rob caSSon <[log in to unmask]> wrote:
> does anyone know of a published ranking algorithm for bibilographic
> items.....taking into account where in a record a search term appears
> (title, keyword, etc), maybe weighting a particular source database,
> etc?

That is very dependent on your audience.  A knowledgeable searcher
will understand why, when searching for a phrase that appears in a
subtitle as opposed to the main title, the hit they were looking for
showed up farther down a list.  A more ...er... naive user would
expect a lower-ranked but exactly-matching subtitle to be listed
higher.

From our (informal) research at GPLS we've found that ranking hits
based on where in the record they matched is more confusing to the
user than not, so we just rank everything at the same level when
searching across record sections.

However, we also do stemmed searching, which means that we have to go
back and bump up the rank of records that matched on exact words and
phrases along with the stemmed version.  This gives the illusion of
very smart matching with relatively minimal effort.
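
As a rough illustration of the bump described above, here is a minimal
Python sketch.  It is not the PINES code: the toy suffix-stripping
stemmer, the function names, and the bonus values are all assumptions
made up for the example, and each record is treated as one flat string
so that every record section ranks at the same level.

```python
import re

# Toy suffix list for illustration only; a real system would use a
# proper stemmer (this is an assumption, not what any library does).
SUFFIXES = ("ing", "es", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def contains_phrase(words, phrase):
    """True if the list `phrase` occurs as a consecutive run in `words`."""
    n = len(phrase)
    return n > 0 and any(words[i:i + n] == phrase
                         for i in range(len(words) - n + 1))


def rank(query, records, exact_bonus=1.0, phrase_bonus=2.0):
    """Match on stems, then bump records that also match exactly.

    `records` maps a record id to its flattened text.  The bonus
    values are hypothetical and would need tuning on real data.
    """
    q_words = tokenize(query)
    q_stems = {naive_stem(w) for w in q_words}
    scored = []
    for rec_id, text in records.items():
        words = tokenize(text)
        stems = {naive_stem(w) for w in words}
        if not q_stems <= stems:
            continue  # every query stem must appear in the record
        score = 1.0  # same base score wherever the match occurred
        # Bump once for each distinct query word that appears unstemmed.
        score += exact_bonus * sum(1 for w in set(q_words) if w in words)
        # Bump again if the whole query appears as an exact phrase.
        if contains_phrase(words, q_words):
            score += phrase_bonus
        scored.append((score, rec_id))
    # Highest score first; ties broken by record id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rec_id for _, rec_id in scored]
```

With this, a search for "cat dog" still finds a record containing
"Cats and dogs" through the stems, but a record containing the exact
words "cat" and "dog" is listed above it.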

All that being said, if you know that some of your data sources are
better than others, you could definitely rank the "good" data higher.  I
imagine that would provide good results to the user and (as long as
the "bad" data providers wanted to be hit more often) would push the
"bad" data providers to clean up their data.
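
One simple way to rank the "good" data higher is to scale each hit's
score by a per-source weight.  The Python sketch below assumes hits
arrive as (record id, base score, source name) tuples; the source
names and weight values are hypothetical placeholders, not taken from
any real catalog.

```python
# Hypothetical per-source weights: above 1.0 promotes a source,
# below 1.0 demotes it.  Values are illustrative only.
SOURCE_WEIGHTS = {"curated": 1.5, "legacy": 0.8}


def apply_source_weight(hits, weights, default=1.0):
    """Rescale each hit's score by its source weight and re-sort.

    `hits` is an iterable of (record_id, base_score, source) tuples.
    Sources missing from `weights` get the `default` weight.
    """
    rescored = [
        (score * weights.get(source, default), rec_id)
        for rec_id, score, source in hits
    ]
    # Highest weighted score first; ties broken by record id.
    rescored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rec_id for _, rec_id in rescored]
```

A multiplicative weight keeps the ordering within a single source
unchanged and only shifts sources relative to one another.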

--
Mike Rylander
[log in to unmask]
GPLS -- PINES Development
Database Developer
http://open-ils.org