On Wed, Sep 30, 2009 at 7:56 AM, Tim Cornwell <[log in to unmask]> wrote:
> 41,000 sites and 21 million pages (http://www.ablegrape.com/en/about.html) is a lot of
> vetting.
...
> Authoritative vetting of a large volume of resources is a hard problem. I haven't seen
> any good solutions, but am leaning toward crowd-sourcing with an authoritative crowd. :-)
>
> Do you have any additional information on how AbleGrape vets these?
I can only guess, but I would think it's probably a combination of
automatic and manual vetting: crawl the links from known "good sites",
filter out bad sites, filter out off-topic sites, manually add
newly-discovered sites not already in the index, manually remove
inappropriate sites that somehow made it into the index, adjust the
algorithms, try to build a user community and solicit feedback. (I
once reported inappropriate results coming from a wine producer's
website that had been taken over by vandals, and AbleGrape removed it
from the index almost immediately.)
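
To make that guess concrete, here is a rough Python sketch of such a
workflow. It is purely illustrative: the function name, its parameters,
and the dict that stands in for a real crawler are all hypothetical, and
nothing here reflects how AbleGrape actually works.

```python
from collections import deque


def vet_sites(seeds, links, blocklist, is_on_topic,
              manual_add=(), manual_remove=()):
    """Return the set of sites to index (hypothetical sketch).

    seeds         -- known "good sites" to start crawling from
    links         -- dict mapping a site to the sites it links to
                     (a stand-in for a real crawler)
    blocklist     -- sites already known to be bad
    is_on_topic   -- callable(site) -> bool, the off-topic filter
    manual_add    -- newly discovered sites added by hand
    manual_remove -- inappropriate sites removed by hand
    """
    index = set()
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        site = queue.popleft()
        # Automatic vetting: skip bad or off-topic sites and
        # do not follow their links any further.
        if site in blocklist or not is_on_topic(site):
            continue
        index.add(site)
        for target in links.get(site, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # Manual vetting: hand-curated additions and removals win.
    index |= set(manual_add)
    index -= set(manual_remove)
    return index
```

User feedback (like my report about the vandalized site) would feed into
manual_remove and the blocklist in a sketch like this.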
Keith