Posted to user@nutch.apache.org by Massimo Schiavon <ms...@volunia.com> on 2010/06/15 11:36:08 UTC

Result sorting based on other engine ranking

Is there a way to reorder the results returned by Nutch based on the 
results returned by other search engines for the same (or a similar) query?

-- 
schmax

Re: Result sorting based on other engine ranking

Posted by Dennis Kubes <ku...@apache.org>.
Search engine results, including Nutch's, are based on scores.  There is 
an index-time score and a query-time score that get combined (multiplied) 
to produce a final score.  Documents are returned in descending score 
order by default.
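
As a rough illustration of that combination (not Nutch's actual code), the 
sketch below multiplies two per-document scores and sorts descending.  The 
names Hit, index_score, query_score and rank are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """A search hit with two hypothetical score components."""
    url: str
    index_score: float  # assumed: score assigned when the page was indexed
    query_score: float  # assumed: relevance computed for this query

    @property
    def final_score(self) -> float:
        # The two scores are combined by multiplication.
        return self.index_score * self.query_score


def rank(hits):
    """Return hits ordered by final score, highest first."""
    return sorted(hits, key=lambda h: h.final_score, reverse=True)


if __name__ == "__main__":
    hits = [Hit("a", 1.0, 0.5), Hit("b", 2.0, 0.4), Hit("c", 0.5, 0.5)]
    for hit in rank(hits):
        print(hit.url, hit.final_score)
```

Because the scores are multiplied, raising either one moves a document 
up the list, which is why a boost can be applied at index time or at 
query time.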

For what you are asking, you could write a MapReduce job that runs a 
list of common queries against the major search engines, saves the 
highest scoring pages, and then adds those scores to the Nutch documents 
at index time.  If you want to do it dynamically, you could run your 
query on the other search engines first and then use a query plugin to 
boost certain URLs, determined from the other engines' results, at 
query time.
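
The dynamic variant might look roughly like the sketch below.  It is a 
standalone illustration of the re-ranking idea, not a Nutch query plugin: 
the function names, the (url, score) tuple format, and the boost formula 
1 + weight / position are all assumptions chosen for the example, and 
fetching the external results is left out.

```python
def boosts_from_ranking(external_urls, weight=1.0):
    """Map each externally ranked URL to a multiplicative boost.

    Assumed formula: 1 + weight / position (1-based), so the top external
    result gets the largest boost.  Only the first occurrence of a URL counts.
    """
    boosts = {}
    for position, url in enumerate(external_urls, start=1):
        boosts.setdefault(url, 1.0 + weight / position)
    return boosts


def rerank(nutch_hits, external_urls, weight=1.0):
    """Multiply each (url, score) hit by its boost and sort descending.

    URLs that the other engine did not return keep their original score.
    """
    boosts = boosts_from_ranking(external_urls, weight)
    boosted = [(url, score * boosts.get(url, 1.0)) for url, score in nutch_hits]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    nutch_hits = [("http://a/", 1.0), ("http://b/", 0.9), ("http://c/", 0.8)]
    external = ["http://c/", "http://b/"]  # ranking from another engine
    for url, score in rerank(nutch_hits, external):
        print(url, round(score, 2))
```

The same boost table could instead be computed offline for a list of 
common queries and folded into the index-time score, which is the 
MapReduce approach described above.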

Dennis

On 06/15/2010 04:36 AM, Massimo Schiavon wrote:
> There is a way to reorder the results returned by nutch based on 
> results returned by other search engines to the same (or similar) query?
>