Posted to user@nutch.apache.org by Pratik Garg <sa...@gmail.com> on 2012/12/04 17:33:29 UTC

New Scoring

Hi,

Nutch provides a default and a new scoring method for assigning scores to
pages. I have a couple of questions:

* What is the difference between these two methods?
* If I want to pass this data to Solr during indexing, do I have to do
anything extra?
* If I want to sort the results from Solr based on this data, which field
should I use?

Thanks,
Pratik

RE: New Scoring

Posted by Markus Jelsma <ma...@openindex.io>.

 
 
-----Original message-----
> From:Pratik Garg <sa...@gmail.com>
> Sent: Wed 05-Dec-2012 19:17
> To: user@nutch.apache.org
> Cc: Chirag Goel <go...@gmail.com>
> Subject: New Scoring
> 
> Hi,
> 
> Nutch provides a default and a new scoring method for assigning scores to
> pages. I have a couple of questions:
> 
> * What is the difference between these two methods?

LinkRank is a power-iteration algorithm similar to PageRank. It can be used incrementally and it is very stable. OPIC has trouble with incremental updates.
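
To illustrate what "power iteration" means here, below is a minimal PageRank-style sketch in Python. It is not Nutch's LinkRank code: the function name, the damping value of 0.85, the iteration count and the handling of pages without outlinks are all assumptions made for the example.

```python
def power_iteration_rank(links, damping=0.85, iterations=20):
    """Toy PageRank-style scoring. `links` maps a page to the pages it links to.

    Hypothetical helper for illustration only, not part of Nutch.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform score distribution.
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small base score on each pass.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped score evenly over its outlinks.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # Assumption: a page without outlinks spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
ranks = power_iteration_rank(graph)
```

Each pass recomputes every score from the previous pass, and the total score stays at 1.0, so repeated passes settle towards a fixed distribution.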

> * If I want to pass this data to Solr during indexing, do I have to do
> anything extra?

The CrawlDB has a score field, which is used to populate the boost field. With OPIC this is added via the scoring filter. If you use the LinkRank algorithm, make sure you run its scoreupdater tool, which writes the calculated scores back to the CrawlDB.

> * If I want to sort the results from Solr based on this data, which field
> should I use?

The boost field.
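
As a rough sketch, a Solr query sorted on that field could be built like this in Python. The base URL and the helper name are hypothetical, and the example assumes the boost field is indexed as a sortable numeric field in your Solr schema; `sort`, `q` and `rows` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_boost_sorted_query(base_url, query, rows=10):
    """Return a Solr select URL that sorts results by boost, highest first.

    Hypothetical helper for illustration; adjust base_url to your Solr setup.
    """
    params = {
        "q": query,
        "sort": "boost desc",  # assumes "boost" is sortable in the schema
        "rows": rows,
    }
    return base_url + "/select?" + urlencode(params)


# Assumed local Solr address, for illustration only.
url = build_boost_sorted_query("http://localhost:8983/solr", "nutch")
```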

> 
> Thanks,
> Pratik
>