Posted to user@nutch.apache.org by eyal edri <ey...@gmail.com> on 2007/09/18 16:56:11 UTC

nutch scoring - documentation

Hello,

Can anyone direct me to documentation on how the Nutch scoring system works?
I want to know how Nutch decides which pages are selected when the
"-topN" argument is given on the command line.
I'm not using any indexing/searching (just plain
inject-->crawl-->fetch-->update cycle).

I know nutch-default.xml determines the initial score for pages, but do
those pages change their ranking afterwards?

thanks
-- 
Eyal Edri

Re: nutch scoring - documentation

Posted by Tim Gautier <ti...@gmail.com>.
The scoring is done through the OPIC scoring filter. You can find it
in the code under src/plugin/scoring-opic. The comments at the top of
the Java file include a URL that explains the theory behind the scoring
in greater detail.
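
To give a rough feel for the idea, here is a minimal sketch of OPIC-style
scoring. It is not the plugin's code: the class OpicSketch, its method names,
and the injected score of 1.0 are all made up for illustration. As I
understand it, the general shape is that a fetched page splits its score
among its outlinks, the update step adds those shares to the scores already
stored for the target pages, and the generate step prefers the
highest-scoring pages. So yes, scores should change after the initial
injection.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration of OPIC-style scoring; NOT the actual Nutch plugin code.
class OpicSketch {

    // Hypothetical stand-in for the score given to injected seed URLs (assumed value).
    static final double INJECTED_SCORE = 1.0;

    // Fetch/parse step: a page splits its current score evenly across its outlinks.
    static Map<String, Double> distribute(double pageScore, List<String> outlinks) {
        Map<String, Double> contributions = new HashMap<>();
        if (outlinks.isEmpty()) {
            return contributions;
        }
        double share = pageScore / outlinks.size();
        for (String url : outlinks) {
            contributions.merge(url, share, Double::sum);
        }
        return contributions;
    }

    // Update step: add incoming contributions to each target's stored score.
    // URLs not yet in the db start from the contribution they receive.
    static void update(Map<String, Double> db, Map<String, Double> contributions) {
        for (Map.Entry<String, Double> e : contributions.entrySet()) {
            db.merge(e.getKey(), e.getValue(), Double::sum);
        }
    }

    // Generate step: return up to n URLs, highest score first.
    static List<String> topN(Map<String, Double> db, int n) {
        List<String> urls = new ArrayList<>(db.keySet());
        urls.sort(Comparator.comparingDouble((String u) -> db.get(u)).reversed());
        return new ArrayList<>(urls.subList(0, Math.min(n, urls.size())));
    }

    public static void main(String[] args) {
        Map<String, Double> db = new HashMap<>();
        db.put("a", INJECTED_SCORE);
        db.put("d", INJECTED_SCORE);

        // Page "a" links to "b" and "c"; page "d" links only to "c".
        update(db, distribute(db.get("a"), List.of("b", "c")));
        update(db, distribute(db.get("d"), List.of("c")));

        // "c" has collected a share from both seeds, so it is selected first.
        System.out.println(topN(db, 1));
    }
}
```

In this toy run the page "c" was never injected, yet it ends up ranked above
both seeds because two pages link to it. The real filter has more knobs
(see the db.score.* properties in nutch-default.xml), so treat this only as
a picture of the idea.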
