Posted to dev@nutch.apache.org by atencorps <ch...@googlemail.com> on 2009/06/01 00:32:48 UTC

Ranking & Scoring Algorithm Pseudocode

Hi,

I came across the Ranking & Scoring system in Nutch 1.0 (which includes the
webgraph, LinkRank, etc.).

My question is: where can I find the pseudocode for the Ranking & Scoring
algorithm/system in place in Nutch 1.0?

Thanks 



Re: Ranking & Scoring Algorithm Pseudocode

Posted by Dennis Kubes <ku...@apache.org>.
There isn't any pseudocode for this.  The code for the main algorithm is 
in the LinkRank class.  It is similar in nature to PageRank except it 
has the ability to filter reciprocal links.  If the Link Loops program 
is run it also has the ability to filter out link cycles, but that 
program is O(n) running time so not very efficient.
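
To make the description above a bit more concrete, here is a minimal Python
sketch of a PageRank-style iteration with an optional step that drops
reciprocal links. It is not the Nutch LinkRank code: the function names, the
0.85 damping factor (the value commonly used in the PageRank literature), the
iteration count, and the way pages without outlinks are handled are all
assumptions made for illustration. Link cycle filtering is not covered.

```python
def drop_reciprocal_links(edges):
    """Drop every link A -> B for which B -> A also exists.

    Self-links (A -> A) count as their own reciprocal and are dropped too.
    """
    edge_set = set(edges)
    return [(src, dst) for (src, dst) in edge_set if (dst, src) not in edge_set]


def link_scores(edges, damping=0.85, iterations=20, filter_reciprocal=True):
    """PageRank-style scores for a list of (source, target) link pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}

    # Assumption: duplicate links between the same two pages count once.
    links = drop_reciprocal_links(edges) if filter_reciprocal else list(set(edges))

    outlinks = {node: [] for node in nodes}
    for src, dst in links:
        outlinks[src].append(dst)

    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Assumption: score held by pages without outlinks is spread evenly
        # over all pages, so the scores keep summing to 1.
        dangling = sum(scores[node] for node in nodes if not outlinks[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {node: base for node in nodes}

        # Each page passes an equal share of its damped score to its targets.
        for src, targets in outlinks.items():
            if not targets:
                continue
            share = damping * scores[src] / len(targets)
            for dst in targets:
                new_scores[dst] += share

        scores = new_scores

    return scores


if __name__ == "__main__":
    example = [("a", "b"), ("b", "a"), ("a", "c"), ("d", "c")]
    print(link_scores(example))
    print(link_scores(example, filter_reciprocal=False))
```

In the example graph, a and b link to each other. With the filter on, neither
of those two links contributes to the scores, so b ends up with the same score
as a page that has no inlinks at all.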

The LinkRank class produces just a single score factor, though. The setup of 
the new indexing system allows multiple factors to be combined, and 
LinkRank may be only one factor among them.

If you are looking for how the algorithm works, I suggest looking at the early 
PageRank algorithm papers.  Here are some links which you may find useful:

http://en.wikipedia.org/wiki/PageRank
http://www.ianrogers.net/google-page-rank/


Dennis
