Posted to dev@nutch.apache.org by atencorps <ch...@googlemail.com> on 2009/06/01 00:32:48 UTC
Ranking & Scoring Algorithm Pseudocode
Hi,
I came across the ranking and scoring system in Nutch 1.0 (which includes the
webgraph, linkrank, etc.).
My question is: where can I find the pseudocode for the Ranking & Scoring
Algorithm/System in place in Nutch 1.0?
Thanks
Re: Ranking & Scoring Algorithm Pseudocode
Posted by Dennis Kubes <ku...@apache.org>.
There isn't any pseudocode for this. The code for the main algorithm is
in the LinkRank class. It is similar in nature to PageRank, except that it
can filter reciprocal links. If the Link Loops program is run, it can
also filter out link cycles, but that program has O(n) running time, so
it is not very efficient.
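To make the comparison with PageRank concrete, here is a minimal sketch of a textbook PageRank iteration with an optional step that drops reciprocal links first. It is not the Nutch LinkRank code: the function names, the damping factor of 0.85, the fixed iteration count, and the handling of pages without outlinks are all assumptions chosen for illustration.

```python
def filter_reciprocal(links):
    """Drop every edge (a, b) whose reverse (b, a) is also present.

    Self-links (a, a) are their own reverse, so they are dropped too.
    """
    edges = set(links)
    return {(a, b) for (a, b) in edges if (b, a) not in edges}


def rank(links, damping=0.85, iterations=20):
    """Textbook PageRank by power iteration over (source, target) pairs.

    Hypothetical helper for illustration; not the Nutch LinkRank class.
    """
    edges = set(links)
    nodes = sorted({page for edge in edges for page in edge})
    if not nodes:
        return {}
    outlinks = {page: [] for page in nodes}
    for src, dst in edges:
        outlinks[src].append(dst)
    n = len(nodes)
    scores = {page: 1.0 / n for page in nodes}
    for _ in range(iterations):
        # Score held by pages with no outlinks is spread evenly over all pages.
        dangling = sum(scores[page] for page in nodes if not outlinks[page])
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {page: base for page in nodes}
        # Each page passes an equal share of its score along each outlink.
        for src in nodes:
            for dst in outlinks[src]:
                new_scores[dst] += damping * scores[src] / len(outlinks[src])
        scores = new_scores
    return scores
```

Calling `rank(filter_reciprocal(links))` scores the graph with reciprocal links removed, while `rank(links)` scores it as-is. Pages that only appear in reciprocal links vanish from the filtered graph in this sketch.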
The LinkRank class is just a single score factor, though. The setup of
the new indexing system allows multiple factors to be combined, and
LinkRank may be only one of them.
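One simple way to picture combining several factors is a weighted sum per document. The sketch below is purely hypothetical: the factor names, the weights, and the choice of a weighted sum are assumptions for illustration and do not describe how Nutch actually combines scores.

```python
def combine_scores(factors, weights):
    """Weighted sum of named score factors for a single document.

    Hypothetical illustration; factors without a weight contribute nothing.
    """
    return sum(weights.get(name, 0.0) * value for name, value in factors.items())


# Assumed example values: a link-based score plus one other made-up factor.
doc_factors = {"linkrank": 0.5, "text_match": 2.0}
factor_weights = {"linkrank": 2.0, "text_match": 1.0}
combined = combine_scores(doc_factors, factor_weights)
```

With this shape, a link-based score is just one entry in the dictionary, and changing its weight changes how much it matters relative to the other factors.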
If you are looking for how the algorithm works, I suggest looking at the early
PageRank algorithm papers. Here are some links that you may find useful:
http://en.wikipedia.org/wiki/PageRank
http://www.ianrogers.net/google-page-rank/
Dennis