Posted to dev@nutch.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/03/30 01:59:38 UTC
[jira] Commented: (NUTCH-240) Scoring API: extension point, scoring
filters and an OPIC plugin
[ http://issues.apache.org/jira/browse/NUTCH-240?page=comments#action_12372341 ]
Doug Cutting commented on NUTCH-240:
------------------------------------
The generator store/restore score stuff seems ugly. And it is not used by OPIC. Could we instead have a method that computes and returns a score to be used by the generator? Then it is up to the generator to use this without modifying the CrawlDatum.
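To make the suggestion concrete, here is a minimal sketch of a scorer that only returns a value for the generator to sort by. All names here (DatumSketch, GeneratorScorer, sortValue, GeneratorSketch) are hypothetical stand-ins and are not taken from the patch or from Nutch itself; the real CrawlDatum and generator are considerably more involved.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, immutable stand-in for CrawlDatum; only the score matters here.
final class DatumSketch {
    private final String url;
    private final float score;

    DatumSketch(String url, float score) {
        this.url = url;
        this.score = score;
    }

    String getUrl() { return url; }
    float getScore() { return score; }
}

// Hypothetical scoring hook: computes and returns a sort value,
// leaving the datum untouched (nothing to store or restore).
interface GeneratorScorer {
    float sortValue(DatumSketch datum, float initSort);
}

final class GeneratorSketch {
    // Returns the URLs of the topN candidates, highest sort value first.
    // The computed values live only inside the comparator.
    static List<String> selectTop(List<DatumSketch> candidates,
                                  GeneratorScorer scorer, int topN) {
        List<DatumSketch> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble(
                (DatumSketch d) -> scorer.sortValue(d, 1.0f)).reversed());
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < Math.min(topN, sorted.size()); i++) {
            urls.add(sorted.get(i).getUrl());
        }
        return urls;
    }
}
```

Because the scorer returns a value rather than writing it into the datum, the generator decides how to use it, and a plugin such as OPIC that does not need the store/restore step has nothing extra to implement.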
The passScoreBeforeParsing/passScoreAfterParsing/distributeScoreToOutlink protocol also seems awkward, although I don't yet have a suggestion for how to improve it.
> Scoring API: extension point, scoring filters and an OPIC plugin
> ----------------------------------------------------------------
>
> Key: NUTCH-240
> URL: http://issues.apache.org/jira/browse/NUTCH-240
> Project: Nutch
> Type: Improvement
> Versions: 0.8-dev
> Reporter: Andrzej Bialecki
> Attachments: patch.txt
>
> This patch refactors all places where Nutch manipulates page scores into a plugin-based API. Using this API it's possible to implement different scoring algorithms. It is also much easier to understand how scoring works.
> Multiple scoring plugins can be run in sequence, in a manner similar to URLFilters.
> Also included is an OPICScoringFilter plugin, which contains the current implementation of the scoring algorithm. Together with the scoring API, it provides fully backward-compatible scoring.
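As a rough illustration of running several scoring plugins in sequence, the sketch below passes a score through each step in order. The names ScoreStep and ScoreChain are hypothetical, and whether the actual API threads the score through each filter this way is an assumption, not something the description confirms.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single scoring step; not the interface from the patch.
interface ScoreStep {
    float apply(String url, float score);
}

// Runs the steps in the order given, feeding each step's result to the next.
final class ScoreChain {
    private final List<ScoreStep> steps;

    ScoreChain(List<ScoreStep> steps) {
        // Defensive copy so later changes to the caller's list do not affect the chain.
        this.steps = new ArrayList<>(steps);
    }

    float apply(String url, float initial) {
        float score = initial;
        for (ScoreStep step : steps) {
            score = step.apply(url, score);
        }
        return score;
    }
}
```

With a single step in the chain, the result is just that step's output, which is one way a lone OPIC plugin could reproduce the existing behaviour.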