Posted to dev@nutch.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2011/04/01 16:57:07 UTC

[jira] [Closed] (NUTCH-267) Indexer doesn't consider linkdb when calculating boost value

     [ https://issues.apache.org/jira/browse/NUTCH-267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma closed NUTCH-267.
-------------------------------

    Resolution: Won't Fix

Bulk close of legacy issues:
http://www.lucidimagination.com/search/document/2738eeb014805854/clean_up_open_legacy_issues_in_jira

> Indexer doesn't consider linkdb when calculating boost value
> ------------------------------------------------------------
>
>                 Key: NUTCH-267
>                 URL: https://issues.apache.org/jira/browse/NUTCH-267
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 0.8
>            Reporter: Chris Schneider
>            Priority: Minor
>
> Before OPIC was implemented (Nutch 0.7, very early Nutch 0.8-dev), if indexer.boost.by.link.count was true, the indexer boost value was scaled by the natural log of (e + the number of inbound links):
>     if (boostByLinkCount)
>       res *= (float)Math.log(Math.E + linkCount);
> This is no longer the case, and the change predates the scoring filters Andrzej implemented. Instead, the boost value is just the square root (or some other scorePower) of the page score. Shouldn't the invertlinks command, which creates the linkdb, have some effect on the boost value calculated during indexing (either via the OPICScoringFilter or some other built-in filter)?
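
To make the difference concrete, here is a minimal sketch of the two calculations the issue contrasts. The class and method names (BoostSketch, linkCountBoost, scorePowerBoost) are hypothetical and are not taken from the Nutch source; only the log formula and the "score raised to scorePower" description come from the issue text above.

```java
// Hypothetical sketch; class and method names are illustrative, not Nutch's.
final class BoostSketch {

    // Old behaviour quoted in the issue: scale the boost by ln(e + linkCount).
    // A page with zero inbound links keeps its boost unchanged, since ln(e) = 1.
    static float linkCountBoost(float boost, int linkCount) {
        return boost * (float) Math.log(Math.E + linkCount);
    }

    // Behaviour described in the issue for 0.8: the boost is the page score
    // raised to scorePower (0.5 corresponds to the square root).
    static float scorePowerBoost(float score, float scorePower) {
        return (float) Math.pow(score, scorePower);
    }

    public static void main(String[] args) {
        // Same starting value, different inputs: inbound link count vs. page score.
        System.out.println("link-count boost, 0 links:   " + linkCountBoost(1.0f, 0));
        System.out.println("link-count boost, 100 links: " + linkCountBoost(1.0f, 100));
        System.out.println("score-power boost, score 4:  " + scorePowerBoost(4.0f, 0.5f));
    }
}
```

In this sketch the second method never sees a link count, which is the gap the reporter is pointing at: inbound links can only influence the boost indirectly, through whatever produced the page score.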

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira