Posted to dev@nutch.apache.org by "applepear (JIRA)" <ji...@apache.org> on 2012/07/25 01:25:35 UTC

[jira] [Comment Edited] (NUTCH-1238) Fetcher throughput threshold must start before feeder finished

    [ https://issues.apache.org/jira/browse/NUTCH-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421804#comment-13421804 ] 

applepear edited comment on NUTCH-1238 at 7/24/12 11:24 PM:
------------------------------------------------------------

The fix is not correct. With the patch, when the throughput falls below the threshold, the queue is emptied and the throughput threshold check is disabled. However, the feeder may still be alive at that point, and it may then continue to feed more problematic URLs from the same domain(s). Because the throughput threshold is now disabled, the map task will no longer stop due to low throughput.
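
To make the sequence of events concrete, here is a minimal sketch of the behaviour described above. It is a simplified model, not the actual Fetcher code or the attached patch: the class `ThroughputCheckSketch`, its fields, and the `disableAfterTrip` flag are all hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical, simplified model of the throughput check. Not Nutch's Fetcher.
class ThroughputCheckSketch {
    final Queue<String> queue = new ArrayDeque<>();
    final int thresholdPagesPerSec;
    final boolean disableAfterTrip; // true models the behaviour described in the comment
    boolean checkEnabled = true;

    ThroughputCheckSketch(int thresholdPagesPerSec, boolean disableAfterTrip) {
        this.thresholdPagesPerSec = thresholdPagesPerSec;
        this.disableAfterTrip = disableAfterTrip;
    }

    // Stands in for the feeder adding a URL while it is still alive.
    void feed(String url) {
        queue.add(url);
    }

    // Periodic check: returns how many queued URLs were dropped on this call.
    int check(int measuredPagesPerSec) {
        if (!checkEnabled || measuredPagesPerSec >= thresholdPagesPerSec) {
            return 0;
        }
        int dropped = queue.size();
        queue.clear();
        if (disableAfterTrip) {
            checkEnabled = false; // no later check can trip again
        }
        return dropped;
    }

    public static void main(String[] args) {
        ThroughputCheckSketch patched = new ThroughputCheckSketch(10, true);
        patched.feed("http://slow.example/a");
        System.out.println("dropped on first trip: " + patched.check(2));
        patched.feed("http://slow.example/b"); // feeder is still alive
        System.out.println("dropped on second check: " + patched.check(2));
        System.out.println("URLs left in queue: " + patched.queue.size());
    }
}
```

With `disableAfterTrip` set to `true`, the second slow URL stays in the queue because the check never runs again, which is the problem reported here. Passing `false` keeps the check armed; that is only one possible direction and not something the patch or this issue prescribes.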
                
      was (Author: applepear):
    the fix is not correct... in the fix, when the throughput falls below the threshold, the queue is emptied and the throughput threshold is disabled. however, if the feeder is still alive, it will continue to feed more urls of the same domain, which is problematic. now that the throughput threshold is disabled, the map will no longer stop due to low throughput.
                  
> Fetcher throughput threshold must start before feeder finished
> --------------------------------------------------------------
>
>                 Key: NUTCH-1238
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1238
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 1.4
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Trivial
>             Fix For: 1.5
>
>         Attachments: NUTCH-1238-1.5-1.patch
>
>
> Right now the fetcher's minimum throughput threshold is activated only when the feeder has finished. However, for various reasons a running fetch can be slow. This issue must change the feature to start checking earlier, but not right after initialization.
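
The difference between the two activation rules in the description can be sketched as follows. This is an illustrative assumption about how such a rule could look, not the content of the attached patch: `ThresholdStartSketch`, both method names, and the `checkAfterMillis` warm-up parameter are hypothetical.

```java
// Hypothetical comparison of the two activation rules; not Nutch's actual code.
class ThresholdStartSketch {
    // Rule described as current behaviour: check only once the feeder has finished.
    static boolean activeOnlyAfterFeeder(boolean feederAlive) {
        return !feederAlive;
    }

    // Rule requested by the issue: check earlier, but only after a warm-up delay,
    // regardless of whether the feeder is still running.
    static boolean activeAfterWarmUp(long elapsedMillis, long checkAfterMillis) {
        return elapsedMillis >= checkAfterMillis;
    }

    public static void main(String[] args) {
        long warmUp = 5 * 60 * 1000L; // assumed five-minute warm-up, for illustration only
        long elapsed = 10 * 60 * 1000L; // ten minutes into the fetch, feeder still alive
        System.out.println(activeOnlyAfterFeeder(true));
        System.out.println(activeAfterWarmUp(elapsed, warmUp));
    }
}
```

Under the first rule a slow fetch with a live feeder is never checked; under the second it is checked once the warm-up period has passed.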
