Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2012/12/12 00:57:21 UTC

[jira] [Commented] (NUTCH-1503) Configuration properties not in sync between FetcherReducer and nutch-default.xml

    [ https://issues.apache.org/jira/browse/NUTCH-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529497#comment-13529497 ] 

Sebastian Nagel commented on NUTCH-1503:
----------------------------------------

Hi Lewis,
both time limit properties are necessary:
* fetcher.timelimit.mins for the user to configure the limit (max. duration in minutes)
* fetcher.timelimit (internal use only) to set the time by which the fetcher has to finish (system time in milliseconds, the same deadline for all distributed jobs)
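
To illustrate how the two properties relate, here is a minimal sketch, not the actual Nutch code. It assumes a plain Map in place of Hadoop's Configuration, and the class and method names (TimeLimitSketch, publishDeadline, timeLimitReached) are hypothetical. The -1 default meaning "no limit" is taken from the conf.getLong("fetcher.timelimit", -1) calls quoted in the issue; using the same default for fetcher.timelimit.mins is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only: a Map stands in for Hadoop's Configuration. */
class TimeLimitSketch {

    /**
     * Job setup, run once before tasks are distributed: converts the
     * user-facing duration (minutes) into an absolute deadline (epoch
     * milliseconds) and stores it under the internal property.
     * Returns the deadline, or -1 if no limit is configured.
     */
    static long publishDeadline(Map<String, String> conf, long nowMillis) {
        // Assumption: -1 means "no limit", as for fetcher.timelimit.
        long minutes = Long.parseLong(conf.getOrDefault("fetcher.timelimit.mins", "-1"));
        if (minutes == -1) {
            return -1;
        }
        long deadline = nowMillis + minutes * 60L * 1000L;
        conf.put("fetcher.timelimit", Long.toString(deadline));
        return deadline;
    }

    /**
     * Task side: every task reads the same absolute deadline, so all of
     * them stop at the same wall-clock time regardless of when they started.
     */
    static boolean timeLimitReached(Map<String, String> conf, long nowMillis) {
        long deadline = Long.parseLong(conf.getOrDefault("fetcher.timelimit", "-1"));
        return deadline != -1 && nowMillis >= deadline;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fetcher.timelimit.mins", "180"); // set by the user

        long deadline = publishDeadline(conf, System.currentTimeMillis());
        System.out.println("deadline (epoch ms): " + deadline);
        System.out.println("limit reached now?   "
                + timeLimitReached(conf, System.currentTimeMillis()));
    }
}
```

The point of the split is that a duration is convenient for the user, whereas an absolute timestamp computed once gives every distributed task the same cut-off.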

Regarding fetcher.threads.per.host.by.ip: maybe we should not add already deprecated properties which will be removed later anyway (cf. NUTCH-1409).
+1 for adding fetcher.queue.use.host.settings to nutch-default.xml
Btw., your efforts to clean up properties reminded me that some time ago I promised on [user@nutch|http://lucene.472066.n3.nabble.com/Javadoc-incorrect-or-missing-code-in-1-5-1-Generator-td3997883.html] to prepare a list of all Nutch properties, flagging whether each one is "defined" and documented in nutch-default.xml: [it's in the wiki now|http://wiki.apache.org/nutch/NutchPropertiesCompleteList].

                
> Configuration properties not in sync between FetcherReducer and nutch-default.xml
> ---------------------------------------------------------------------------------
>
>                 Key: NUTCH-1503
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1503
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 2.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 2.2
>
>         Attachments: NUTCH-1503.patch
>
>
> FetcherReducer.java
> Bug: Following properties appear in FetcherReducer but not in nutch-default.xml
> {code}
> 290       useHostSettings = conf.getBoolean("fetcher.queue.use.host.settings", false);
> 300       this.timelimit = conf.getLong("fetcher.timelimit", -1);
> 450       this.byIP = conf.getBoolean("fetcher.threads.per.host.by.ip", true);
> 698       timelimit = context.getConfiguration().getLong("fetcher.timelimit", -1); 
> {code}
> Therefore they cannot be used properly in code execution and must be updated, removed and/or added to nutch-default.xml.
> Patch coming up just now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira