Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/04/11 01:28:06 UTC

[jira] Commented: (HADOOP-127) Unclear precedence of config files and property definitions

    [ http://issues.apache.org/jira/browse/HADOOP-127?page=comments#action_12373950 ] 

Doug Cutting commented on HADOOP-127:
-------------------------------------

I think you have it right.  Some guidelines:

Folks should only define things in the -site files if they want to force them for all code.

Folks should not edit the -default files.

Non-default settings that may be overridden by application code should be put in mapred-default.xml.

Application-specific settings are set in the job itself.

Strictly speaking, it doesn't much matter whether you put something in a nutch- or hadoop- file, although the intent is to keep things that are specific to Hadoop in hadoop- files and things specific to Nutch in nutch- files.
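
For illustration, the load order described in the issue quoted below can be sketched in a few lines of Java. This is only a rough model under stated assumptions, not the actual Configuration code: each config file is represented as a plain map, and the class PrecedenceSketch, the method resolve, and the example.* property names are all hypothetical. It shows why a value placed in a -site file wins over a per-job setting.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the load order described in HADOOP-127; not Hadoop's real Configuration. */
class PrecedenceSketch {

    /**
     * Applies every default resource in order, then every final resource in order.
     * A later definition of a property replaces an earlier one.
     */
    public static Map<String, String> resolve(List<Map<String, String>> defaultResources,
                                              List<Map<String, String>> finalResources) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> resource : defaultResources) {
            effective.putAll(resource);
        }
        for (Map<String, String> resource : finalResources) {
            effective.putAll(resource);
        }
        return effective;
    }

    public static void main(String[] args) {
        // Default resources, in the order listed in the issue (property names are made up).
        List<Map<String, String>> defaults = List.of(
            Map.of("example.a", "hadoop-default", "example.b", "hadoop-default"), // hadoop-default.xml
            Map.of("example.a", "mapred-default"),                                // mapred-default.xml
            Map.of("example.a", "job", "example.b", "job"));                      // per-job file

        // Final resources, in effective load order: nutch-site.xml, then hadoop-site.xml.
        List<Map<String, String>> finals = List.of(
            Map.<String, String>of(),               // nutch-site.xml (defines nothing here)
            Map.of("example.b", "hadoop-site"));    // hadoop-site.xml

        Map<String, String> effective = resolve(defaults, finals);
        System.out.println("example.a = " + effective.get("example.a")); // job
        System.out.println("example.b = " + effective.get("example.b")); // hadoop-site
    }
}
```

In this model example.a can still be overridden by the job because it appears only in default resources, while example.b is forced for all code because it is also defined in a -site file.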




> Unclear precedence of config files and property definitions
> -----------------------------------------------------------
>
>          Key: HADOOP-127
>          URL: http://issues.apache.org/jira/browse/HADOOP-127
>      Project: Hadoop
>         Type: Bug
>   Components: conf
>  Environment: Hadoop 0.1.1, Nutch 0.8-dev
>     Reporter: Andrzej Bialecki
>
> The order in which configuration resources are read is not sufficiently documented, and there is no mechanism to prevent harmful redefinition of certain properties when they are put in the wrong config files.
> From reading the code in Hadoop's Configuration.java and JobConf.java and in Nutch's NutchConfiguration.java, I _think_ this is what's happening.
> There are two groups of resources: default resources, loaded first, and final resources, loaded at the end. Any property (re)defined in a file loaded later overrides all previous definitions:
> * default resources: loaded in the order in which they are added. The following files are added here, in order:
>     1. hadoop-default.xml (Configuration)
>     2. nutch-default.xml  (NutchConfiguration)
>     3. mapred-default.xml (JobConf)
>     4. job_xx_xxx.xml       (JobConf, in JobConf(File config))
> * final resources: these always come after the default resources, i.e. any value defined here will always override those set in default resources (NOTE: including per-job settings!!!). The following files are added here, in reversed order:
>     2. hadoop-site.xml (Configuration)
>     1. nutch-site.xml    (NutchConfiguration)
> (i.e. hadoop-site.xml will take precedence over anything else defined in any other config file).
> I would appreciate it if someone could check that this is indeed the case, and I would welcome suggestions on how to ensure that you cannot so easily shoot yourself in the foot by defining the wrong properties in hadoop-site or nutch-site ...

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira