Posted to mapreduce-issues@hadoop.apache.org by "Tianyin Xu (JIRA)" <ji...@apache.org> on 2015/12/19 22:40:46 UTC

[jira] [Updated] (MAPREDUCE-6582) A number of inconsistent default configuration values

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tianyin Xu updated MAPREDUCE-6582:
----------------------------------
    Description: 
In MapReduce, several default configuration values used in the code are inconsistent with the values documented in {{mapred-default.xml}}
(https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/):

*1. {{mapreduce.reduce.shuffle.fetch.retry.timeout-ms}}*
In mapred-default.xml, the value is {{30000}}, while it is {{180000}} in the code:

{code:title=Fetcher.java (only usage)|borderStyle=solid}
  private static final int DEFAULT_STALLED_COPY_TIMEOUT = 3 * 60 * 1000;
...
    this.fetchRetryTimeout = job.getInt(MRJobConfig.SHUFFLE_FETCH_RETRY_TIMEOUT_MS,
        DEFAULT_STALLED_COPY_TIMEOUT);
{code}
\\
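To illustrate why the second argument matters, here is a minimal sketch of the fallback behaviour of a {{getInt(key, default)}} style lookup. It uses {{java.util.Properties}} as a stand-in for Hadoop's {{Configuration}}; the class {{FallbackDefaultSketch}} and its {{getInt}} helper are hypothetical and are not Hadoop code. It assumes, as I understand it, that the in-code value is used only when the key is not set by any loaded resource.

```java
import java.util.Properties;

public class FallbackDefaultSketch {
    // Same expression as DEFAULT_STALLED_COPY_TIMEOUT in the Fetcher.java excerpt.
    static final int CODE_DEFAULT = 3 * 60 * 1000;

    // Hypothetical stand-in for a getInt(key, defaultValue) lookup:
    // returns the configured value if the key is set, otherwise the fallback.
    static int getInt(Properties conf, String key, int fallback) {
        String raw = conf.getProperty(key);
        return (raw == null) ? fallback : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        String key = "mapreduce.reduce.shuffle.fetch.retry.timeout-ms";

        // Case 1: the key is present with the value quoted from mapred-default.xml.
        Properties withXmlDefault = new Properties();
        withXmlDefault.setProperty(key, "30000");

        // Case 2: the key is absent, so the in-code fallback is returned.
        Properties withoutXmlDefault = new Properties();

        System.out.println(getInt(withXmlDefault, key, CODE_DEFAULT));    // prints 30000
        System.out.println(getInt(withoutXmlDefault, key, CODE_DEFAULT)); // prints 180000
    }
}
```

Under that assumption, which of the two values a job actually sees depends on whether the XML default is loaded, which is why the two should agree.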

*2. {{mapreduce.shuffle.ssl.file.buffer.size}}*
In mapred-default.xml, the value is {{65536}}, while it is {{61440}} in the code:
{code:title=ShuffleHandler.java  (only usage)|borderStyle=solid}
  public static final int DEFAULT_SUFFLE_SSL_FILE_BUFFER_SIZE = 60 * 1024;
...
    sslFileBufferSize = conf.getInt(SUFFLE_SSL_FILE_BUFFER_SIZE_KEY,
                                    DEFAULT_SUFFLE_SSL_FILE_BUFFER_SIZE);
{code}

This one looks odd: the documented value corresponds to {{64 * 1024}} (i.e., {{65536}}), whereas the code uses {{60 * 1024}} (i.e., {{61440}}). I don't know whether it is a typo.
\\
\\

*3. {{mapreduce.reduce.shuffle.merge.percent}}*
In mapred-default.xml, the value is {{0.66}}, while it is {{0.9}} in the code:
{code:title=MergeManagerImpl.java  (only usage)|borderStyle=solid}
      jobConf.getFloat(MRJobConfig.SHUFFLE_MERGE_PERCENT,
                       0.90f));
{code}
\\

*4. {{mapreduce.task.timeout}}*
In mapred-default.xml, the value is {{600000}}, while it is {{300000}} in the code:
{code:title=TaskHeartbeatHandler.java  (only usage)|borderStyle=solid}
    taskTimeOut = conf.getInt(MRJobConfig.TASK_TIMEOUT, 5 * 60 * 1000);
{code}
\\

*5. {{mapreduce.task.io.sort.factor}}*
In mapred-default.xml, the value is {{10}}, while it is {{100}} in the code:
{code:title=MergeManagerImpl.java  (Usage #1)|borderStyle=solid}
    this.ioSortFactor = jobConf.getInt(MRJobConfig.IO_SORT_FACTOR, 100);
{code}
{code:title=MapTask.java  (Usage #2)|borderStyle=solid}
      final int sortmb = job.getInt(JobContext.IO_SORT_MB, 100);
{code}
\\

*6. {{mapreduce.job.end-notification.max.attempts}}*
In mapred-default.xml, the value is {{5}}, while it is {{1}} in the code:
{code:title=JobEndNotifier.java  (only usage)|borderStyle=solid}
      , conf.getInt(MRJobConfig.MR_JOB_END_NOTIFICATION_MAX_ATTEMPTS, 1)
{code}
\\

*7. {{mapreduce.job.end-notification.retry.interval}}*
In mapred-default.xml, the value is {{1000}}. In the code, the default is {{30000}} in one usage and {{5000}} in the other, so the two usages do not even agree with each other:
{code:title=.../mapred/JobEndNotifier.java  (Usage #1)|borderStyle=solid}
      long retryInterval = conf.getInt(JobContext.MR_JOB_END_RETRY_INTERVAL, 30000);
{code}
{code:title=.../v2/app/JobEndNotifier.java  (Usage #2)|borderStyle=solid}
    conf.getInt(MRJobConfig.MR_JOB_END_RETRY_INTERVAL, 5000)
{code}
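
The arithmetic behind the values listed above can be checked with a small sketch. {{DefaultsAudit}} and {{Entry}} are hypothetical names introduced only for illustration; the numbers are copied from this report and are not read from a Hadoop installation or from mapred-default.xml itself.

```java
import java.util.ArrayList;
import java.util.List;

public class DefaultsAudit {
    // One row per (property, call site): documented value vs. in-code fallback.
    static final class Entry {
        final String key;
        final double documented; // value quoted from mapred-default.xml in the report
        final double inCode;     // fallback expression quoted from the source excerpts

        Entry(String key, double documented, double inCode) {
            this.key = key;
            this.documented = documented;
            this.inCode = inCode;
        }

        boolean mismatched() {
            return documented != inCode;
        }
    }

    // The seven properties from the report; the last one has two call sites.
    static List<Entry> reported() {
        List<Entry> e = new ArrayList<>();
        e.add(new Entry("mapreduce.reduce.shuffle.fetch.retry.timeout-ms", 30000, 3 * 60 * 1000));
        e.add(new Entry("mapreduce.shuffle.ssl.file.buffer.size", 65536, 60 * 1024));
        e.add(new Entry("mapreduce.reduce.shuffle.merge.percent", 0.66, 0.90));
        e.add(new Entry("mapreduce.task.timeout", 600000, 5 * 60 * 1000));
        e.add(new Entry("mapreduce.task.io.sort.factor", 10, 100));
        e.add(new Entry("mapreduce.job.end-notification.max.attempts", 5, 1));
        e.add(new Entry("mapreduce.job.end-notification.retry.interval", 1000, 30000));
        e.add(new Entry("mapreduce.job.end-notification.retry.interval", 1000, 5000));
        return e;
    }

    static long countMismatches(List<Entry> entries) {
        return entries.stream().filter(Entry::mismatched).count();
    }

    public static void main(String[] args) {
        for (Entry e : reported()) {
            System.out.println(e.key + ": documented=" + e.documented
                    + ", code=" + e.inCode
                    + (e.mismatched() ? "  <-- mismatch" : ""));
        }
        System.out.println("mismatches: " + countMismatches(reported()));
    }
}
```

Writing the fallbacks as the original expressions ({{3 * 60 * 1000}}, {{60 * 1024}}, {{5 * 60 * 1000}}) makes it easy to confirm that they evaluate to {{180000}}, {{61440}} and {{300000}} respectively.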


> A number of inconsistent default configuration values
> -----------------------------------------------------
>
>                 Key: MAPREDUCE-6582
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6582
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: documentation
>    Affects Versions: 2.7.1
>            Reporter: Tianyin Xu



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)