Posted to mapreduce-user@hadoop.apache.org by John Armstrong <jo...@ccri.com> on 2011/06/21 14:42:29 UTC

When is mapred-site.xml read?

One of my colleagues and I have a little confusion between us as to
exactly when mapred-site.xml is read.  The pages on hadoop.apache.org don't
seem to specify it very clearly.

One position is that mapred-site.xml is read by the daemon processes at
startup, and so changing a parameter in mapred-site.xml requires bouncing
the daemons.

The other position is that an individual job's launcher reads the local
mapred-site.xml when creating its baseline configuration, so changing a
parameter in mapred-site.xml will take effect immediately with the next job
launched.

Thanks for any help you can give improving our understanding of how these
configuration files work.

Re: When is mapred-site.xml read?

Posted by Alex Kozlov <al...@cloudera.com>.
*keep.failed.task.files* is also set by the client, as are the HDFS block
size, the replication level, *io.sort.{mb,factor}*, etc.
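
To illustrate what "set by the client" means here, the following is a toy
model only, not Hadoop code: java.util.Properties stands in for Hadoop's
Configuration class, and the names PerJobOverrideSketch and jobConf are
made up for this sketch. It assumes the job configuration is built on the
submitting machine, with the local site file as the fallback and explicit
per-job settings taking precedence.

```java
import java.util.Properties;

// Toy model: java.util.Properties stands in for Hadoop's Configuration.
// PerJobOverrideSketch and jobConf are hypothetical names.
class PerJobOverrideSketch {

    // Build a job configuration on the client: values from the client's
    // site file are the fallback, explicit per-job settings win.
    static Properties jobConf(Properties clientSite, String... keyValues) {
        Properties conf = new Properties(clientSite); // clientSite acts as defaults
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            conf.setProperty(keyValues[i], keyValues[i + 1]);
        }
        return conf;
    }

    public static void main(String[] args) {
        Properties clientSite = new Properties();
        clientSite.setProperty("keep.failed.task.files", "false");
        clientSite.setProperty("io.sort.mb", "100");
        clientSite.setProperty("dfs.replication", "3");

        // One job asks to keep failed task files and use a larger sort buffer.
        Properties job = jobConf(clientSite,
                "keep.failed.task.files", "true",
                "io.sort.mb", "200");

        System.out.println(job.getProperty("keep.failed.task.files")); // overridden for this job
        System.out.println(job.getProperty("io.sort.mb"));             // overridden for this job
        System.out.println(job.getProperty("dfs.replication"));        // falls back to the site file
    }
}
```

In this model nothing on the daemon side is touched: the override lives
only in the configuration that travels with that one job.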

On Tue, Jun 21, 2011 at 7:15 AM, John Armstrong <jo...@ccri.com> wrote:

> On Tue, 21 Jun 2011 06:37:50 -0700, Alex Kozlov <al...@cloudera.com>
> wrote:
> > However, the job's tasks are executed in a separate JVM and some
> > of the parameters, like max heap from *mapred.java.child.opts*, are set
> > during the job execution.  In this case the parameter is coming from the
> > client side where the whole configuration is serialized and passed to the
> > slave TT nodes.  It is never read from the TT node's *-site.xml.  I don't
> > know a comprehensive list of parameters that can be changed by the client
> > side, but most of the parameters are set during the daemon startup.
>
> mapred.java.child.opts was the one that led one of us to believe that
> changes in mapred-site.xml take effect for all newly-launched jobs.
>
> What about keep.failed.task.files? Do you know if that one needs a restart
> of JT or TTs?
>

Re: When is mapred-site.xml read?

Posted by John Armstrong <jo...@ccri.com>.
On Tue, 21 Jun 2011 06:37:50 -0700, Alex Kozlov <al...@cloudera.com>
wrote:
> However, the job's tasks are executed in a separate JVM and some
> of the parameters, like max heap from *mapred.java.child.opts*, are set
> during the job execution.  In this case the parameter is coming from the
> client side where the whole configuration is serialized and passed to the
> slave TT nodes.  It is never read from the TT node's *-site.xml.  I don't
> know a comprehensive list of parameters that can be changed by the client
> side, but most of the parameters are set during the daemon startup.

mapred.java.child.opts was the one that led one of us to believe that
changes in mapred-site.xml take effect for all newly-launched jobs.

What about keep.failed.task.files? Do you know if that one needs a restart
of JT or TTs?



Re: When is mapred-site.xml read?

Posted by Alex Kozlov <al...@cloudera.com>.
Hi John, You are right: the *-site.xml files are read by daemons on
startup.  However, the job's tasks are executed in a separate JVM and some
of the parameters, like max heap from *mapred.java.child.opts*, are set
during the job execution.  In this case the parameter is coming from the
client side where the whole configuration is serialized and passed to the
slave TT nodes.  It is never read from the TT node's *-site.xml.  I don't
know a comprehensive list of parameters that can be changed by the client
side, but most of the parameters are set during the daemon startup.
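
The timing difference described above can be sketched with a toy model.
This is not Hadoop code: java.util.Properties stands in for Hadoop's
Configuration class, and ConfigTimingSketch and snapshot are hypothetical
names. Note that Hadoop releases of this era spell the heap property
mapred.child.java.opts, so the sketch uses that spelling. It also uses
mapred.tasktracker.map.tasks.maximum as an example of a daemon-side
setting, and for simplicity lets one Properties object play the site file
on both the TT node and the client.

```java
import java.util.Properties;

// Toy model: a "snapshot" is what a process holds after reading the site file.
// ConfigTimingSketch and snapshot are hypothetical names.
class ConfigTimingSketch {

    // Copy the site file contents as they are at the moment of the call.
    static Properties snapshot(Properties siteFile) {
        Properties copy = new Properties();
        copy.putAll(siteFile);
        return copy;
    }

    public static void main(String[] args) {
        Properties siteFile = new Properties();
        siteFile.setProperty("mapred.tasktracker.map.tasks.maximum", "2");
        siteFile.setProperty("mapred.child.java.opts", "-Xmx200m");

        // The TaskTracker daemon reads the file once, at startup.
        Properties taskTracker = snapshot(siteFile);

        // Someone edits the site file while the daemon keeps running.
        siteFile.setProperty("mapred.tasktracker.map.tasks.maximum", "4");
        siteFile.setProperty("mapred.child.java.opts", "-Xmx512m");

        // The next job's client reads the file at submission time and
        // ships the whole configuration with the job.
        Properties job = snapshot(siteFile);

        // Daemon-side setting: still the old value until the daemon restarts.
        System.out.println(taskTracker.getProperty("mapred.tasktracker.map.tasks.maximum"));
        // Per-job setting: the child JVM takes it from the shipped job config.
        System.out.println(job.getProperty("mapred.child.java.opts"));
    }
}
```

Under these assumptions both positions in the original question hold, each
for a different class of parameter.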

On Tue, Jun 21, 2011 at 5:42 AM, John Armstrong <jo...@ccri.com> wrote:

> One of my colleagues and I have a little confusion between us as to
> exactly when mapred-site.xml is read.  The pages on hadoop.apache.org don't
> seem to specify it very clearly.
>
> One position is that mapred-site.xml is read by the daemon processes at
> startup, and so changing a parameter in mapred-site.xml requires bouncing
> the daemons.
>
> The other position is that an individual job's launcher reads the local
> mapred-site.xml when creating its baseline configuration, so changing a
> parameter in mapred-site.xml will take effect immediately with the next job
> launched.
>
> Thanks for any help you can give improving our understanding of how these
> configuration files work.
>