Posted to common-user@hadoop.apache.org by Jay Vyas <ja...@gmail.com> on 2012/08/18 17:01:24 UTC

resetting conf/ parameters in a live cluster.

Hi guys:

I've reset my max counters as follows:

./hadoop-site.xml:
 <property>
   <name>mapreduce.job.counters.limit</name>
   <value>15000</value>
 </property>

However, a job is failing at the very end (after the reducers reach
100%!) due to an exceeded counter limit.  I've confirmed in my code
that the correct counter parameter is indeed being set.

My hypothesis: somehow the name node's counters parameter is effectively
being transferred to the slaves... BUT the name node *itself* hasn't updated
its maximum counter allowance, so it throws an exception at the end of the
job. That is, the dying message from Hadoop is

" max counter limit 120 exceeded.... "

I've confirmed in my job that the counter parameter is correct when the
job starts... however, the "120 limit exceeded" exception is
still thrown.
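To make that failure mode concrete, here is a small self-contained simulation. This is plain Java, not Hadoop API code; the CounterLimitDemo class, its method names, and the hard-coded 120 are all illustrative. It models a daemon that enforces the limit it loaded at startup and deliberately ignores the limit the job asked for, which matches the behavior described above:

```java
// Illustrative simulation only -- NOT Hadoop code. It models why a job-side
// limit of 15000 cannot help when the daemon enforcing the limit still has
// the old value of 120 loaded from its startup configuration.
import java.util.HashMap;
import java.util.Map;

public class CounterLimitDemo {

    // The limit the daemon read from its own config when it started.
    static final int DAEMON_LIMIT = 120;

    // jobRequestedLimit is what the submitted job asked for; the check
    // deliberately ignores it, mirroring the symptom seen in this thread.
    static void incrementCounter(Map<String, Long> counters,
                                 String name, int jobRequestedLimit) {
        if (!counters.containsKey(name) && counters.size() >= DAEMON_LIMIT) {
            throw new IllegalStateException(
                "max counter limit " + DAEMON_LIMIT + " exceeded");
        }
        counters.merge(name, 1L, Long::sum);
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        try {
            for (int i = 0; i < 200; i++) {
                incrementCounter(counters, "counter-" + i, 15000);
            }
        } catch (IllegalStateException e) {
            // Throws at the 121st distinct counter, despite the job
            // having requested a limit of 15000.
            System.out.println(e.getMessage());
        }
    }
}
```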

This is on Elastic MapReduce, Hadoop 0.20.205.

-- 
Jay Vyas
MMSB/UCHC

Re: resetting conf/ parameters in a live cluster.

Posted by Harsh J <ha...@cloudera.com>.
No, you will need to restart the TaskTracker for it to take effect.
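On a 0.20-era tarball install, the restarts would look something like this (the bin/hadoop-daemon.sh script and stock service names are assumed here; EMR's own init scripts may differ):

```shell
# On each slave, after updating its conf/hadoop-site.xml:
bin/hadoop-daemon.sh stop tasktracker
bin/hadoop-daemon.sh start tasktracker

# On the master, so the JobTracker also picks up the new limit:
bin/hadoop-daemon.sh stop jobtracker
bin/hadoop-daemon.sh start jobtracker
```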

-- 
Harsh J

Re: resetting conf/ parameters in a live cluster.

Posted by Jay Vyas <ja...@gmail.com>.
Hmmm... I wonder if there is a way to push conf/*.xml parameters out to all
the slaves, maybe at runtime?
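One hedged sketch of pushing the conf directory out from the master (this assumes passwordless ssh, a conf/slaves file listing the slave hostnames, and an illustrative install path -- and, per the answer elsewhere in this thread, the daemons still have to be restarted to pick the files up):

```shell
# Sync the master's conf/ directory to every slave listed in conf/slaves.
# /path/to/hadoop is a placeholder for the actual install location.
for host in $(cat conf/slaves); do
  rsync -a conf/ "$host:/path/to/hadoop/conf/"
done
```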

-- 
Jay Vyas
MMSB/UCHC

Re: resetting conf/ parameters in a live cluster.

Posted by Harsh J <ha...@cloudera.com>.
Jay,

Oddly, the counter limit change (an increase, anyway) needs to be
applied at the JT, the TTs, and *also* at the client to take real effect.
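In practice that means the property below has to be present in the conf/ directory read by all three places (the JobTracker host, every TaskTracker host, and the submitting client) before the daemons start; the 15000 value is just the figure from this thread, and the exact file name depends on your layout:

```xml
<!-- conf/hadoop-site.xml (or mapred-site.xml, depending on layout) on the
     JobTracker, every TaskTracker, and the client submitting the job. -->
<property>
  <name>mapreduce.job.counters.limit</name>
  <value>15000</value>
</property>
```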

-- 
Harsh J