Posted to hdfs-user@hadoop.apache.org by Akira AJISAKA <aj...@oss.nttdata.co.jp> on 2014/04/15 16:21:25 UTC

Re: Update interval of default counters

Moved to user@hadoop.apache.org.

You can configure the interval by setting the
"mapreduce.client.progressmonitor.pollinterval" parameter.
The default value is 1000 ms.

For more details, please see 
http://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml.
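
As a minimal sketch, assuming the property is overridden in the
client-side mapred-site.xml (the value 500 is only an illustration,
not a recommendation):

```xml
<property>
  <name>mapreduce.client.progressmonitor.pollinterval</name>
  <value>500</value>
</property>
```

Jobs launched through ToolRunner can usually take the same override on
the command line as -D mapreduce.client.progressmonitor.pollinterval=500.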

Regards,
Akira

(2014/04/15 15:29), Dharmesh Kakadia wrote:
> Hi,
>
> What is the update interval of inbuilt framework counters? Is that
> configurable?
> I am trying to collect very fine grained information about the job
> execution and using counters for that. It would be great if someone can
> point me to documentation/code for it. Thanks in advance.
>
> Thanks,
> Dharmesh
>


Re: Update interval of default counters

Posted by Akira AJISAKA <aj...@oss.nttdata.co.jp>.
I think the interval is hard-coded to protect the Hadoop cluster
from heavy network traffic. If the value is too small, there is
too much network traffic between the Map/Reduce tasks and the MRAppMaster.

Please also see https://issues.apache.org/jira/browse/MAPREDUCE-4381.

That's why you need to be very careful if you really want to change
the value.

The source code is at
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java
(lines 532-533):

   /** The number of milliseconds between progress reports. */
   public static final int PROGRESS_INTERVAL = 3000;
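
To make the traffic concern concrete, here is a rough back-of-the-envelope
sketch. It assumes every running task sends exactly one progress report per
interval; the class name ReportRate and the figure of 1000 concurrent tasks
are made up for illustration and are not part of Hadoop.

```java
// Rough estimate of progress-report load on the MRAppMaster.
// Assumption: each running task sends one report per interval.
final class ReportRate {
    /** Reports per second arriving from all running tasks combined. */
    static double reportsPerSecond(int runningTasks, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        return runningTasks * 1000.0 / intervalMs;
    }

    public static void main(String[] args) {
        int runningTasks = 1000; // hypothetical number of concurrent tasks
        for (long intervalMs : new long[] {3000, 1000, 100}) {
            System.out.printf("interval=%d ms -> %.0f reports/s%n",
                    intervalMs, reportsPerSecond(runningTasks, intervalMs));
        }
    }
}
```

Under these assumptions, shrinking the interval from 3000 ms to 100 ms
multiplies the number of reports the MRAppMaster must handle by 30.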

Regards,
Akira

(2014/04/16 22:17), Dharmesh Kakadia wrote:
> Hi Akira,
>
> Thanks for the quick reply.
> Any particular reason for hard-coding it? Is there a workaround? I want to
> be able to get the counters as fine as possible. Also can you point me to
> the relevant source code. I am willing to take the issue and contribute if
> required.
>
> Thanks,
> Dharmesh
>
>
> On Wed, Apr 16, 2014 at 3:14 PM, Akira AJISAKA
> <aj...@oss.nttdata.co.jp>wrote:
>
>> Moved mapreduce-dev@ to Bcc.
>>
>> Hi Dharmesh,
>>
>> The parameter is to set the interval of polling the progress
>> of the MRAppMaster, not the Map/Reduce tasks. The tasks send
>> the progress (includes the counter information) to MRAppMaster
>> every 3000 milliseconds, which is hard-coded.
>>
>> That's why a sudden big change in counter values happens
>> even if the parameter is set to a small value.
>>
>> Regards,
>> Akira
>>
>>
>> (2014/04/16 15:42), Dharmesh Kakadia wrote:
>>
>>> Hi Akira,
>>>
>>> Thanks for the reply, but as I understand this is the interval of console
>>> counter printing. What I am trying to get
>>>
>>> while(!job.isComplete()){
>>>    getcounters() and do some processing on that.
>>> }
>>>
>>> Now this is running fine, but I get the same counter values
>>> repeatedly and then suddenly a big change in counter values.
>>> For example, getcounters for REDUCE_INPUT_RECORDS returns values like
>>>
>>> 0
>>> 0
>>> ..
>>> 0
>>> 280
>>> 280
>>> ...
>>> 280
>>> 516
>>> 516
>>> ...
>>> 516
>>>
>>> etc.
>>>
>>> I want to get finer values, instead of directly jumping from 280 to
>>> 516.
>>> Did that make sense? mapreduce.client.progressmonitor.pollinterval does
>>> not seem to affect it. Any workaround?
>>>
>>> Thanks,
>>> Dharmesh
>>>
>>>
>>>
>>>
>>> On Tue, Apr 15, 2014 at 7:51 PM, Akira AJISAKA
>>> <aj...@oss.nttdata.co.jp>wrote:
>>>
>>>   Moved to user@hadoop.apache.org.
>>>>
>>>> You can configure the interval by setting
>>>> "mapreduce.client.progressmonitor.pollinterval" parameter.
>>>> The default value is 1000 ms.
>>>>
>>>> For more details, please see http://hadoop.apache.org/docs/
>>>> stable/hadoop-mapreduce-client/hadoop-mapreduce-
>>>> client-core/mapred-default.xml.
>>>>
>>>> Regards,
>>>> Akira
>>>>
>>>>
>>>> (2014/04/15 15:29), Dharmesh Kakadia wrote:
>>>>
>>>>   Hi,
>>>>>
>>>>> What is the update interval of inbuilt framework counters? Is that
>>>>> configurable?
>>>>> I am trying to collect very fine grained information about the job
>>>>> execution and using counters for that. It would be great if someone can
>>>>> point me to documentation/code for it. Thanks in advance.
>>>>>
>>>>> Thanks,
>>>>> Dharmesh
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Update interval of default counters

Posted by Dharmesh Kakadia <dh...@gmail.com>.
Hi Akira,

Thanks for the quick reply.
Is there any particular reason for hard-coding it? Is there a workaround? I
want to get the counters at as fine a granularity as possible. Also, can you
point me to the relevant source code? I am willing to take the issue and
contribute if required.

Thanks,
Dharmesh


On Wed, Apr 16, 2014 at 3:14 PM, Akira AJISAKA
<aj...@oss.nttdata.co.jp> wrote:

> Moved mapreduce-dev@ to Bcc.
>
> Hi Dharmesh,
>
> The parameter is to set the interval of polling the progress
> of the MRAppMaster, not the Map/Reduce tasks. The tasks send
> the progress (includes the counter information) to MRAppMaster
> every 3000 milliseconds, which is hard-coded.
>
> That's why a sudden big change in counter values happens
> even if the parameter is set to a small value.
>
> Regards,
> Akira
>
>
> (2014/04/16 15:42), Dharmesh Kakadia wrote:
>
>> Hi Akira,
>>
>> Thanks for the reply, but as I understand this is the interval of console
>> counter printing. What I am trying to get
>>
>> while(!job.isComplete()){
>>   getcounters() and do some processing on that.
>> }
>>
>> Now this is running fine, but I get the same counter values
>> repeatedly and then suddenly a big change in counter values.
>> For example, getcounters for REDUCE_INPUT_RECORDS returns values like
>>
>> 0
>> 0
>> ..
>> 0
>> 280
>> 280
>> ...
>> 280
>> 516
>> 516
>> ...
>> 516
>>
>> etc.
>>
>> I want to get finer values, instead of directly jumping from 280 to
>> 516.
>> Did that make sense? mapreduce.client.progressmonitor.pollinterval does
>> not seem to affect it. Any workaround?
>>
>> Thanks,
>> Dharmesh
>>
>>
>>
>>
>> On Tue, Apr 15, 2014 at 7:51 PM, Akira AJISAKA
>> <aj...@oss.nttdata.co.jp>wrote:
>>
>>  Moved to user@hadoop.apache.org.
>>>
>>> You can configure the interval by setting
>>> "mapreduce.client.progressmonitor.pollinterval" parameter.
>>> The default value is 1000 ms.
>>>
>>> For more details, please see http://hadoop.apache.org/docs/
>>> stable/hadoop-mapreduce-client/hadoop-mapreduce-
>>> client-core/mapred-default.xml.
>>>
>>> Regards,
>>> Akira
>>>
>>>
>>> (2014/04/15 15:29), Dharmesh Kakadia wrote:
>>>
>>>  Hi,
>>>>
>>>> What is the update interval of inbuilt framework counters? Is that
>>>> configurable?
>>>> I am trying to collect very fine grained information about the job
>>>> execution and using counters for that. It would be great if someone can
>>>> point me to documentation/code for it. Thanks in advance.
>>>>
>>>> Thanks,
>>>> Dharmesh
>>>>
>>>>
>>>>
>>>
>>
>

Re: Update interval of default counters

Posted by Akira AJISAKA <aj...@oss.nttdata.co.jp>.
Moved mapreduce-dev@ to Bcc.

Hi Dharmesh,

That parameter sets the interval at which the client polls the
MRAppMaster for progress, not the interval at which the Map/Reduce
tasks report. The tasks send their progress (including the counter
information) to the MRAppMaster every 3000 milliseconds, and that
interval is hard-coded.

That's why counter values change in sudden big jumps
even if the parameter is set to a small value.
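
The staircase pattern can be reproduced with a small simulation that does
not use Hadoop at all. It assumes a task that processes records at a steady
rate and publishes its counter only at multiples of the report interval;
StaircaseDemo, observe and the rate of 100 records per second are
hypothetical names and numbers chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simulated time only: no threads, no Hadoop classes.
final class StaircaseDemo {
    /**
     * Values a client sees when it polls every pollMs while the task
     * publishes its counter only every reportMs.
     */
    static List<Long> observe(long pollMs, long reportMs,
                              long recordsPerSecond, long durationMs) {
        List<Long> seen = new ArrayList<>();
        for (long t = 0; t <= durationMs; t += pollMs) {
            // Time of the most recent report at or before this poll.
            long lastReportMs = (t / reportMs) * reportMs;
            // Counter value the task had published at that report.
            seen.add(lastReportMs * recordsPerSecond / 1000);
        }
        return seen;
    }

    public static void main(String[] args) {
        // Poll every 1000 ms against a 3000 ms report interval.
        System.out.println(observe(1000, 3000, 100, 8000));
        // prints [0, 0, 0, 300, 300, 300, 600, 600, 600]

        // Poll every 3000 ms: same information, no repeats.
        System.out.println(observe(3000, 3000, 100, 8000));
        // prints [0, 300, 600]
    }
}
```

In this model, polling faster than the report interval only repeats the
last published value; it never reveals intermediate counts.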

Regards,
Akira

(2014/04/16 15:42), Dharmesh Kakadia wrote:
> Hi Akira,
>
> Thanks for the reply, but as I understand this is the interval of console
> counter printing. What I am trying to get
>
> while(!job.isComplete()){
>   getcounters() and do some processing on that.
> }
>
> Now this is running fine, but I get the same counter values
> repeatedly and then suddenly a big change in counter values.
> For example, getcounters for REDUCE_INPUT_RECORDS returns values like
>
> 0
> 0
> ..
> 0
> 280
> 280
> ...
> 280
> 516
> 516
> ...
> 516
>
> etc.
>
> I want to get finer values, instead of directly jumping from 280 to
> 516.
> Did that make sense? mapreduce.client.progressmonitor.pollinterval does not
> seem to affect it. Any workaround?
>
> Thanks,
> Dharmesh
>
>
>
>
> On Tue, Apr 15, 2014 at 7:51 PM, Akira AJISAKA
> <aj...@oss.nttdata.co.jp>wrote:
>
>> Moved to user@hadoop.apache.org.
>>
>> You can configure the interval by setting
>> "mapreduce.client.progressmonitor.pollinterval" parameter.
>> The default value is 1000 ms.
>>
>> For more details, please see http://hadoop.apache.org/docs/
>> stable/hadoop-mapreduce-client/hadoop-mapreduce-
>> client-core/mapred-default.xml.
>>
>> Regards,
>> Akira
>>
>>
>> (2014/04/15 15:29), Dharmesh Kakadia wrote:
>>
>>> Hi,
>>>
>>> What is the update interval of inbuilt framework counters? Is that
>>> configurable?
>>> I am trying to collect very fine grained information about the job
>>> execution and using counters for that. It would be great if someone can
>>> point me to documentation/code for it. Thanks in advance.
>>>
>>> Thanks,
>>> Dharmesh
>>>
>>>
>>
>


Re: Update interval of default counters

Posted by Dharmesh Kakadia <dh...@gmail.com>.
Hi Akira,

Thanks for the reply, but as I understand it, this is the interval for
printing counters to the console. What I am trying to do is:

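while (!job.isComplete()) {
    // Read the latest value known to the client for one framework counter.
    long records = job.getCounters()
        .findCounter(TaskCounter.REDUCE_INPUT_RECORDS).getValue();
    // ... do some processing on records ...
}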

This runs fine, but I get the same counter values repeatedly and
then suddenly a big change in counter values.
For example, getCounters for REDUCE_INPUT_RECORDS returns values like:

0
0
..
0
280
280
...
280
516
516
...
516

etc.

I want to get finer-grained values instead of jumping directly from 280 to
516.
Does that make sense? mapreduce.client.progressmonitor.pollinterval does not
seem to affect it. Is there any workaround?

Thanks,
Dharmesh




On Tue, Apr 15, 2014 at 7:51 PM, Akira AJISAKA
<aj...@oss.nttdata.co.jp> wrote:

> Moved to user@hadoop.apache.org.
>
> You can configure the interval by setting
> "mapreduce.client.progressmonitor.pollinterval" parameter.
> The default value is 1000 ms.
>
> For more details, please see http://hadoop.apache.org/docs/
> stable/hadoop-mapreduce-client/hadoop-mapreduce-
> client-core/mapred-default.xml.
>
> Regards,
> Akira
>
>
> (2014/04/15 15:29), Dharmesh Kakadia wrote:
>
>> Hi,
>>
>> What is the update interval of inbuilt framework counters? Is that
>> configurable?
>> I am trying to collect very fine grained information about the job
>> execution and using counters for that. It would be great if someone can
>> point me to documentation/code for it. Thanks in advance.
>>
>> Thanks,
>> Dharmesh
>>
>>
>