Posted to mapreduce-dev@hadoop.apache.org by "Gera Shegalov (JIRA)" <ji...@apache.org> on 2014/10/16 22:27:34 UTC

[jira] [Resolved] (MAPREDUCE-6129) Job failed due to counter limit exceeded in MRAppMaster

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gera Shegalov resolved MAPREDUCE-6129.
--------------------------------------
    Resolution: Duplicate

[~kasha], [~coderplay]: yes, this is a subset of MAPREDUCE-5875.

> Job failed due to counter limit exceeded in MRAppMaster
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-6129
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6129
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>    Affects Versions: 3.0.0, 2.3.0, 2.5.0, 2.4.1, 2.5.1
>            Reporter: Min Zhou
>         Attachments: MAPREDUCE-6129.diff
>
>
> Many of our cluster's jobs use more than 120 counters; those jobs fail with an exception like the one below:
> {noformat}
> 2014-10-15 22:55:43,742 WARN [Socket Reader #1 for port 45673] org.apache.hadoop.ipc.Server: Unable to read call parameters for client 10.180.216.12 on connection protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol for rpcKind RPC_WRITABLE
> org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many counters: 121 max=120
> 	at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:103)
> 	at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:110)
> 	at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:175)
> 	at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:324)
> 	at org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:314)
> 	at org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:489)
> 	at org.apache.hadoop.mapred.ReduceTaskStatus.readFields(ReduceTaskStatus.java:140)
> 	at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
> 	at org.apache.hadoop.ipc.WritableRpcEngine$Invocation.readFields(WritableRpcEngine.java:157)
> 	at org.apache.hadoop.ipc.Server$Connection.processRpcRequest(Server.java:1802)
> 	at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1734)
> 	at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1494)
> 	at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:732)
> 	at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:606)
> 	at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:577)
> {noformat}
> The class org.apache.hadoop.mapreduce.counters.Limits builds its JobConf from the mapred-site.xml on the NodeManager node if it has not been initialized yet.
> If mapred-site.xml does not exist on the NodeManager node, or mapreduce.job.counters.max is not defined in that file, org.apache.hadoop.mapreduce.counters.Limits simply falls back to the default value of 120.
> Instead, we should read the user job's configuration rather than the config files on the NodeManager when checking counter limits.
> I will submit a patch later.
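As the description implies, overriding mapreduce.job.counters.max only in the job's own configuration may not help here, since Limits initializes from the node-local mapred-site.xml. A minimal sketch of the cluster-side workaround until this is fixed: raise the limit in mapred-site.xml on the NodeManager nodes. The value 500 below is an arbitrary example, not a recommendation; the shipped default is 120.

{noformat}
<!-- mapred-site.xml on each NodeManager node (sketch; 500 is an
     arbitrary example value, the default limit is 120) -->
<property>
  <name>mapreduce.job.counters.max</name>
  <value>500</value>
</property>
{noformat}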



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)