You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tez.apache.org by "Jonathan Turner Eagles (Jira)" <ji...@apache.org> on 2021/08/12 14:26:00 UTC
[jira] [Resolved] (TEZ-2262) DAG/Tasks should not fail if counter
limits are exceeded.
[ https://issues.apache.org/jira/browse/TEZ-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Turner Eagles resolved TEZ-2262.
-----------------------------------------
Resolution: Incomplete
Creating an option to make counters limits exceeded non-fatal is valid. This can be achieved by setting a very high limit as a work around. Closing as incomplete
> DAG/Tasks should not fail if counter limits are exceeded.
> ---------------------------------------------------------
>
> Key: TEZ-2262
> URL: https://issues.apache.org/jira/browse/TEZ-2262
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Mostafa Mokhtar
> Priority: Major
>
> Running TPC-DS Q64 failed due to exceeding the max number of counters.
> DAG should succeed and include a warning in the diagnostics stating that the error got truncated.
> {code}
> 18043560327-2015-04-01 16:23:08,509 INFO [AsyncDispatcher event handler] impl.DAGImpl: No output committers for vertex: Reducer 9
> 18043560445-2015-04-01 16:23:08,857 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher: Error in dispatcher thread
> 18043560557:org.apache.tez.common.counters.LimitExceededException: Too many counters: 1201 max=1200
> 18043560645- at org.apache.tez.common.counters.Limits.checkCounters(Limits.java:87)
> 18043560717- at org.apache.tez.common.counters.Limits.incrCounters(Limits.java:94)
> 18043560788- at org.apache.tez.common.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:75)
> 18043560885- at org.apache.tez.common.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:92)
> 18043560986- at org.apache.tez.common.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:103)
> 18043561085- at org.apache.tez.common.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:198)
> 18043561188- at org.apache.tez.common.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:363)
> 18043561283- at org.apache.tez.dag.app.dag.impl.DAGImpl.incrTaskCounters(DAGImpl.java:598)
> 18043561362- at org.apache.tez.dag.app.dag.impl.DAGImpl.getAllCounters(DAGImpl.java:588)
> 18043561439- at org.apache.tez.dag.app.dag.impl.DAGImpl.logJobHistoryFinishedEvent(DAGImpl.java:994)
> 18043561528- at org.apache.tez.dag.app.dag.impl.DAGImpl.finished(DAGImpl.java:1135)
> 18043561600- at org.apache.tez.dag.app.dag.impl.DAGImpl.checkDAGForCompletion(DAGImpl.java:1048)
> 18043561685- at org.apache.tez.dag.app.dag.impl.DAGImpl$VertexCompletedTransition.transition(DAGImpl.java:1708)
> 18043561785- at org.apache.tez.dag.app.dag.impl.DAGImpl$VertexCompletedTransition.transition(DAGImpl.java:1665)
> 18043561885- at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> 18043562001- at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> 18043562097- at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> 18043562190- at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> 18043562307- at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:944)
> 18043562376- at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:126)
> 18043562445- at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1686)
> 18043562535- at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1677)
> 18043562625- at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> 18043562709- at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> 18043562790- at java.lang.Thread.run(Thread.java:745)
> 18043562832-2015-04-01 16:23:08,882 INFO [AsyncDispatcher event handler] event.AsyncDispatcher: Exiting, bbye..
> 18043562932-2015-04-01 16:23:08,885 INFO [Thread-1] app.DAGAppMaster: DAGAppMasterShutdownHook invoked
> 18043563023-2015-04-01 16:23:08,885 INFO [Thread-1] app.DAGAppMaster: DAGAppMaster received a signal. Signaling TaskScheduler
> 18043563137-2015-04-01 16:23:08,885 INFO [Thread-1] rm.TaskSchedulerEventHandler: TaskScheduler notified that iSignalled was : true
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)