Posted to issues@tez.apache.org by "Mostafa Mokhtar (JIRA)" <ji...@apache.org> on 2015/04/02 00:10:53 UTC

[jira] [Created] (TEZ-2262) Tez : Catch counters.LimitExceededException and don't fail the DAG

Mostafa Mokhtar created TEZ-2262:
------------------------------------

             Summary: Tez : Catch counters.LimitExceededException and don't fail the DAG
                 Key: TEZ-2262
                 URL: https://issues.apache.org/jira/browse/TEZ-2262
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.5.0
            Reporter: Mostafa Mokhtar
             Fix For: 0.5.0


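Running TPC-DS Q64 failed because the DAG exceeded the maximum number of counters (1201 against a limit of 1200). The LimitExceededException is thrown on the dispatcher thread while the DAG aggregates task counters at completion, which shuts down the DAGAppMaster.
The DAG should succeed instead and include a warning in the diagnostics stating that the counters were truncated.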
{code}
2015-04-01 16:23:08,509 INFO [AsyncDispatcher event handler] impl.DAGImpl: No output committers for vertex: Reducer 9
2015-04-01 16:23:08,857 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher: Error in dispatcher thread
org.apache.tez.common.counters.LimitExceededException: Too many counters: 1201 max=1200
	at org.apache.tez.common.counters.Limits.checkCounters(Limits.java:87)
	at org.apache.tez.common.counters.Limits.incrCounters(Limits.java:94)
	at org.apache.tez.common.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:75)
	at org.apache.tez.common.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:92)
	at org.apache.tez.common.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:103)
	at org.apache.tez.common.counters.AbstractCounterGroup.incrAllCounters(AbstractCounterGroup.java:198)
	at org.apache.tez.common.counters.AbstractCounters.incrAllCounters(AbstractCounters.java:363)
	at org.apache.tez.dag.app.dag.impl.DAGImpl.incrTaskCounters(DAGImpl.java:598)
	at org.apache.tez.dag.app.dag.impl.DAGImpl.getAllCounters(DAGImpl.java:588)
	at org.apache.tez.dag.app.dag.impl.DAGImpl.logJobHistoryFinishedEvent(DAGImpl.java:994)
	at org.apache.tez.dag.app.dag.impl.DAGImpl.finished(DAGImpl.java:1135)
	at org.apache.tez.dag.app.dag.impl.DAGImpl.checkDAGForCompletion(DAGImpl.java:1048)
	at org.apache.tez.dag.app.dag.impl.DAGImpl$VertexCompletedTransition.transition(DAGImpl.java:1708)
	at org.apache.tez.dag.app.dag.impl.DAGImpl$VertexCompletedTransition.transition(DAGImpl.java:1665)
	at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
	at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:944)
	at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:126)
	at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1686)
	at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1677)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
	at java.lang.Thread.run(Thread.java:745)
2015-04-01 16:23:08,882 INFO [AsyncDispatcher event handler] event.AsyncDispatcher: Exiting, bbye..
2015-04-01 16:23:08,885 INFO [Thread-1] app.DAGAppMaster: DAGAppMasterShutdownHook invoked
2015-04-01 16:23:08,885 INFO [Thread-1] app.DAGAppMaster: DAGAppMaster received a signal. Signaling TaskScheduler
2015-04-01 16:23:08,885 INFO [Thread-1] rm.TaskSchedulerEventHandler: TaskScheduler notified that iSignalled was : true
{code}
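
A minimal sketch of the requested behavior, not the actual Tez change: aggregate task counters, catch the limit exception, keep the counters collected so far, and record a diagnostic instead of letting the exception reach the dispatcher. Every class and method below (CounterLimitSketch, BoundedCounters, Result, aggregate, and the local LimitExceededException) is a hypothetical stand-in written for illustration; none of them are Tez classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CounterLimitSketch {

    /** Hypothetical stand-in for Tez's LimitExceededException (not the real class). */
    static class LimitExceededException extends RuntimeException {
        LimitExceededException(String message) {
            super(message);
        }
    }

    /** Hypothetical counter store that rejects new counter names past a fixed limit. */
    static class BoundedCounters {
        private final int max;
        private final Map<String, Long> values = new LinkedHashMap<>();

        BoundedCounters(int max) {
            this.max = max;
        }

        void increment(String name, long delta) {
            // Only a previously unseen counter name can push the count over the limit.
            if (!values.containsKey(name) && values.size() + 1 > max) {
                throw new LimitExceededException(
                    "Too many counters: " + (values.size() + 1) + " max=" + max);
            }
            values.merge(name, delta, Long::sum);
        }

        int size() {
            return values.size();
        }

        Long get(String name) {
            return values.get(name);
        }
    }

    /** Aggregated counters plus any warnings produced while aggregating. */
    static class Result {
        final BoundedCounters counters;
        final List<String> diagnostics = new ArrayList<>();

        Result(BoundedCounters counters) {
            this.counters = counters;
        }
    }

    /** Sums per-task counters; on overflow, stops and records a warning instead of failing. */
    static Result aggregate(List<Map<String, Long>> taskCounters, int max) {
        Result result = new Result(new BoundedCounters(max));
        try {
            for (Map<String, Long> task : taskCounters) {
                for (Map.Entry<String, Long> entry : task.entrySet()) {
                    result.counters.increment(entry.getKey(), entry.getValue());
                }
            }
        } catch (LimitExceededException e) {
            // Keep what was aggregated so far and surface the truncation as a diagnostic.
            result.diagnostics.add("Counters truncated: " + e.getMessage());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> task1 = new LinkedHashMap<>();
        task1.put("RECORDS_IN", 10L);
        task1.put("RECORDS_OUT", 7L);
        Map<String, Long> task2 = new LinkedHashMap<>();
        task2.put("RECORDS_IN", 5L);
        task2.put("SPILLED_RECORDS", 2L);

        Result result = aggregate(Arrays.asList(task1, task2), 2);
        System.out.println("counters kept: " + result.counters.size());
        System.out.println("diagnostics: " + result.diagnostics);
    }
}
```

In this sketch aggregation stops at the first counter that would exceed the limit, so the reported totals may be incomplete; the diagnostic is what tells the user that.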


