Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2015/05/07 02:58:00 UTC

[jira] [Commented] (TEZ-2429) Tez AM does not die after hitting internal error

    [ https://issues.apache.org/jira/browse/TEZ-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531810#comment-14531810 ] 

Hitesh Shah commented on TEZ-2429:
----------------------------------

Logs show that a new DAG is submitted after the internal error and continues to run.



> Tez AM does not die after hitting internal error 
> -------------------------------------------------
>
>                 Key: TEZ-2429
>                 URL: https://issues.apache.org/jira/browse/TEZ-2429
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Hitesh Shah
>            Priority: Blocker
>
> From https://builds.apache.org/job/Tez-Build/1055/: 
> 2015-05-06 23:55:54,421 ERROR [Dispatcher thread: Central] impl.DAGImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: DAG_VERTEX_RERUNNING at SUCCEEDED
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> 	at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:1079)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:143)
> 	at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1871)
> 	at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1862)
> 	at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> 	at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
> 	at java.lang.Thread.run(Thread.java:662)
> 2015-05-06 23:55:54,423 INFO [Dispatcher thread: Central] app.DAGAppMaster: Cleaning up DAG: name=testRandomFailingInputs, with id=dag_1430956448478_0001_16
> 2015-05-06 23:55:54,423 INFO [Dispatcher thread: Central] app.DAGAppMaster: Completed cleanup for DAG: name=testRandomFailingInputs, with id=dag_1430956448478_0001_16
> 2015-05-06 23:55:54,424 INFO [Dispatcher thread: Central] impl.DAGImpl: dag_1430956448478_0001_16 terminating due to internal error
> 2015-05-06 23:55:54,433 INFO [IPC Server handler 0 on 47432] app.DAGAppMaster: Starting DAG submitted via RPC: testBasicInputFailureWithExit
> 2015-05-06 23:55:54,455 ERROR [Dispatcher thread: Central] common.AsyncDispatcher: Error in dispatcher thread
> java.lang.NullPointerException
> 	at org.apache.tez.dag.history.recovery.RecoveryService.doFlush(RecoveryService.java:458)
> 	at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:289)
> 	at org.apache.tez.dag.history.HistoryEventHandler.handleCriticalEvent(HistoryEventHandler.java:102)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl.logJobHistoryUnsuccesfulEvent(DAGImpl.java:1161)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl.finished(DAGImpl.java:1275)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl.access$2600(DAGImpl.java:144)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl$InternalErrorTransition.transition(DAGImpl.java:2151)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl$InternalErrorTransition.transition(DAGImpl.java:2140)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> 	at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:1079)
> 	at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:143)
> 	at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1871)
> 	at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1862)
> 	at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> 	at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
> 	at java.lang.Thread.run(Thread.java:662)
> 2015-05-06 23:55:54,456 INFO [Dispatcher thread: Central] impl.VertexImpl: Killing tasks in vertex: vertex_1430956448478_0001_16_10 [l4v1] due to trigger: INTERNAL_ERROR
> 2015-05-06 23:55:54,456 INFO [Dispatcher thread: Central] impl.VertexImpl: vertex_1430956448478_0001_16_10 [l4v1] transitioned from RUNNING to TERMINATING due to event V_TERMINATE
> 2015-05-06 23:55:54,456 INFO [AsyncDispatcher ShutDown handler] common.AsyncDispatcher: Exiting, bbye..



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)