You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2015/05/03 00:21:06 UTC

[jira] [Comment Edited] (TEZ-2379) org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_KILLED at KILLED

    [ https://issues.apache.org/jira/browse/TEZ-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525521#comment-14525521 ] 

Bikas Saha edited comment on TEZ-2379 at 5/2/15 10:20 PM:
----------------------------------------------------------

If you still think that ignoring the attempt killed event in the task killed state is the right fix, then please go ahead and make the change. This race condition is not quite relevant since the job is anyways going to killed. However, the attempt killed event would need to be ignored in all task terminal states (success, failed and killed) since the race could happen for any of these transitions if the attempt actually succeeded while the task was trying to kill it. The patch is ignoring it only in the killed state.


was (Author: bikassaha):
If you still think that ignoring the attempt killed event in the task killed state is the right fix, then please go ahead and make the change. This race condition is not quite relevant since the job is anyways going to killed. However, the attempt killed event would need to be ignored in all task terminal states (success, failed and killed) since the race could happen for any of these transitions if the attempt actually succeeded while the task was trying to kill it.

> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_KILLED at KILLED
> ------------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-2379
>                 URL: https://issues.apache.org/jira/browse/TEZ-2379
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Hitesh Shah
>            Priority: Blocker
>         Attachments: TEZ-2379.1.patch
>
>
> {noformat}
> 2015-04-28 04:49:32,455 ERROR [Dispatcher thread: Central] impl.TaskImpl: Can't handle this event at current state for task_1429683757595_0479_1_03_000013
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_KILLED at KILLED
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>         at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:853)
>         at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:106)
>         at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:1874)
>         at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:1860)
>         at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
>         at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:113)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Additional notes:
> ============
> Hive - latest build 
> Tez - master
> tpch-200 gb scale q_17 (kill the job in the middle of execution)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)