Posted to issues@tez.apache.org by "Mike Liddell (JIRA)" <ji...@apache.org> on 2013/07/09 02:53:52 UTC

[jira] [Commented] (TEZ-148) DAG state machine does not handle successfully completing vertices at kill wait

    [ https://issues.apache.org/jira/browse/TEZ-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702713#comment-13702713 ] 

Mike Liddell commented on TEZ-148:
----------------------------------

commit 999d16ae1022884db4a17e2b803abe52ea82fcf0
Author: Mike Liddell <ml...@apache.org>
Date:   Mon Jul 8 17:48:18 2013 -0700

    TEZ-141;TEZ-143;TEZ-148;TEZ-283;TEZ-284: Improve failure/kill handling, track TerminationCauses, add dag.kill()
                
> DAG state machine does not handle successfully completing vertices at kill wait
> -------------------------------------------------------------------------------
>
>                 Key: TEZ-148
>                 URL: https://issues.apache.org/jira/browse/TEZ-148
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Hitesh Shah
>            Assignee: Mike Liddell
>              Labels: TEZ-0.2.0, TEZ-1
>
> A race between a kill and vertices completing successfully does not appear to be handled:
> impl.DAGImpl (DAGImpl.java:handle(592)) - Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: DAG_VERTEX_COMPLETED at KILL_WAIT
>         at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
> checkForCompletion ignores the current KILL_WAIT state and returns a final state of SUCCEEDED. This breaks the state machine, because from KILL_WAIT the DAG can only stay in KILL_WAIT or move to KILLED.
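
For illustration only, the race described above can be modelled with a small standalone sketch. This is not the code from the commit and it does not use the YARN StateMachineFactory; the class KillWaitSketch, the DagState enum, the method signatures and the RUNNING transition set are all assumptions made up for this example. Only the rule that KILL_WAIT may lead to KILL_WAIT or KILLED comes from the issue text. The idea is that the completion check takes the current state into account, so a DAG in KILL_WAIT ends in KILLED even when every vertex succeeded.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, simplified stand-in for the DAG state machine (not the real DAGImpl).
final class KillWaitSketch {

    enum DagState { RUNNING, KILL_WAIT, SUCCEEDED, FAILED, KILLED }

    // Allowed target states. The KILL_WAIT row follows the issue description;
    // the RUNNING row is an assumption made for this sketch.
    static Set<DagState> allowedTargets(DagState current) {
        switch (current) {
            case RUNNING:
                return EnumSet.of(DagState.RUNNING, DagState.KILL_WAIT,
                                  DagState.SUCCEEDED, DagState.FAILED);
            case KILL_WAIT:
                return EnumSet.of(DagState.KILL_WAIT, DagState.KILLED);
            default:
                return EnumSet.noneOf(DagState.class); // terminal states
        }
    }

    // Completion check that respects the current state instead of always
    // reporting SUCCEEDED once all vertices have finished successfully.
    static DagState checkForCompletion(DagState current, int numVertices,
                                       int numCompleted, int numSucceeded) {
        if (numCompleted < numVertices) {
            return current; // still waiting for vertices to finish
        }
        if (current == DagState.KILL_WAIT) {
            return DagState.KILLED; // a kill was requested, so success is not reported
        }
        return numSucceeded == numVertices ? DagState.SUCCEEDED : DagState.FAILED;
    }

    // Handles a "vertex completed" event and validates the resulting transition.
    static DagState onVertexCompleted(DagState current, int numVertices,
                                      int numCompleted, int numSucceeded) {
        DagState next = checkForCompletion(current, numVertices, numCompleted, numSucceeded);
        if (!allowedTargets(current).contains(next)) {
            throw new IllegalStateException("Invalid transition: " + current + " -> " + next);
        }
        return next;
    }

    public static void main(String[] args) {
        // Last vertex succeeds while the DAG is waiting for a kill to finish.
        System.out.println(onVertexCompleted(DagState.KILL_WAIT, 3, 3, 3));
        // Same event during a normal run.
        System.out.println(onVertexCompleted(DagState.RUNNING, 3, 3, 3));
    }
}
```

A completion check that ignored the current state would return SUCCEEDED in the first call, and the transition check would reject it, which mirrors the "Invalid event: DAG_VERTEX_COMPLETED at KILL_WAIT" failure in the trace.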
