Posted to issues@tez.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2016/02/25 01:34:18 UTC

[jira] [Commented] (TEZ-3137) Tez task failed with illegal state exception in recovery

    [ https://issues.apache.org/jira/browse/TEZ-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166424#comment-15166424 ] 

Jeff Zhang commented on TEZ-3137:
---------------------------------

StartWhileInitializingTransition is meant for a root vertex, where V_START is sent from DAGAppMaster, but that assumption does not hold in recovery. In recovery, each vertex recovers its own state and sends V_START to itself.

See VertexImpl#RecoverTransition
{noformat}
switch (vertex.recoveredState) {
  case NEW:
    // Drop all root events if not inited properly
    Iterator<TezEvent> iterator = vertex.recoveredEvents.iterator();
    while (iterator.hasNext()) {
      if (iterator.next().getEventType().equals(
          EventType.ROOT_INPUT_DATA_INFORMATION_EVENT)) {
        iterator.remove();
      }
    }
    // Trigger init if all sources initialized
    if (vertex.numInitedSourceVertices == vertex.getInputVerticesCount()) {
      vertex.eventHandler.handle(new VertexEvent(vertex.vertexId,
          VertexEventType.V_INIT));
    }
    if (vertex.numStartedSourceVertices == vertex.getInputVerticesCount()) {
      vertex.eventHandler.handle(new VertexEvent(vertex.vertexId,
          VertexEventType.V_START));
    }
    endState = VertexState.NEW;
{noformat}
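
To make the sequence concrete, here is a rough, self-contained toy model of it. It is not Tez code: ToyVertex, recover(), handle() and drain() are hypothetical names, and the guard in the INITIALIZING branch is my assumption about what the precondition in StartWhileInitializingTransition checks (that the vertex has no source vertices). It only illustrates how a vertex that sends both V_INIT and V_START to itself from the NEW branch can end up handling V_START while it is still INITIALIZING.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class RecoverySketch {
    enum State { NEW, INITIALIZING }
    enum EventType { V_INIT, V_START }

    /** Toy stand-in for a vertex; hypothetical, not the real VertexImpl. */
    static class ToyVertex {
        final String name;
        final int inputVertexCount;   // number of source vertices
        int numInitedSourceVertices;
        int numStartedSourceVertices;
        State state = State.NEW;
        boolean startSignalPending = false;
        final Queue<EventType> queue = new ArrayDeque<>();

        ToyVertex(String name, int inputVertexCount) {
            this.name = name;
            this.inputVertexCount = inputVertexCount;
        }

        /** Mirrors the NEW branch quoted above: the vertex queues events to itself. */
        void recover() {
            if (numInitedSourceVertices == inputVertexCount) {
                queue.add(EventType.V_INIT);
            }
            if (numStartedSourceVertices == inputVertexCount) {
                queue.add(EventType.V_START);
            }
        }

        void handle(EventType event) {
            switch (state) {
                case NEW:
                    if (event == EventType.V_INIT) {
                        state = State.INITIALIZING;
                    }
                    break;
                case INITIALIZING:
                    if (event == EventType.V_START) {
                        // ASSUMPTION: the real precondition only accepts a start
                        // event here for a vertex without source vertices.
                        if (inputVertexCount != 0) {
                            throw new IllegalStateException(
                                "Vertex: " + name + " got invalid start event");
                        }
                        startSignalPending = true;
                    }
                    break;
            }
        }

        /** Dispatches queued events in order, like a single dispatcher thread. */
        void drain() {
            EventType event;
            while ((event = queue.poll()) != null) {
                handle(event);
            }
        }
    }

    public static void main(String[] args) {
        // A recovered vertex with one source vertex that is already inited and started.
        ToyVertex vertex = new ToyVertex("Map 31", 1);
        vertex.numInitedSourceVertices = 1;
        vertex.numStartedSourceVertices = 1;
        vertex.recover();
        try {
            vertex.drain();
            System.out.println("no error");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this toy model, a vertex with zero source vertices passes through the same path without error, which matches the root-vertex case the transition was written for. A recovered vertex with source vertices hits the exception.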

> Tez task failed with illegal state exception in recovery
> --------------------------------------------------------
>
>                 Key: TEZ-3137
>                 URL: https://issues.apache.org/jira/browse/TEZ-3137
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>
> {noformat}
> 2016-02-19 02:33:18,917 [INFO] [Dispatcher thread {Central}] |impl.VertexImpl|: vertex_1455323976018_32442_1_20 [Map 31] transitioned from NEW to INITIALIZING due to event V_INIT
> 2016-02-19 02:33:18,917 [INFO] [InputInitializer {Map 31} #0] |dag.RootInputInitializerManager|: Starting InputInitializer for Input: web_sales on vertex vertex_1455323976018_32442_1_20 [Map 31]
> 2016-02-19 02:33:18,917 [ERROR] [Dispatcher thread {Central}] |impl.VertexImpl|: Uncaught Exception when handling event V_START on vertex Map 31 with vertexId vertex_1455323976018_32442_1_20 at current state INITIALIZING
> java.lang.IllegalStateException: Vertex: vertex_1455323976018_32442_1_20 [Map 31] got invalid start event
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl$StartWhileInitializingTransition.transition(VertexImpl.java:3608)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl$StartWhileInitializingTransition.transition(VertexImpl.java:3600)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1862)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:201)
>     at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1978)
>     at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1964)
>     at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>     at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}


