Posted to user@oozie.apache.org by Denis Yuen <De...@oicr.on.ca> on 2014/06/20 17:05:38 UTC

Invalid execution path on job rerun

Hi,

We're running into an issue where workflows that fail and have to be re-run (with oozie.wf.rerun.failnodes=true) immediately fail again, with the message "invalid execution path" in the Oozie log.

The consistent pattern we observe involves a workflow with a fork (fork_2) leading to a join (join_2), which in turn leads to another fork (fork_3). If a failure occurs in any of the jobs that fork_3 leads to, the re-run fails immediately. If there is no failure, the workflow executes to completion normally. We've also observed that, of the jobs fork_2 leads to (bash_cp_3, bash_cp_4, bash_cp_6, bash_cp_8, bash_cp_10, bash_cp_12), the apparently invalid execution paths are any of the first five. In other words, Oozie seemingly at random records one of these jobs as the execution_path for join_2 in its wf_actions table; if it is any of the first five, the re-run fails. Only if it is bash_cp_12 does the workflow re-run successfully.
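
The pattern above can be written down as a small check. This is only a sketch of the observation in this post, not of anything Oozie does internally: the helper names fork_paths and rerun_looks_safe are made up for illustration, and the embedded XML is a trimmed copy of fork_2 from the workflow further down.

```python
import xml.etree.ElementTree as ET

# Namespace used by the workflow definition in this post.
NS = {"wf": "uri:oozie:workflow:0.4"}

# Trimmed copy of fork_2 from the workflow below (actions omitted).
WORKFLOW_XML = """\
<workflow-app xmlns="uri:oozie:workflow:0.4" name="HelloWorld">
  <fork name="fork_2">
    <path start="bash_cp_3" />
    <path start="bash_cp_4" />
    <path start="bash_cp_6" />
    <path start="bash_cp_8" />
    <path start="bash_cp_10" />
    <path start="bash_cp_12" />
  </fork>
</workflow-app>
"""


def fork_paths(workflow_xml, fork_name):
    """Return the fork's <path start="..."> targets in declaration order."""
    root = ET.fromstring(workflow_xml)
    for fork in root.findall("wf:fork", NS):
        if fork.get("name") == fork_name:
            return [path.get("start") for path in fork.findall("wf:path", NS)]
    raise KeyError(fork_name)


def rerun_looks_safe(workflow_xml, fork_name, join_execution_path):
    """Heuristic from the observation above (an assumption, not Oozie logic):
    the re-run only succeeds when the join's execution_path names the last
    path declared in the fork."""
    last_declared = fork_paths(workflow_xml, fork_name)[-1]
    return join_execution_path == "/" + last_declared + "/"


# join_2 was recorded on /bash_cp_3/ in the failing run shown below.
print(rerun_looks_safe(WORKFLOW_XML, "fork_2", "/bash_cp_3/"))
print(rerun_looks_safe(WORKFLOW_XML, "fork_2", "/bash_cp_12/"))
```

The second argument to rerun_looks_safe would come from the execution_path column of the join's row in wf_actions.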

Another thing that might be relevant is that we are using a custom action executor that submits to SGE (for legacy reasons). The code is available at https://github.com/SeqWare/oozie-sge/tree/1.0.2 and this is with Oozie version 3.3.2-cdh4.5.0.

Are there any thoughts on whether there is some API call that we're failing to make in our custom action executor that affects execution path?
Are we structuring our workflows in some unexpected manner?
What does the execution path of a control node such as a join mean, anyway?

Thanks for any insight!

Large amounts of text follow ....

Relevant error in log:
2014-06-05 14:06:01,599 DEBUG SignalXCommand:545 - USER[dyuen] GROUP[-] TOKEN[] APP[HelloWorld] JOB[0000000-140605140030484-oozie-oozi-W] ACTION[0000000-140605140030484-oozie-oozi-W@join_2] STARTED SignalCommand for jobid=0000000-140605140030484-oozie-oozi-W, actionId=0000000-140605140030484-oozie-oozi-W@join_2
2014-06-05 14:06:01,600 DEBUG LiteWorkflowInstance:545 - USER[dyuen] GROUP[-] TOKEN[] APP[HelloWorld] JOB[0000000-140605140030484-oozie-oozi-W] ACTION[0000000-140605140030484-oozie-oozi-W@join_2] Signaling job execution path [/bash_cp_3/] signal value [OK]
2014-06-05 14:06:01,600 ERROR LiteWorkflowInstance:536 - USER[dyuen] GROUP[-] TOKEN[] APP[HelloWorld] JOB[0000000-140605140030484-oozie-oozi-W] ACTION[0000000-140605140030484-oozie-oozi-W@join_2] invalid execution path [/bash_cp_3/]
2014-06-05 14:06:01,601  WARN LiteWorkflowInstance:542 - USER[dyuen] GROUP[-] TOKEN[] APP[HelloWorld] JOB[0000000-140605140030484-oozie-oozi-W] ACTION[0000000-140605140030484-oozie-oozi-W@join_2] Workflow completed [FAILED], failing [0] running nodes

Oozie wf_actions table for the relevant workflow:

                               id                               |           name            | signal_value | status |        transition         |     execution_path
----------------------------------------------------------------+---------------------------+--------------+--------+---------------------------+------------------------
 0000000-140605140030484-oozie-oozi-W@:start:                   | :start:                   | OK           | OK     | start_0                   | /
 0000000-140605140030484-oozie-oozi-W@start_0                   | start_0                   | OK           | OK     | provisionFile_file_in_0_1 | /
 0000000-140605140030484-oozie-oozi-W@provisionFile_file_in_0_1 | provisionFile_file_in_0_1 | OK           | OK     | bash_mkdir_2              | /
 0000000-140605140030484-oozie-oozi-W@bash_mkdir_2              | bash_mkdir_2              | OK           | OK     | fork_2                    | /
 0000000-140605140030484-oozie-oozi-W@fork_2                    | fork_2                    | OK           | OK     | *                         | /
 0000000-140605140030484-oozie-oozi-W@bash_cp_3                 | bash_cp_3                 | OK           | OK     | join_2                    | /bash_cp_3/
 0000000-140605140030484-oozie-oozi-W@bash_cp_4                 | bash_cp_4                 | OK           | OK     | join_2                    | /bash_cp_4/
 0000000-140605140030484-oozie-oozi-W@bash_cp_6                 | bash_cp_6                 | OK           | OK     | join_2                    | /bash_cp_6/
 0000000-140605140030484-oozie-oozi-W@bash_cp_8                 | bash_cp_8                 | OK           | OK     | join_2                    | /bash_cp_8/
 0000000-140605140030484-oozie-oozi-W@bash_cp_10                | bash_cp_10                | OK           | OK     | join_2                    | /bash_cp_10/
 0000000-140605140030484-oozie-oozi-W@bash_cp_12                | bash_cp_12                | OK           | OK     | join_2                    | /bash_cp_12/
 0000000-140605140030484-oozie-oozi-W@join_2                    | join_2                    | OK           | OK     | fork_3                    | /bash_cp_3/
 0000000-140605140030484-oozie-oozi-W@fork_3                    | fork_3                    | OK           | OK     | *                         | /
 0000000-140605140030484-oozie-oozi-W@provisionFile_out_5       | provisionFile_out_5       | OK           | OK     | join_3                    | /provisionFile_out_5/
 0000000-140605140030484-oozie-oozi-W@provisionFile_out_7       | provisionFile_out_7       | OK           | OK     | join_3                    | /provisionFile_out_7/
 0000000-140605140030484-oozie-oozi-W@provisionFile_out_9       | provisionFile_out_9       | OK           | OK     | join_3                    | /provisionFile_out_9/
 0000000-140605140030484-oozie-oozi-W@provisionFile_out_11      | provisionFile_out_11      | OK           | OK     | join_3                    | /provisionFile_out_11/
 0000000-140605140030484-oozie-oozi-W@provisionFile_out_13      | provisionFile_out_13      | OK           | OK     | join_3                    | /provisionFile_out_13/
 0000000-140605140030484-oozie-oozi-W@fail                      | fail                      | OK           | OK     |                           | /bash_cp_14/
(19 rows)


The workflow:

<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.4" name="HelloWorld">
  <start to="start_0" />
  <action name="start_0" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/start_0-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/start_0-qsub.opts</options-file>
    </sge>
    <ok to="provisionFile_file_in_0_1" />
    <error to="fail" />
  </action>
  <action name="provisionFile_file_in_0_1" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_file_in_0_1-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_file_in_0_1-qsub.opts</options-file>
    </sge>
    <ok to="bash_mkdir_2" />
    <error to="fail" />
  </action>
  <action name="bash_mkdir_2" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_mkdir_2-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_mkdir_2-qsub.opts</options-file>
    </sge>
    <ok to="fork_2" />
    <error to="fail" />
  </action>
  <fork name="fork_2">
    <path start="bash_cp_3" />
    <path start="bash_cp_4" />
    <path start="bash_cp_6" />
    <path start="bash_cp_8" />
    <path start="bash_cp_10" />
    <path start="bash_cp_12" />
  </fork>
  <action name="bash_cp_3" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_3-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_3-qsub.opts</options-file>
    </sge>
    <ok to="join_2" />
    <error to="fail" />
  </action>
  <action name="bash_cp_4" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_4-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_4-qsub.opts</options-file>
    </sge>
    <ok to="join_2" />
    <error to="fail" />
  </action>
  <action name="bash_cp_6" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_6-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_6-qsub.opts</options-file>
    </sge>
    <ok to="join_2" />
    <error to="fail" />
  </action>
  <action name="bash_cp_8" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_8-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_8-qsub.opts</options-file>
    </sge>
    <ok to="join_2" />
    <error to="fail" />
  </action>
  <action name="bash_cp_10" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_10-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_10-qsub.opts</options-file>
    </sge>
    <ok to="join_2" />
    <error to="fail" />
  </action>
  <action name="bash_cp_12" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_12-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_12-qsub.opts</options-file>
    </sge>
    <ok to="join_2" />
    <error to="fail" />
  </action>
  <join name="join_2" to="fork_3" />
  <fork name="fork_3">
    <path start="bash_cp_14" />
    <path start="provisionFile_out_5" />
    <path start="provisionFile_out_7" />
    <path start="provisionFile_out_9" />
    <path start="provisionFile_out_11" />
    <path start="provisionFile_out_13" />
  </fork>
  <action name="bash_cp_14" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_14-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_14-qsub.opts</options-file>
    </sge>
    <ok to="join_3" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_5" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_5-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_5-qsub.opts</options-file>
    </sge>
    <ok to="join_3" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_7" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_7-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_7-qsub.opts</options-file>
    </sge>
    <ok to="join_3" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_9" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_9-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_9-qsub.opts</options-file>
    </sge>
    <ok to="join_3" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_11" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_11-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_11-qsub.opts</options-file>
    </sge>
    <ok to="join_3" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_13" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_13-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_13-qsub.opts</options-file>
    </sge>
    <ok to="join_3" />
    <error to="fail" />
  </action>
  <join name="join_3" to="fork_4" />
  <fork name="fork_4">
    <path start="bash_cp_15" />
    <path start="bash_cp_17" />
    <path start="bash_cp_19" />
    <path start="bash_cp_21" />
    <path start="bash_cp_23" />
  </fork>
  <action name="bash_cp_15" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_15-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_15-qsub.opts</options-file>
    </sge>
    <ok to="join_4" />
    <error to="fail" />
  </action>
  <action name="bash_cp_17" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_17-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_17-qsub.opts</options-file>
    </sge>
    <ok to="join_4" />
    <error to="fail" />
  </action>
  <action name="bash_cp_19" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_19-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_19-qsub.opts</options-file>
    </sge>
    <ok to="join_4" />
    <error to="fail" />
  </action>
  <action name="bash_cp_21" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_21-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_21-qsub.opts</options-file>
    </sge>
    <ok to="join_4" />
    <error to="fail" />
  </action>
  <action name="bash_cp_23" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_23-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_23-qsub.opts</options-file>
    </sge>
    <ok to="join_4" />
    <error to="fail" />
  </action>
  <join name="join_4" to="fork_5" />
  <fork name="fork_5">
    <path start="provisionFile_out_16" />
    <path start="provisionFile_out_18" />
    <path start="provisionFile_out_20" />
    <path start="provisionFile_out_22" />
    <path start="provisionFile_out_24" />
  </fork>
  <action name="provisionFile_out_16" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_16-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_16-qsub.opts</options-file>
    </sge>
    <ok to="join_5" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_18" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_18-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_18-qsub.opts</options-file>
    </sge>
    <ok to="join_5" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_20" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_20-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_20-qsub.opts</options-file>
    </sge>
    <ok to="join_5" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_22" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_22-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_22-qsub.opts</options-file>
    </sge>
    <ok to="join_5" />
    <error to="fail" />
  </action>
  <action name="provisionFile_out_24" retry-max="5" retry-interval="5">
    <sge xmlns="uri:oozie:sge-action:1.0">
      <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_24-runner.sh</script>
      <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_24-qsub.opts</options-file>
    </sge>
    <ok to="join_5" />
    <error to="fail" />
  </action>
  <join name="join_5" to="done" />
  <join name="join_274314800376896" to="done" />
  <action name="done">
    <fs>
      <delete path="hdfs://localhost:8020/user/dyuen/seqware_workflow/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b" />
    </fs>
    <ok to="end" />
    <error to="fail" />
  </action>
  <kill name="fail">
    <message>Java failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end" />
</workflow-app>


RE: Invalid execution path on job rerun

Posted by Denis Yuen <De...@oicr.on.ca>.
Hi,
I think this is exactly what we encountered. Good timing too on the ticket. 
Thanks for the heads-up!

________________________________________
From: Robert Kanter [rkanter@cloudera.com]
Sent: June 20, 2014 1:39 PM
To: user@oozie.apache.org
Subject: Re: Invalid execution path on job rerun

Hi Denis,

This sounds like it could be OOZIE-1879
<https://issues.apache.org/jira/browse/OOZIE-1879>, which I recently
committed a patch for.  In it, the issue was a workflow where an action
after a fork failed, and on rerun, you'd get an "invalid execution path".
It wasn't easy to track down, but it turns out that during a rerun, Oozie
goes through all of the actions and for a fork, it goes in the order they
are listed in the fork action XML.  But if the forked actions finished in a
different order during the original run, then you'd get this error.  A
workaround would be to list the actions in the fork in the order that
they're likely to complete, but that's probably not really practical.
Otherwise, you'll need OOZIE-1879 to fix this.
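
To make the ordering mismatch concrete, here is a toy model in Python. It is not Oozie code. It assumes (and this is only an assumption that happens to match the wf_actions table in this thread) that a join is recorded on the execution path of the last forked branch to reach it, and that a rerun replays the branches in the order the fork lists them. The completion order used for the original run is hypothetical; the thread only shows that join_2 ended up on /bash_cp_3/.

```python
from itertools import permutations

# fork_2's paths in the order they are listed in the workflow XML.
FORK_2_PATHS = ["bash_cp_3", "bash_cp_4", "bash_cp_6",
                "bash_cp_8", "bash_cp_10", "bash_cp_12"]


def join_path(arrival_order):
    """Toy assumption: the join is recorded on the path of the last branch to arrive."""
    return "/" + arrival_order[-1] + "/"


def rerun_is_consistent(declared_order, completion_order):
    """The replay walks the fork in declared order; the original run recorded
    whatever order the branches actually finished in. They must agree."""
    return join_path(declared_order) == join_path(completion_order)


# Hypothetical original run in which bash_cp_3 happened to finish last,
# which is consistent with join_2 being recorded on /bash_cp_3/.
original_run = ["bash_cp_4", "bash_cp_6", "bash_cp_8",
                "bash_cp_10", "bash_cp_12", "bash_cp_3"]

print(rerun_is_consistent(FORK_2_PATHS, original_run))   # mismatch
print(rerun_is_consistent(FORK_2_PATHS, FORK_2_PATHS))   # same order, no mismatch

# Count how many of the possible completion orders replay cleanly in this model.
safe = sum(rerun_is_consistent(FORK_2_PATHS, list(order))
           for order in permutations(FORK_2_PATHS))
print(safe, "completion orders replay cleanly in this model")
```

In this model only the completion orders that end with the last listed path survive a rerun, which is why reordering the fork is rarely a practical workaround.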

- Robert


On Fri, Jun 20, 2014 at 8:05 AM, Denis Yuen <De...@oicr.on.ca> wrote:

> Hi,
>
> We're running into an issue where workflows that fail and have to be
> re-run (with oozie.wf.rerun.failnodes=true ) immediately fail again with a
> message in the Oozie log "invalid execution path."
>
> The consistent pattern that we observe is that in a workflow with a fork
> (fork_2) leading to a join (join_2) which leads to a fork (fork_3), if a
> failure occurs in the jobs that fork_3 leads to, then on retry, the failure
> will immediately occur. If there is no failure, the workflow executes to
> completion normally.  What we've also observed is that if fork_2 leads to a
> number of jobs (bash_cp_3, bash_cp_4, bash_cp6, bash_cp_8, bash_cp_10,
> bash_cp_12), then the apparently invalid execution paths are any of the
> first five. In other words, if any of the first five are seemingly randomly
> set by Oozie in Oozie's wf_actions table for the execution_path for join_2,
> the re-run will fail. Only if "bash_cp_12" is set then the workflow will
> successfully re-run.
>
> Another thing that might be relevant is that we are using a custom action
> executor that submits to SGE (for legacy reasons). The code is available at
> https://github.com/SeqWare/oozie-sge/tree/1.0.2 This is with Oozie
> version 3.3.2-cdh4.5.0
>
> Are there any thoughts on whether there is some API call that we're
> failing to make in our custom action executor that affects execution path?
> Are we structuring our workflows in some unexpected manner?
> What is the meaning of an execution path for a control node such as join
> anyways?
>
> Thanks for any insight!
>
> Large amounts of text follow ....
>
> Relevant error in log:
> 2014-06-05 14:06:01,599 DEBUG SignalXCommand:545 - USER[dyuen] GROUP[-]
> TOKEN[] APP[HelloWorld] JOB[0000000-140605140030484-oozie-oozi-W]
> ACTION[0000000-140605140030484-oozie-oozi-W@join_2] STARTED SignalCommand
> for jobid=0000000-140605140030484-oozie-oozi-W,
> actionId=0000000-140605140030484-oozie-oozi-W@join_2
> 2014-06-05 14:06:01,600 DEBUG LiteWorkflowInstance:545 - USER[dyuen]
> GROUP[-] TOKEN[] APP[HelloWorld] JOB[0000000-140605140030484-oozie-oozi-W]
> ACTION[0000000-140605140030484-oozie-oozi-W@join_2] Signaling job
> execution path [/bash_cp_3/] signal value [OK]
> 2014-06-05 14:06:01,600 ERROR LiteWorkflowInstance:536 - USER[dyuen]
> GROUP[-] TOKEN[] APP[HelloWorld] JOB[0000000-140605140030484-oozie-oozi-W]
> ACTION[0000000-140605140030484-oozie-oozi-W@join_2] invalid execution
> path [/bash_cp_3/]
> 2014-06-05 14:06:01,601  WARN LiteWorkflowInstance:542 - USER[dyuen]
> GROUP[-] TOKEN[] APP[HelloWorld] JOB[0000000-140605140030484-oozie-oozi-W]
> ACTION[0000000-140605140030484-oozie-oozi-W@join_2] Workflow completed
> [FAILED], failing [0] running nodes
> Oozie wf_actions table for the relevant workflow:
>
>                                id                               |           name            | signal_value | status |        transition         |     execution_path
> ----------------------------------------------------------------+---------------------------+--------------+--------+---------------------------+------------------------
>  0000000-140605140030484-oozie-oozi-W@:start:                   | :start:                   | OK           | OK     | start_0                   | /
>  0000000-140605140030484-oozie-oozi-W@start_0                   | start_0                   | OK           | OK     | provisionFile_file_in_0_1 | /
>  0000000-140605140030484-oozie-oozi-W@provisionFile_file_in_0_1 | provisionFile_file_in_0_1 | OK           | OK     | bash_mkdir_2              | /
>  0000000-140605140030484-oozie-oozi-W@bash_mkdir_2              | bash_mkdir_2              | OK           | OK     | fork_2                    | /
>  0000000-140605140030484-oozie-oozi-W@fork_2                    | fork_2                    | OK           | OK     | *                         | /
>  0000000-140605140030484-oozie-oozi-W@bash_cp_3                 | bash_cp_3                 | OK           | OK     | join_2                    | /bash_cp_3/
>  0000000-140605140030484-oozie-oozi-W@bash_cp_4                 | bash_cp_4                 | OK           | OK     | join_2                    | /bash_cp_4/
>  0000000-140605140030484-oozie-oozi-W@bash_cp_6                 | bash_cp_6                 | OK           | OK     | join_2                    | /bash_cp_6/
>  0000000-140605140030484-oozie-oozi-W@bash_cp_8                 | bash_cp_8                 | OK           | OK     | join_2                    | /bash_cp_8/
>  0000000-140605140030484-oozie-oozi-W@bash_cp_10                | bash_cp_10                | OK           | OK     | join_2                    | /bash_cp_10/
>  0000000-140605140030484-oozie-oozi-W@bash_cp_12                | bash_cp_12                | OK           | OK     | join_2                    | /bash_cp_12/
>  0000000-140605140030484-oozie-oozi-W@join_2                    | join_2                    | OK           | OK     | fork_3                    | /bash_cp_3/
>  0000000-140605140030484-oozie-oozi-W@fork_3                    | fork_3                    | OK           | OK     | *                         | /
>  0000000-140605140030484-oozie-oozi-W@provisionFile_out_5       | provisionFile_out_5       | OK           | OK     | join_3                    | /provisionFile_out_5/
>  0000000-140605140030484-oozie-oozi-W@provisionFile_out_7       | provisionFile_out_7       | OK           | OK     | join_3                    | /provisionFile_out_7/
>  0000000-140605140030484-oozie-oozi-W@provisionFile_out_9       | provisionFile_out_9       | OK           | OK     | join_3                    | /provisionFile_out_9/
>  0000000-140605140030484-oozie-oozi-W@provisionFile_out_11      | provisionFile_out_11      | OK           | OK     | join_3                    | /provisionFile_out_11/
>  0000000-140605140030484-oozie-oozi-W@provisionFile_out_13      | provisionFile_out_13      | OK           | OK     | join_3                    | /provisionFile_out_13/
>  0000000-140605140030484-oozie-oozi-W@fail                      | fail                      | OK           | OK     |                           | /bash_cp_14/
> (19 rows)
>
>
> The workflow:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <workflow-app xmlns="uri:oozie:workflow:0.4" name="HelloWorld">
>   <start to="start_0" />
>   <action name="start_0" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/start_0-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/start_0-qsub.opts</options-file>
>     </sge>
>     <ok to="provisionFile_file_in_0_1" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_file_in_0_1" retry-max="5"
> retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_file_in_0_1-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_file_in_0_1-qsub.opts</options-file>
>     </sge>
>     <ok to="bash_mkdir_2" />
>     <error to="fail" />
>   </action>
>   <action name="bash_mkdir_2" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_mkdir_2-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_mkdir_2-qsub.opts</options-file>
>     </sge>
>     <ok to="fork_2" />
>     <error to="fail" />
>   </action>
>   <fork name="fork_2">
>     <path start="bash_cp_3" />
>     <path start="bash_cp_4" />
>     <path start="bash_cp_6" />
>     <path start="bash_cp_8" />
>     <path start="bash_cp_10" />
>     <path start="bash_cp_12" />
>   </fork>
>   <action name="bash_cp_3" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_3-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_3-qsub.opts</options-file>
>     </sge>
>     <ok to="join_2" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_4" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_4-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_4-qsub.opts</options-file>
>     </sge>
>     <ok to="join_2" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_6" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_6-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_6-qsub.opts</options-file>
>     </sge>
>     <ok to="join_2" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_8" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_8-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_8-qsub.opts</options-file>
>     </sge>
>     <ok to="join_2" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_10" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_10-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_10-qsub.opts</options-file>
>     </sge>
>     <ok to="join_2" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_12" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_12-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_12-qsub.opts</options-file>
>     </sge>
>     <ok to="join_2" />
>     <error to="fail" />
>   </action>
>   <join name="join_2" to="fork_3" />
>   <fork name="fork_3">
>     <path start="bash_cp_14" />
>     <path start="provisionFile_out_5" />
>     <path start="provisionFile_out_7" />
>     <path start="provisionFile_out_9" />
>     <path start="provisionFile_out_11" />
>     <path start="provisionFile_out_13" />
>   </fork>
>   <action name="bash_cp_14" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_14-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_14-qsub.opts</options-file>
>     </sge>
>     <ok to="join_3" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_5" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_5-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_5-qsub.opts</options-file>
>     </sge>
>     <ok to="join_3" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_7" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>
> <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_7-runner.sh</script>
>
> <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_7-qsub.opts</options-file>
>     </sge>
>     <ok to="join_3" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_9" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_9-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_9-qsub.opts</options-file>
>     </sge>
>     <ok to="join_3" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_11" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_11-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_11-qsub.opts</options-file>
>     </sge>
>     <ok to="join_3" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_13" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_13-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_13-qsub.opts</options-file>
>     </sge>
>     <ok to="join_3" />
>     <error to="fail" />
>   </action>
>   <join name="join_3" to="fork_4" />
>   <fork name="fork_4">
>     <path start="bash_cp_15" />
>     <path start="bash_cp_17" />
>     <path start="bash_cp_19" />
>     <path start="bash_cp_21" />
>     <path start="bash_cp_23" />
>   </fork>
>   <action name="bash_cp_15" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_15-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_15-qsub.opts</options-file>
>     </sge>
>     <ok to="join_4" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_17" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_17-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_17-qsub.opts</options-file>
>     </sge>
>     <ok to="join_4" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_19" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_19-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_19-qsub.opts</options-file>
>     </sge>
>     <ok to="join_4" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_21" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_21-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_21-qsub.opts</options-file>
>     </sge>
>     <ok to="join_4" />
>     <error to="fail" />
>   </action>
>   <action name="bash_cp_23" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_23-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/bash_cp_23-qsub.opts</options-file>
>     </sge>
>     <ok to="join_4" />
>     <error to="fail" />
>   </action>
>   <join name="join_4" to="fork_5" />
>   <fork name="fork_5">
>     <path start="provisionFile_out_16" />
>     <path start="provisionFile_out_18" />
>     <path start="provisionFile_out_20" />
>     <path start="provisionFile_out_22" />
>     <path start="provisionFile_out_24" />
>   </fork>
>   <action name="provisionFile_out_16" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_16-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_16-qsub.opts</options-file>
>     </sge>
>     <ok to="join_5" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_18" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_18-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_18-qsub.opts</options-file>
>     </sge>
>     <ok to="join_5" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_20" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_20-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_20-qsub.opts</options-file>
>     </sge>
>     <ok to="join_5" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_22" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_22-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_22-qsub.opts</options-file>
>     </sge>
>     <ok to="join_5" />
>     <error to="fail" />
>   </action>
>   <action name="provisionFile_out_24" retry-max="5" retry-interval="5">
>     <sge xmlns="uri:oozie:sge-action:1.0">
>       <script>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_24-runner.sh</script>
>       <options-file>/usr/tmp/oozie/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b/generated-scripts/provisionFile_out_24-qsub.opts</options-file>
>     </sge>
>     <ok to="join_5" />
>     <error to="fail" />
>   </action>
>   <join name="join_5" to="done" />
>   <join name="join_274314800376896" to="done" />
>   <action name="done">
>     <fs>
>       <delete path="hdfs://localhost:8020/user/dyuen/seqware_workflow/oozie-8d157b87-5f1a-496f-b66c-8374cd05233b" />
>     </fs>
>     <ok to="end" />
>     <error to="fail" />
>   </action>
>   <kill name="fail">
>     <message>Java failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
>   </kill>
>   <end name="end" />
> </workflow-app>
>
>

Re: Invalid execution path on job rerun

Posted by Robert Kanter <rk...@cloudera.com>.
Hi Denis,

This sounds like it could be OOZIE-1879
<https://issues.apache.org/jira/browse/OOZIE-1879>, which I recently
committed a patch for. The issue there was a workflow in which an action
after a fork failed, and the rerun then failed with "invalid execution
path". It wasn't easy to track down, but it turns out that during a rerun
Oozie walks through all of the actions, and for a fork it visits them in
the order they are listed in the fork's XML. If the forked actions
finished in a different order during the original run, you get this
error. A workaround is to list the actions in the fork in the order in
which they're likely to complete, but that's probably not practical.
Otherwise, you'll need the OOZIE-1879 patch to fix this.
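
To make the workaround concrete, here is a rough sketch of fork_2 from
Denis's workflow with its paths reordered. It assumes, purely for
illustration, that bash_cp_3 is usually the last branch to finish; nothing
in this thread measures the real completion order, so treat the ordering
as hypothetical. Only the order of the <path> elements changes; the action
definitions themselves stay as they are.

```xml
<!-- Hypothetical reordering of fork_2. Assumption: bash_cp_3 is normally
     the slowest branch, so it is listed last. The other paths keep their
     original relative order. -->
<fork name="fork_2">
  <path start="bash_cp_4" />
  <path start="bash_cp_6" />
  <path start="bash_cp_8" />
  <path start="bash_cp_10" />
  <path start="bash_cp_12" />
  <path start="bash_cp_3" />
</fork>
```

Because completion order can vary from run to run, this at best lowers the
odds of hitting the error; it does not rule it out.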

- Robert

