Posted to user@oozie.apache.org by Nathan T <na...@vertile.com> on 2014/10/10 21:18:52 UTC

Hadoop Oozie Workflow not getting Coordinator properties

I just posted this question on SO and GitHub. It would be rad if anyone has
some direction:

so:
http://stackoverflow.com/questions/26306538/hadoop-oozie-workflow-not-getting-coordinator-properties
gist:
https://gist.github.com/nathantsoi/dc8caac7109a57c99399#file-awesome-oozie-config-md

I have a simple Oozie coordinator and workflow. I'm trying to pass the
coordinator's dataIn property to the workflow as described here:
https://oozie.apache.org/docs/3.2.0-incubating/CoordinatorFunctionalSpec.html#a6.7.1._coord:dataInString_name_EL_Function

For some reason, the value is empty in the workflow's properties, and the EL
variable ${inputDir} is empty in the following example.

The actual error is: variable [inputDir] cannot be resolved

Config

coordinator.xml

<?xml version="1.0" encoding="UTF-8"?>
<coordinator-app xmlns="uri:oozie:coordinator:0.4" name="awesome"
                 frequency="${coord:days(1)}" start="2014-10-06T00:01Z"
                 end="2050-01-01T00:01Z" timezone="UTC">
  <controls>
    <!-- Wait 23 hours before giving up -->
    <timeout>1380</timeout>
    <concurrency>1</concurrency>
    <execution>LIFO</execution>
  </controls>
  <datasets>
    <dataset name="itsready" frequency="${coord:days(1)}"
             initial-instance="2014-10-06T08:00Z" timezone="America/Los_Angeles">
      <uri-template>${s3DataPath}/${YEAR}-${MONTH}-${DAY}</uri-template>
      <!-- with an empty done-flag, this checks for the folder's existence -->
      <done-flag></done-flag>
    </dataset>
    <!-- output dataset -->
    <dataset name="itsdone" frequency="${coord:days(1)}"
             initial-instance="2014-10-06T08:00Z" timezone="America/Los_Angeles">
      <uri-template>${dataPath}/awesome/sql-schema-tab-delim-load/${YEAR}-${MONTH}-${DAY}/loaded</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="input" dataset="itsready">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <output-events>
    <data-out name="output" dataset="itsdone">
      <instance>${coord:current(0)}</instance>
    </data-out>
  </output-events>
  <action>
    <workflow>
      <app-path>${workflowApplicationPath}</app-path>
      <configuration>
        <property>
          <name>inputDir</name>
          <value>${coord:dataIn('input')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>

workflow.xml

<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.4" name="awesome-wf">
  <start to="shell-import"/>
  <action name="shell-import">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>${importFile}</exec>
      <env-var>INPUT_DIR=${inputDir}</env-var>
      <file>${importFile}#${importFile}</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>it failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>

job.properties

hadoopMaster=myawesome.server.com
nameNode=hdfs://${hadoopMaster}:8020
jobTracker=${hadoopMaster}:8050
tzOffset=-8
oozie.use.system.libpath=true
oozie.libpath=/user/oozie/share/lib
appPath=${nameNode}/apps
dataPath=${appPath}/data
s3DataPath=s3n://an/awesome/s3/data/path

oozie.wf.action.notification.url=https://zapier.com/mysecreturl
workflowApplicationPath=${appPath}/awesome

oozie.coord.application.path=${workflowApplicationPath}

importFile=import.sh

Re: Hadoop Oozie Workflow not getting Coordinator properties

Posted by Mohammad Islam <mi...@yahoo.com.INVALID>.
It appears that the dataset's "initial-instance" (2014-10-06T08:00Z) is after the coordinator's "start" time (2014-10-06T00:01Z). This means the first coordinator action gets an empty value for current(0), because any data reference before the initial-instance resolves to empty.

Resolution: please try either moving the initial-instance earlier than start, or moving start to some time later.

More details: https://oozie.apache.org/docs/4.0.1/CoordinatorFunctionalSpec.html#a6.6.10._Dataset_Instance_Resolution_for_Instances_Before_the_Initial_Instance

Regards,
Mohammad
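
As a rough sketch of the first suggestion above, the input dataset could be declared with an initial-instance that precedes the coordinator's start of 2014-10-06T00:01Z. The value 2014-10-05T08:00Z is an assumption picked for illustration and does not come from the thread; any instance at or before the start time should serve the same purpose.

```xml
<!-- Sketch only: initial-instance moved one day earlier (assumed value)
     so that it precedes the coordinator's start of 2014-10-06T00:01Z -->
<dataset name="itsready" frequency="${coord:days(1)}"
         initial-instance="2014-10-05T08:00Z" timezone="America/Los_Angeles">
  <uri-template>${s3DataPath}/${YEAR}-${MONTH}-${DAY}</uri-template>
  <!-- empty done-flag, unchanged from the original post -->
  <done-flag></done-flag>
</dataset>
```

The itsdone dataset uses the same initial-instance, so it presumably needs the same change. The alternative is to leave both datasets alone and move the coordinator's start to 2014-10-06T08:00Z or later. Either way, it is worth checking which dated folder current(0) resolves to for the first action, since that depends on how the action's nominal time lines up with the dataset instances.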