Posted to dev@oozie.apache.org by "purshotam shah (JIRA)" <ji...@apache.org> on 2013/09/24 00:09:06 UTC
[jira] [Updated] (OOZIE-1554) Support variables for coord data-in/data-out dataset
[ https://issues.apache.org/jira/browse/OOZIE-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
purshotam shah updated OOZIE-1554:
----------------------------------
Description:
Users would like to maintain a centralized list of datasets and use the
<include> tag to make them available to every coordinator. They would also
like to reuse coordinator code, since most of the processing follows the same
steps but with differing input and output feeds.
To do this, they need to be able to set the dataset attribute of data-in and
data-out to variables.
The bundle coordinator entry looks like this:
<coordinator name="data1-2">
  <app-path>/user/harveyc/oozie_test/src/test_coordA.xml</app-path>
  <configuration>
    <property><name>wf_name</name><value>1-2</value></property>
    <property><name>dataset_A</name><value>dataA</value></property>
    <property><name>dataset_B</name><value>dataB</value></property>
  </configuration>
</coordinator>
The coordinator looks like this:
<coordinator-app name="COORD_A_TEST" frequency="${coord:minutes(1)}"
    start="${startTime}" end="${endTime}" timezone="${timezoneCode}"
    xmlns:sla="uri:oozie:sla:0.1" xmlns="uri:oozie:coordinator:0.2">
  <datasets>
    <include>${nameNode}/user/harveyc/oozie_test/datasets/test_datasets.xml</include>
  </datasets>
  <input-events>
    <data-in name="inputDataA" dataset="${dataset_A}">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <output-events>
    <data-out name="outputDataB" dataset="${dataset_B}">
      <instance>${coord:current(0)}</instance>
    </data-out>
  </output-events>
  <action>
    <workflow>
      <app-path>/user/harveyc/oozie_test/src/wf_touchz.xml</app-path>
      <configuration>
        <property><name>name</name><value>${wf_name}</value></property>
        <property><name>touchzpathb</name><value>${coord:dataOut('outputDataB')}</value></property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
test_datasets.xml looks like this:
<datasets>
  <dataset name="dataA" frequency="${coord:minutes(1)}"
      initial-instance="${ds_startTime}" timezone="${timezoneCode}">
    <uri-template>${nameNode}/user/harveyc/oozie_test/data1/${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}</uri-template>
  </dataset>
  <dataset name="dataB" frequency="${coord:minutes(1)}"
      initial-instance="${ds_startTime}" timezone="${timezoneCode}">
    <uri-template>${nameNode}/user/harveyc/oozie_test/data2/${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}</uri-template>
  </dataset>
</datasets>
The error is:
Error: Invalid workflow-app, org.xml.sax.SAXParseException; lineNumber: 8;
columnNumber: 57; cvc-pattern-valid: Value '${dataset_A}' is not facet-valid
with respect to pattern '([a-zA-Z]([\-_a-zA-Z0-9])*){1,39}' for type
'IDENTIFIER'.
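
To illustrate why the value is rejected, here is a minimal Python sketch that
applies the pattern quoted in the error message to the raw and the resolved
attribute values. It is not Oozie code: the function names
is_valid_identifier and resolve are hypothetical, and the simple ${name}
substitution only stands in for whatever variable resolution the coordinator
would perform. The point is that the literal string '${dataset_A}' fails the
pattern, while the value it stands for ('dataA') would pass.

```python
import re

# Pattern copied from the error message. XSD patterns must match the whole
# value, so re.fullmatch is used to mimic that implicit anchoring.
IDENTIFIER = re.compile(r"([a-zA-Z]([\-_a-zA-Z0-9])*){1,39}")


def is_valid_identifier(value: str) -> bool:
    """Return True if value satisfies the IDENTIFIER pattern."""
    return IDENTIFIER.fullmatch(value) is not None


def resolve(value: str, props: dict) -> str:
    """Hypothetical helper: replace ${name} with props[name].

    Unknown names are left untouched. This is an assumption made for
    illustration, not Oozie's actual EL resolution.
    """
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: props.get(m.group(1), m.group(0)),
        value,
    )


# Properties taken from the bundle coordinator entry in the report.
props = {"dataset_A": "dataA", "dataset_B": "dataB"}

raw = "${dataset_A}"
print(is_valid_identifier(raw))                  # False: '$', '{', '}' are not allowed
print(is_valid_identifier(resolve(raw, props)))  # True: 'dataA' is a valid identifier
```

Under these assumptions, the schema check fails only because it sees the
unresolved variable; the same check applied after substitution would accept
the dataset name.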
> Support variables for coord data-in/data-out dataset
> -----------------------------------------------------
>
> Key: OOZIE-1554
> URL: https://issues.apache.org/jira/browse/OOZIE-1554
> Project: Oozie
> Issue Type: Bug
> Reporter: purshotam shah