Posted to dev@oozie.apache.org by "purshotam shah (JIRA)" <ji...@apache.org> on 2013/11/11 22:13:17 UTC

[jira] [Updated] (OOZIE-1554) Support variables for coord data-in/data-out dataset

     [ https://issues.apache.org/jira/browse/OOZIE-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

purshotam shah updated OOZIE-1554:
----------------------------------

    Attachment: OOZIE-1554_v3.patch

patch

> Support variables for coord data-in/data-out dataset	
> -----------------------------------------------------
>
>                 Key: OOZIE-1554
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1554
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: purshotam shah
>            Assignee: purshotam shah
>         Attachments: OOZIE-1554_v1.patch, OOZIE-1554_v2.patch, OOZIE-1554_v3.patch
>
>
> We would like to keep a centralized list of datasets and use the <include>
> tag to make them available to every coordinator. We would also like to
> re-use our coordinator code, since most of our processing follows the same
> steps but with different input and output feeds. To do that, we need to be
> able to set the data-in and data-out dataset values to variables.
> The bundle coordinator entry looks like this:
>    <coordinator name="data1-2">
>         <app-path>/user/harveyc/oozie_test/src/test_coordA.xml</app-path>
>         <configuration>
>             <property><name>wf_name</name><value>1-2</value></property>
>             <property><name>dataset_A</name><value>dataA</value></property>
>             <property><name>dataset_B</name><value>dataB</value></property>
>         </configuration>
>     </coordinator>
> The coordinator looks like this:
> <coordinator-app name="COORD_A_TEST" frequency="${coord:minutes(1)}"
> start="${startTime}" end="${endTime}" timezone="${timezoneCode}" 
> xmlns:sla="uri:oozie:sla:0.1" xmlns="uri:oozie:coordinator:0.2">
>   <datasets>
>     <include>${nameNode}/user/harveyc/oozie_test/datasets/test_datasets.xml</include>
>   </datasets>
>   <input-events>
>       <data-in name="inputDataA" dataset="${dataset_A}">
>         <instance>${coord:current(0)}</instance>
>       </data-in>
>   </input-events>
>   <output-events>
>       <data-out name="outputDataB" dataset="${dataset_B}">
>         <instance>${coord:current(0)}</instance>
>       </data-out>
>   </output-events>
>   <action>
>    <workflow>
>       <app-path>/user/harveyc/oozie_test/src/wf_touchz.xml</app-path>
>        <configuration>
>          <property><name>name</name><value>${wf_name}</value></property>
>          <property><name>touchzpathb</name><value>${coord:dataOut('outputDataB')}</value></property>
>        </configuration>
>     </workflow>
>   </action>     
> </coordinator-app>
> test_datasets.xml looks like this:
>   <datasets>
>       <dataset name="dataA" frequency="${coord:minutes(1)}"
>                initial-instance="${ds_startTime}" timezone="${timezoneCode}">
>         <uri-template>${nameNode}/user/harveyc/oozie_test/data1/${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}</uri-template>
>       </dataset>
>       <dataset name="dataB" frequency="${coord:minutes(1)}"
>                initial-instance="${ds_startTime}" timezone="${timezoneCode}">
>         <uri-template>${nameNode}/user/harveyc/oozie_test/data2/${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}</uri-template>
>       </dataset>
>   </datasets>
> The error is:
> Error: Invalid workflow-app, org.xml.sax.SAXParseException; lineNumber: 8;
> columnNumber: 57; cvc-pattern-valid: Value '${dataset_A}' is not facet-valid
> with respect to pattern '([a-zA-Z]([\-_a-zA-Z0-9])*){1,39}' for type
> 'IDENTIFIER'.
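
For illustration only, the failure reported above can be reproduced outside Oozie with a small Python sketch. It is not taken from the attached patch and does not reflect how Oozie itself resolves variables. The pattern is copied from the error message; the `resolve` helper, the `BUNDLE_PROPS` dictionary, and the idea of substituting before validating are assumptions made for the example.

```python
import re

# Pattern copied from the error message. XSD pattern facets apply to the
# whole value, so fullmatch() is used rather than search().
IDENTIFIER = re.compile(r"([a-zA-Z]([\-_a-zA-Z0-9])*){1,39}")

# Hypothetical stand-in for the <configuration> properties in the bundle entry.
BUNDLE_PROPS = {"wf_name": "1-2", "dataset_A": "dataA", "dataset_B": "dataB"}


def is_identifier(value: str) -> bool:
    """Return True if the value satisfies the IDENTIFIER pattern."""
    return IDENTIFIER.fullmatch(value) is not None


def resolve(value: str, props: dict) -> str:
    """Replace each ${name} with props[name]; unknown names are left as-is."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: props.get(m.group(1), m.group(0)),
        value,
    )


for raw in ("${dataset_A}", "${dataset_B}"):
    resolved = resolve(raw, BUNDLE_PROPS)
    # The raw attribute value fails the pattern; the resolved name passes.
    print(raw, is_identifier(raw), "->", resolved, is_identifier(resolved))
```

The raw attribute value begins with `$`, which the pattern does not allow as a first character, so schema validation rejects it before any variable substitution could take place. Once the placeholder is replaced by the dataset name from the bundle properties, the value satisfies the pattern.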



--
This message was sent by Atlassian JIRA
(v6.1#6144)