Posted to dev@falcon.apache.org by "sandeep samudrala (JIRA)" <ji...@apache.org> on 2016/06/15 10:41:09 UTC

[jira] [Commented] (FALCON-1852) Optional Input for a process not truly optional

    [ https://issues.apache.org/jira/browse/FALCON-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331519#comment-15331519 ] 

sandeep samudrala commented on FALCON-1852:
-------------------------------------------

This patch adds optional inputs to the datasets in the coordinator definition. Once this version is released (that is, once the EL extensions jar is pushed to Oozie), coordinators that are already running will not have the optional input as a dataset, while newly triggered workflow instances expect it to be one. The workflow therefore fails with the message below.
{noformat}
variable [optionalInput] cannot be resolved
{noformat}

Existing processes therefore have to be touched/updated once the update has been pushed, so that their instances run successfully.

> Optional Input for a process not truly optional
> -----------------------------------------------
>
>                 Key: FALCON-1852
>                 URL: https://issues.apache.org/jira/browse/FALCON-1852
>             Project: Falcon
>          Issue Type: Bug
>            Reporter: Pallavi Rao
>            Assignee: Pallavi Rao
>              Labels: backward-incompatible
>             Fix For: 0.10
>
>
> Currently, when a feed input is marked as optional, we do not add it to the coordinator definition's datasets. This means we do not wait for all instances (for a given data window) to arrive. Instead, we just resolve the paths for a data window and pass them as a parameter.
> For example:
> {noformat}
> <inputs>
>     <!-- In the workflow, the input paths will be available in a variable 'inpaths' -->
>     <input name="inpaths" feed="in" start="now(0,-5)" end="now(0,-1)"/>
>     <input name="in2paths" feed="in2" start="now(0,-5)" end="now(0,-1)" optional="true"/>
> </inputs>
> {noformat}
> For a process instance 2013-01-01T00:00Z, the optional input, in2paths, will be resolved as below:
> {noformat}
>  <property>
>     <name>in2paths</name>
>     <value>hdfs://localhost:9000/data/in2/2013/11/15/00/04,hdfs://localhost:9000/data/in2/2013/11/15/00/03,hdfs://localhost:9000/data/in2/2013/11/15/00/02,hdfs://localhost:9000/data/in2/2013/11/15/00/01,hdfs://localhost:9000/data/in2/2013/11/15/00/00</value>
>   </property>
> {noformat}
> If one of the instances of in2paths (for example, hdfs://localhost:9000/data/in2/2013/11/15/00/04) is missing, the workflow will fail anyway.
> Hence, the input in2paths is not truly optional. The only difference is that the triggering of the instance is not gated on it.
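
The behaviour described in the quoted issue can be illustrated with a small Python sketch. This is not Falcon or Oozie code: `resolve_window` and `consume` are hypothetical helpers, the path template is copied from the example above, and the nominal time 2013-11-15T00:05 is an assumption chosen only so that the resolved paths match the ones listed in the example. The sketch shows that the window is resolved into a comma-separated list without checking whether each instance exists, so a consumer that reads every path still fails when one is missing.

```python
from datetime import datetime, timedelta

# Assumed path layout, copied from the example in the issue description.
TEMPLATE = "hdfs://localhost:9000/data/in2/%Y/%m/%d/%H/%M"


def resolve_window(nominal, start_minutes, end_minutes, template=TEMPLATE):
    """Resolve one path per minute between the two offsets, newest first."""
    return [
        (nominal + timedelta(minutes=offset)).strftime(template)
        for offset in range(end_minutes, start_minutes - 1, -1)
    ]


def consume(in2paths, existing):
    """Hypothetical stand-in for a workflow that reads every listed path."""
    requested = in2paths.split(",")
    missing = [p for p in requested if p not in existing]
    if missing:
        raise FileNotFoundError("missing input instance(s): " + ", ".join(missing))
    return len(requested)


nominal = datetime(2013, 11, 15, 0, 5)  # assumed nominal time, see lead-in
paths = resolve_window(nominal, -5, -1)  # mirrors now(0,-5) .. now(0,-1)
in2paths = ",".join(paths)  # handed to the workflow without an existence check

available = set(paths[1:])  # pretend the .../00/04 instance never arrived
try:
    consume(in2paths, available)
except FileNotFoundError as err:
    print("workflow fails:", err)
```

In this sketch the instance is triggered regardless of what is in `available`, which corresponds to "not gated"; the failure only appears once the workflow tries to read the missing path.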


