Posted to dev@falcon.apache.org by "Pallavi Rao (JIRA)" <ji...@apache.org> on 2016/03/11 12:26:40 UTC

[jira] [Commented] (FALCON-1852) Optional Input for a process not truly optional

    [ https://issues.apache.org/jira/browse/FALCON-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190819#comment-15190819 ] 

Pallavi Rao commented on FALCON-1852:
-------------------------------------

This can be addressed by resolving the optional input path to a glob pattern as below:
{noformat}
hdfs://localhost:9000/data/in/2013/11/15/00/1{4,3,2,1,0}
{noformat}
This way, even if one of the folders is missing, the workflow will still run, because the glob only expands to the folders that actually exist.
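
A minimal sketch of the idea, not Falcon's actual implementation: the hypothetical helper collapse_to_glob below takes the comma-separated instance paths that are currently passed for an optional input and collapses them into a single brace-alternation glob. It assumes all instances share one parent directory and that the consuming job expands globs Hadoop-style, skipping alternatives that do not exist.

```python
def collapse_to_glob(paths):
    """Collapse sibling paths into one brace-alternation glob.

    Hypothetical helper for illustration only; it assumes every path
    has the same parent directory and differs only in its last component.
    """
    if not paths:
        raise ValueError("need at least one path")
    # Split each path into (parent directory, last component).
    parents = {p.rsplit("/", 1)[0] for p in paths}
    if len(parents) != 1:
        raise ValueError("paths do not share a parent directory")
    leaves = [p.rsplit("/", 1)[1] for p in paths]
    if len(leaves) == 1:
        # A single instance needs no alternation.
        return paths[0]
    return parents.pop() + "/{" + ",".join(leaves) + "}"


# The resolved value of in2paths from the issue description.
resolved = (
    "hdfs://localhost:9000/data/in2/2013/11/15/00/04,"
    "hdfs://localhost:9000/data/in2/2013/11/15/00/03,"
    "hdfs://localhost:9000/data/in2/2013/11/15/00/02,"
    "hdfs://localhost:9000/data/in2/2013/11/15/00/01,"
    "hdfs://localhost:9000/data/in2/2013/11/15/00/00"
).split(",")

print(collapse_to_glob(resolved))
# prints hdfs://localhost:9000/data/in2/2013/11/15/00/{04,03,02,01,00}
```

Whether a glob that matches no folder at all should still let the workflow run is a separate question that this sketch does not address.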

> Optional Input for a process not truly optional
> -----------------------------------------------
>
>                 Key: FALCON-1852
>                 URL: https://issues.apache.org/jira/browse/FALCON-1852
>             Project: Falcon
>          Issue Type: Bug
>            Reporter: Pallavi Rao
>            Assignee: Pallavi Rao
>
> Currently, when a feed input is marked as optional, we do not add it to the coordinator definition's datasets. This means we do not wait for all instances (for a given data window) to arrive. Instead, we just resolve the paths for the data window and pass them as a parameter.
> For example:
> {noformat}
> <inputs>
>     <!-- In the workflow, the input paths will be available in a variable 'inpaths' -->
>     <input name="inpaths" feed="in" start="now(0,-5)" end="now(0,-1)"/>
>     <input name="in2paths" feed="in2" start="now(0,-5)" end="now(0,-1)" optional="true"/>
> </inputs>
> {noformat}
> For a process instance 2013-01-01T00:00Z, the optional input, in2paths, will be resolved as below:
> {noformat}
> <property>
>     <name>in2paths</name>
>     <value>hdfs://localhost:9000/data/in2/2013/11/15/00/04,hdfs://localhost:9000/data/in2/2013/11/15/00/03,hdfs://localhost:9000/data/in2/2013/11/15/00/02,hdfs://localhost:9000/data/in2/2013/11/15/00/01,hdfs://localhost:9000/data/in2/2013/11/15/00/00</value>
> </property>
> {noformat}
> If one of the instances of in2paths (for example, hdfs://localhost:9000/data/in2/2013/11/15/00/04) is missing, the workflow will fail anyway.
> Hence, the input in2paths is not truly optional. The only effect of marking it optional is that the triggering of the process instance is not gated on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)