Posted to commits@airflow.apache.org by "Raphael Lopez Kaufman (JIRA)" <ji...@apache.org> on 2017/07/17 09:28:00 UTC

[jira] [Commented] (AIRFLOW-1398) Add ability for ExternalTaskSensor to wait on multiple runs of a task

    [ https://issues.apache.org/jira/browse/AIRFLOW-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089537#comment-16089537 ] 

Raphael Lopez Kaufman commented on AIRFLOW-1398:
------------------------------------------------

[~bolke] Any thoughts on this? (The corresponding PR is https://github.com/apache/incubator-airflow/pull/2431.)
We are trying to switch from Oozie to Airflow at Booking.com and really need this feature (unless there is already a way to achieve this) to go forward with the migration.

> Add ability for ExternalTaskSensor to wait on multiple runs of a task
> ---------------------------------------------------------------------
>
>                 Key: AIRFLOW-1398
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1398
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Raphael Lopez Kaufman
>
> Currently, the execution_date_fn parameter of the ExternalTaskSensor only allows waiting for the completion of a single run of the task being sensed.
> This prevents users from having setups where dags with different schedule frequencies still depend on one another. For example, say you have a dag scheduled hourly that transforms log data and is owned by the team in charge of logging. In the current setup, higher-level teams that want to use this transformed data cannot create dags that process it in daily batches while making sure the transformed log data was properly created. Note that simply waiting for the data to be present (using e.g. the HivePartitionSensor if the data is in Hive) might not be satisfactory, because the data being present doesn't mean it is ready to be used.
> Adding the ability for an ExternalTaskSensor to wait until multiple runs of the task it is sensing have finished would allow higher-level teams to set up dags with an ExternalTaskSensor that senses the end task of the dag transforming the log data and waits for the successful completion of 24 of its hourly runs.
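
To make the hourly/daily example above concrete, here is a minimal sketch in plain Python of the check such a sensor would need to perform. It does not use the Airflow API and is not the code from the PR: the helper names hourly_execution_dates and all_runs_succeeded, the "success" state string, and the dictionary of run states are all hypothetical, and it assumes the 24 hourly runs of interest are the ones whose execution dates fall within the daily run's own period.

```python
from datetime import datetime, timedelta


def hourly_execution_dates(daily_execution_date, runs=24):
    """Hypothetical helper: execution dates of the hourly runs to wait on.

    Assumes the hourly runs of interest start at the daily run's
    execution date and are spaced exactly one hour apart.
    """
    return [daily_execution_date + timedelta(hours=h) for h in range(runs)]


def all_runs_succeeded(states_by_date, required_dates):
    """Hypothetical check: True only if every required run succeeded.

    states_by_date maps an execution date to a state string; a missing
    entry means the run has not been recorded yet and counts as not done.
    """
    return all(states_by_date.get(d) == "success" for d in required_dates)


if __name__ == "__main__":
    day = datetime(2017, 7, 16)
    required = hourly_execution_dates(day)

    # Pretend every hourly run succeeded except the one at 05:00.
    states = {d: "success" for d in required}
    states[required[5]] = "running"

    print(all_runs_succeeded(states, required))
```

With one hourly run still in progress the check fails, which is the behaviour a daily dag needs: it proceeds only once all 24 hourly runs have completed successfully, rather than as soon as one of them has.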


