Posted to commits@airflow.apache.org by "Raphael Lopez Kaufman (JIRA)" <ji...@apache.org> on 2017/07/11 08:14:00 UTC

[jira] [Created] (AIRFLOW-1398) Add ability for ExternalTaskSensor to wait on multiple runs of a task

Raphael Lopez Kaufman created AIRFLOW-1398:
----------------------------------------------

             Summary: Add ability for ExternalTaskSensor to wait on multiple runs of a task
                 Key: AIRFLOW-1398
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1398
             Project: Apache Airflow
          Issue Type: Improvement
            Reporter: Raphael Lopez Kaufman


Currently, the execution_date_fn parameter of the ExternalTaskSensor only allows waiting for the completion of a single run of the task being sensed.
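
As a minimal sketch of this single-run behaviour, assuming execution_date_fn is a callable that receives the sensor's own execution date and returns the one execution date to look up. The function name is hypothetical and the sensor itself is not instantiated here:

```python
from datetime import datetime

def first_hourly_run_of_day(execution_date):
    """Hypothetical execution_date_fn: map a run's execution date to a
    single hourly run, here the one at midnight of the same day."""
    return execution_date.replace(hour=0, minute=0, second=0, microsecond=0)
```

Whatever mapping is chosen, only one datetime comes back, so only one run of the external task can be checked.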

However, this prevents users from having setups where DAGs do not share the same schedule frequency but still depend on one another. For example, suppose a DAG scheduled hourly transforms log data and is owned by the team in charge of logging. In the current setup, higher-level teams that want to use this transformed data cannot create DAGs that process it in daily batches while making sure the transformed log data was properly created. Note that simply waiting for the data to be present (e.g. using the HivePartitionSensor if the data is in Hive) might not be satisfactory, because the data being present does not mean it is ready to be used.

Adding the ability for an ExternalTaskSensor to wait for multiple runs of the task it is sensing to finish would allow higher-level teams to set up DAGs with an ExternalTaskSensor that senses the end task of the log-transforming DAG and waits for the successful completion of 24 of its hourly runs.
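
One way the proposal could look, as a sketch only: a date function that returns a list of execution dates instead of a single one. The function name, the runs parameter, and the idea that the sensor would accept a list are all assumptions made for illustration, not an existing API described in this issue:

```python
from datetime import datetime, timedelta

def hourly_execution_dates(execution_date, runs=24):
    """Hypothetical helper: return the execution dates of the hourly runs
    that fall within the day of the given daily execution date."""
    # Normalise to midnight so the hourly dates line up with the daily run.
    day_start = execution_date.replace(hour=0, minute=0, second=0, microsecond=0)
    return [day_start + timedelta(hours=h) for h in range(runs)]
```

A sensor extended along these lines would succeed only once every run in the returned list has completed successfully.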



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)