Posted to commits@airflow.apache.org by "Raphael Lopez Kaufman (JIRA)" <ji...@apache.org> on 2017/07/11 08:14:00 UTC
[jira] [Created] (AIRFLOW-1398) Add ability for ExternalTaskSensor
to wait on multiple runs of a task
Raphael Lopez Kaufman created AIRFLOW-1398:
----------------------------------------------
Summary: Add ability for ExternalTaskSensor to wait on multiple runs of a task
Key: AIRFLOW-1398
URL: https://issues.apache.org/jira/browse/AIRFLOW-1398
Project: Apache Airflow
Issue Type: Improvement
Reporter: Raphael Lopez Kaufman
Currently, the execution_date_fn parameter of the ExternalTaskSensor only allows waiting for the completion of a single run of the task being sensed.
However, this prevents users from having setups where dags don't have the same schedule frequency but still depend on one another. For example, let's say you have a dag scheduled hourly that transforms log data and is owned by the team in charge of logging. In the current setup, higher level teams that want to use this transformed data cannot create dags that process it in daily batches while making sure the transformed log data was properly created. Note that simply waiting for the data to be present (e.g. using the HivePartitionSensor if the data is in Hive) might not be satisfactory, because the data being present doesn't mean it is ready to be used.
Adding the ability for an ExternalTaskSensor to wait until multiple runs of the task it is sensing have finished would allow higher level teams to set up dags with an ExternalTaskSensor that senses the end task of the dag that transforms the log data and waits for the successful completion of 24 of its hourly runs.
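As a rough illustration of the proposed behaviour, the sketch below shows how a daily run could be mapped to the 24 hourly runs it depends on, and how a sensor might decide that all of them have finished. It is plain Python, not Airflow code: hourly_runs_for_day and all_runs_succeeded are hypothetical helper names, and the assumption that the daily run covers the 24 hourly execution dates starting at its own execution date is made only for this example.

```python
from datetime import datetime, timedelta


def hourly_runs_for_day(execution_date, runs=24, step=timedelta(hours=1)):
    """Return the execution dates of the upstream runs a daily run waits on.

    Hypothetical helper: assumes the daily run depends on `runs` consecutive
    upstream runs, spaced `step` apart, starting at its own execution date.
    """
    return [execution_date + i * step for i in range(runs)]


def all_runs_succeeded(expected_dates, successful_dates):
    """Return True only if every expected run appears among the successful ones.

    Hypothetical helper: `successful_dates` stands in for whatever the sensor
    would read from the metadata database for the sensed task.
    """
    return set(expected_dates) <= set(successful_dates)


if __name__ == "__main__":
    daily_run = datetime(2017, 7, 11)
    expected = hourly_runs_for_day(daily_run)
    # Pretend the last hourly run has not completed yet.
    print(all_runs_succeeded(expected, expected[:-1]))
    print(all_runs_succeeded(expected, expected))
```

With something along these lines, execution_date_fn could return a list of execution dates instead of a single one, and the sensor would only succeed once every run in that list has completed successfully.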
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)