Posted to commits@airflow.apache.org by "Gabriel Silk (JIRA)" <ji...@apache.org> on 2019/01/21 18:25:00 UTC

[jira] [Commented] (AIRFLOW-2279) Clearing Tasks Across DAGs

    [ https://issues.apache.org/jira/browse/AIRFLOW-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748143#comment-16748143 ] 

Gabriel Silk commented on AIRFLOW-2279:
---------------------------------------

At Dropbox, we have the exact same need. We were going to embark on building this, but perhaps it would make sense to use your patch as a starting point?

One hard requirement we have is the ability to limit the number of tasks cleared when doing a cross-DAG clear. At Dropbox, many of the tasks we run are *very* data- and compute-intensive, so if we (for example) accidentally cleared 10,000 tasks, it would be difficult to recover.
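
To make the requirement concrete, here is a minimal sketch of such a limit in plain Python. It is not part of the patch and does not use any Airflow API; `ClearLimitExceeded`, `check_clear_limit`, and `max_tasks` are hypothetical names chosen for illustration. The idea is simply to count the task instances a cross-DAG clear would touch and refuse to proceed if the count is above a configured ceiling.

```python
class ClearLimitExceeded(Exception):
    """Raised when a clear would touch more task instances than allowed (hypothetical)."""


def check_clear_limit(task_instance_keys, max_tasks):
    """Return the number of task instances to clear, or raise if it exceeds max_tasks.

    task_instance_keys: any sized collection identifying the task instances
    that the cross-DAG clear has selected (assumed to be computed elsewhere).
    max_tasks: the configured ceiling (an assumed setting, not an Airflow option).
    """
    count = len(task_instance_keys)
    if count > max_tasks:
        raise ClearLimitExceeded(
            "refusing to clear %d task instances (limit is %d)" % (count, max_tasks)
        )
    return count


if __name__ == "__main__":
    selected = [("dag_b", "load", n) for n in range(3)]
    print(check_clear_limit(selected, max_tasks=100))
```

Running the check before any state is modified means an oversized clear fails without side effects, which is the property we care about.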

Another aspect of this is the time window of the ExternalTaskSensor – for example, if I have a task that runs at time _t_ in dag A and depends on the set of tasks in another dag B within the window [_t - 7, t_), then clearing the task at time _t - 4_ in dag B should also clear the task at time _t_ in dag A. Does your patch do this already?
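
The window logic above can be sketched as follows. This is an illustration only, in plain Python with no Airflow calls; `dependent_run_times` is a hypothetical helper, and the sketch assumes the window is measured in days, that dag A runs on a fixed interval, and that its run times fall on the same schedule grid as the cleared run in dag B. A run of dag A at time t is affected when t - window <= cleared_at < t, which is the same as cleared_at < t <= cleared_at + window.

```python
from datetime import datetime, timedelta


def dependent_run_times(cleared_at, window, schedule_interval):
    """List run times t of the dependent dag whose window [t - window, t) contains cleared_at.

    Assumes dependent runs are spaced by schedule_interval and aligned with cleared_at.
    """
    times = []
    t = cleared_at + schedule_interval      # first candidate strictly after cleared_at
    while t <= cleared_at + window:         # t - window <= cleared_at
        times.append(t)
        t += schedule_interval
    return times


if __name__ == "__main__":
    t = datetime(2019, 1, 10)
    affected = dependent_run_times(
        cleared_at=t - timedelta(days=4),
        window=timedelta(days=7),
        schedule_interval=timedelta(days=1),
    )
    print(t in affected)   # the run at time t in dag A is among those to clear
```

Note that a single cleared run in dag B can affect several runs in dag A (seven in this example), which is one reason the task limit matters.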

> Clearing Tasks Across DAGs
> --------------------------
>
>                 Key: AIRFLOW-2279
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2279
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Achal Soni
>            Priority: Major
>         Attachments: cross_dag_ui_screenshot.png
>
>
> At Stripe, we commonly have discrete dags that depend on each other by leveraging ExternalTaskSensors. We also find ourselves routinely wanting to not only clear tasks and their downstream tasks in a particular dag, but also their downstream tasks in their dependent dags (linked by ExternalTaskSensors). 
> We have extended Airflow to handle this by modifying the webapp and CLI tool to optionally clear dependent tasks across multiple dags (see attached screenshot).
> We want to open the floor for discussion with the larger Airflow community about the usage of ExternalTaskSensors and specifically how to handle clearing across dags. We are interested in learning more about the accepted practices in this regard, and are very open/willing to contribute in this area if there is interest!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)