Posted to commits@airflow.apache.org by "Jonathan Bender (JIRA)" <ji...@apache.org> on 2017/11/17 16:36:00 UTC

[jira] [Commented] (AIRFLOW-1825) Set Multi dag dependency

    [ https://issues.apache.org/jira/browse/AIRFLOW-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257188#comment-16257188 ] 

Jonathan Bender commented on AIRFLOW-1825:
------------------------------------------

bq. Can someone please help me solve this? I understand that the ExternalTaskSensor operator can be used, but it continuously polls to check whether the tasks in DAG A and B are complete, which might cause a performance hit over time.

The cost of polling the database for successful task instances at each interval seems reasonable, provided you use sane polling intervals.
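To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the worst case where every sensor waits for a full day and issues one database query per poke; the function name and the intervals are made up for illustration and are not taken from Airflow.

```python
def polls_per_day(num_sensors, poke_interval_seconds):
    """Upper bound on state lookups per day if every sensor waits all day.

    Assumes one database query per poke, which is a simplification.
    """
    seconds_per_day = 24 * 60 * 60
    return num_sensors * (seconds_per_day // poke_interval_seconds)


# Two sensors (one for DAG A, one for DAG B) at two hypothetical intervals.
print(polls_per_day(2, 60))    # poke every minute
print(polls_per_day(2, 300))   # poke every five minutes
```

Even at a one-minute interval this is a few thousand lightweight queries per day, and lengthening the interval reduces it proportionally.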

We have moved away from using lots of ExternalTaskSensors because each one occupies a task slot, which adds non-trivial memory overhead when you have many dependencies that may wait for long periods of time. Each sensor also means more task instances, which puts more pressure on the db.

To combat this we wrote a simple "multi external" sensor class that completes only when all of the external tasks it watches have completed:
https://gist.github.com/jonbender/4da675e9385e2fbf66ff0bb591cc74d7
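As a rough illustration of the idea (this is not the code from the gist, and it deliberately avoids importing Airflow), the core check can be sketched in plain Python. The names all_external_tasks_succeeded, wait_for_all and get_state are hypothetical; in a real sensor the check would live in the poke() method of a BaseSensorOperator subclass and get_state would be a query against the task instance table.

```python
import time


def all_external_tasks_succeeded(dependencies, get_state):
    """Return True only if every (dag_id, task_id) pair reports "success".

    get_state is a hypothetical callable standing in for a database lookup.
    """
    return all(
        get_state(dag_id, task_id) == "success"
        for dag_id, task_id in dependencies
    )


def wait_for_all(dependencies, get_state, poke_interval=60, timeout=3600,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until all dependencies succeed; return False once timeout passes."""
    deadline = clock() + timeout
    while True:
        if all_external_tasks_succeeded(dependencies, get_state):
            return True
        if clock() >= deadline:
            return False
        sleep(poke_interval)
```

One loop like this occupies a single slot no matter how many (dag_id, task_id) pairs it watches, which is the saving over one sensor per dependency.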

You could use something like that, or you could add a single DummyOperator such as "all_a_tasks_completed" downstream of all tasks in DAG A, and likewise for DAG B. Then you would need only a single external dependency per DAG.
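To make the difference concrete, the sketch below lists the external dependencies DAG C would need under each approach. It is plain Python rather than a DAG definition, and the task names (extract, load and so on) and helper functions are invented for illustration; only the "all_a_tasks_completed" naming comes from the suggestion above.

```python
def deps_per_task(upstream):
    """One external dependency for every task in every upstream DAG."""
    return [
        (dag_id, task_id)
        for dag_id, task_ids in upstream.items()
        for task_id in task_ids
    ]


def deps_per_dag(upstream, marker="all_{dag}_tasks_completed"):
    """One external dependency per upstream DAG, on its final marker task."""
    return [(dag_id, marker.format(dag=dag_id)) for dag_id in upstream]


# Hypothetical task lists for the two upstream DAGs.
upstream = {
    "a": ["extract", "transform", "load"],
    "b": ["fetch", "validate"],
}

print(len(deps_per_task(upstream)))  # dependencies without marker tasks
print(deps_per_dag(upstream))        # dependencies with one marker per DAG
```

Because the marker task sits downstream of everything else in its DAG, waiting on it alone is equivalent to waiting on all of that DAG's tasks, assuming the default behaviour where a task runs only after all of its upstream tasks succeed.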





> Set Multi dag dependency 
> -------------------------
>
>                 Key: AIRFLOW-1825
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1825
>             Project: Apache Airflow
>          Issue Type: Task
>            Reporter: Snigdha Nair
>
> I have 3 DAGs: A, B and C. DAG C should be triggered only after the tasks in DAG A and B complete. Is there a way to implement this in Airflow? I am able to set a dependency between DAG A and C using the TriggerDagRunOperator. But when I try to set a dependency between DAG B and C, C is triggered when either A or B completes. Can someone please help me solve this? I understand that the ExternalTaskSensor operator can be used, but it continuously polls to check whether the tasks in DAG A and B are complete, which might cause a performance hit over time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)