Posted to commits@airflow.apache.org by "Tao Feng (JIRA)" <ji...@apache.org> on 2018/03/04 07:50:00 UTC

[jira] [Commented] (AIRFLOW-2128) 'Tall' DAGs scale worse than 'wide' DAGs

    [ https://issues.apache.org/jira/browse/AIRFLOW-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16385011#comment-16385011 ] 

Tao Feng commented on AIRFLOW-2128:
-----------------------------------

"For the wide DAG it was about 80 successfully executed tasks in 10 minutes, for the tall one it was 0."

So no tasks in the tall DAG executed successfully within 10 minutes?

> 'Tall' DAGs scale worse than 'wide' DAGs
> ----------------------------------------
>
>                 Key: AIRFLOW-2128
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2128
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, DagRun, scheduler
>    Affects Versions: 1.9.0
>            Reporter: Máté Szabó
>            Priority: Major
>              Labels: performance, usability
>         Attachments: tall_dag.py, wide_dag.py
>
>
> Tall DAG = a DAG with long chains of dependencies, e.g.: 0 -> 1 -> 2 -> ... -> 998 -> 999
> Wide DAG = a DAG with many short, parallel dependencies e.g. 0 -> 1; 0 -> 2; ... 0 -> 999
> Take a super simple case where both graphs are of 1000 tasks, and all the tasks are just "sleep 0.03" bash commands (see the attached files).
> With the default SequentialExecutor (without parallelism), I would expect my 2 example DAGs to take (approximately) the same time to run, but apparently this is not the case.
> For the wide DAG it was about 80 successfully executed tasks in 10 minutes, for the tall one it was 0.
> This anomaly also seems to affect the web UI. Opening the graph view or the tree view for the wide DAG takes about 6 seconds on my machine, but for the tall one it takes significantly longer; in fact, it currently does not load at all.
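
For reference, the two shapes described in the issue can be sketched in plain Python without Airflow. This is only an illustration of the dependency structure, not the attached tall_dag.py or wide_dag.py, and the helper names (tall_edges, wide_edges, longest_chain) are made up for this sketch. It shows that both graphs have the same number of tasks and edges and differ only in the length of the longest dependency chain, so on a sequential executor the pure sleep time would be about 1000 x 0.03 s = 30 s for either shape.

```python
from collections import defaultdict, deque


def tall_edges(n):
    """Chain of dependencies: 0 -> 1 -> 2 -> ... -> n-1."""
    return [(i, i + 1) for i in range(n - 1)]


def wide_edges(n):
    """Fan-out from a single root: 0 -> 1; 0 -> 2; ...; 0 -> n-1."""
    return [(0, i) for i in range(1, n)]


def longest_chain(n, edges):
    """Return the number of tasks on the longest dependency path.

    Walks the graph in topological order (Kahn's algorithm) so that it
    stays iterative and does not hit Python's recursion limit on a
    1000-task chain.
    """
    children = defaultdict(list)
    indegree = [0] * n
    for upstream, downstream in edges:
        children[upstream].append(downstream)
        indegree[downstream] += 1

    depth = [1] * n
    ready = deque(i for i in range(n) if indegree[i] == 0)
    while ready:
        task = ready.popleft()
        for child in children[task]:
            depth[child] = max(depth[child], depth[task] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    return max(depth)


N = 1000  # task count taken from the issue description
tall = tall_edges(N)
wide = wide_edges(N)

# Same task count and edge count, very different chain length.
print("tall:", len(tall), "edges, longest chain", longest_chain(N, tall))
print("wide:", len(wide), "edges, longest chain", longest_chain(N, wide))
```

The only structural difference is the chain length (1000 tasks versus 2), which suggests that the slowdown scales with dependency depth rather than with the number of tasks or edges.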



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)