Posted to commits@airflow.apache.org by "Gary Harpaz (Jira)" <ji...@apache.org> on 2019/10/30 14:21:00 UTC

[jira] [Created] (AIRFLOW-5818) Very bad webserver performance when defining many dags with many operators

Gary Harpaz created AIRFLOW-5818:
------------------------------------

             Summary: Very bad webserver performance when defining many dags with many operators
                 Key: AIRFLOW-5818
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5818
             Project: Apache Airflow
          Issue Type: Bug
          Components: webserver
    Affects Versions: 1.10.5, 1.10.4, 1.10.3, 1.10.2, 1.10.1, 1.10.0, 1.9.0
            Reporter: Gary Harpaz
         Attachments: dup_dags.py, my_dag.template

In my scenario I have defined 500 DAGs, each with approximately 1500 operators.

This makes the webserver impossible to work with, even when all DAGs are paused and nothing is running. CPU usage spikes constantly and the webserver consumes huge amounts of memory for no reason.

To reproduce this, use the attached my_dag.template file and duplicate it using the attached dup_dags.py script.
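
For readers without access to the attachments, the duplication step might look roughly like the sketch below. This is not the attached dup_dags.py: the function name, the output file naming, and the `{{DAG_INDEX}}` placeholder are assumptions made for illustration, and the real template may use a different convention.

```python
from pathlib import Path


def duplicate_template(template_path, out_dir, count, placeholder="{{DAG_INDEX}}"):
    """Write `count` copies of a template file, one per generated DAG file.

    The placeholder token is hypothetical; each copy gets its index
    substituted so that every generated file can define a distinct DAG id.
    """
    template = Path(template_path).read_text()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for i in range(count):
        # Hypothetical naming scheme: my_dag_0.py, my_dag_1.py, ...
        target = out / f"my_dag_{i}.py"
        target.write_text(template.replace(placeholder, str(i)))
        written.append(target)
    return written
```

Calling `duplicate_template("my_dag.template", "dags", 500)` would then produce the 500 files described above, assuming the output directory is the one the webserver scans for DAG files.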

 

The root cause of this issue is that the DagBag loads all DAGs into memory, which consumes a large amount of CPU and memory unnecessarily.
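
To illustrate why eager loading scales badly, here is a toy comparison. It is not Airflow code and is not necessarily how the branch linked below addresses the problem: `EagerBag`, `LazyBag`, and `parse` are hypothetical names, with `parse` standing in for whatever work is needed to build one DAG object.

```python
class EagerBag:
    """Toy stand-in: parses every definition as soon as the bag is created."""

    def __init__(self, sources, parse):
        self._items = {name: parse(src) for name, src in sources.items()}

    def get(self, name):
        return self._items[name]


class LazyBag:
    """Toy stand-in: parses a definition only when it is first requested."""

    def __init__(self, sources, parse):
        self._sources = sources
        self._parse = parse
        self._cache = {}

    def get(self, name):
        if name not in self._cache:
            self._cache[name] = self._parse(self._sources[name])
        return self._cache[name]


def make_counting_parser():
    """Return a parse function and a counter recording how often it ran."""
    calls = {"n": 0}

    def parse(src):
        calls["n"] += 1
        return src.upper()  # placeholder for expensive DAG construction

    return parse, calls
```

With 500 sources, constructing `EagerBag` invokes `parse` 500 times before any page is served, while `LazyBag` invokes it once per DAG that is actually requested.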

I have already fixed this in:

[https://github.com/gary-harpaz/airflow/tree/improve-performance]

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)