Posted to commits@airflow.apache.org by "Ash Berlin-Taylor (Jira)" <ji...@apache.org> on 2019/10/30 21:43:00 UTC

[jira] [Commented] (AIRFLOW-5818) Very bad webserver performance when defining many dags with many operators

    [ https://issues.apache.org/jira/browse/AIRFLOW-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963455#comment-16963455 ] 

Ash Berlin-Taylor commented on AIRFLOW-5818:
--------------------------------------------

I think we have already fixed this in a different way via AIP-24: https://github.com/apache/airflow/pull/5743. It is not included in the 1.10.6 release, but should be available behind a config flag in 1.10.7 via https://github.com/apache/airflow/pull/5743, and will be the default and only option from 2.0.0. It would be great if you could test this out and see whether it improves things for you.

> Very bad webserver performance when defining many dags with many operators
> --------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5818
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5818
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: webserver
>    Affects Versions: 1.9.0, 1.10.0, 1.10.1, 1.10.2, 1.10.3, 1.10.4, 1.10.5
>            Reporter: Gary Harpaz
>            Priority: Blocker
>         Attachments: dup_dags.py, my_dag.template
>
>
> In my scenario I have defined 500 DAGs, each with approximately 1500 operators.
> This makes the webserver impossible to work with, even when all DAGs are paused and nothing is running. CPU usage spikes constantly and the webserver consumes huge amounts of memory for no apparent reason.
> To reproduce this, use the attached my_dag.template file and duplicate it using the attached dup_dags.py script.
>  
> The root cause of this issue is that the DagBag loads all DAGs into memory, which consumes a large amount of CPU and memory unnecessarily.
> I have already fixed this in:
> [https://github.com/gary-harpaz/airflow/tree/improve-performance]
>  
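For readers without access to the attachments, the reproduction step described above (copying one DAG template many times) might look roughly like the sketch below. This is not the attached dup_dags.py, whose contents are not shown here. The `$dag_id` placeholder, the `my_dag_NNNN` naming scheme, and the `duplicate_dags` function are all assumptions made for illustration.

```python
from pathlib import Path
from string import Template


def duplicate_dags(template_text, out_dir, count):
    """Write `count` copies of a DAG template, each with a unique dag_id.

    Hypothetical helper: assumes the template marks the DAG id with a
    `$dag_id` placeholder, which may not match the real my_dag.template.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    template = Template(template_text)
    written = []
    for i in range(count):
        dag_id = f"my_dag_{i:04d}"  # assumed naming scheme
        path = out_dir / f"{dag_id}.py"
        # safe_substitute leaves any other `$...` text in the template untouched
        path.write_text(template.safe_substitute(dag_id=dag_id))
        written.append(path)
    return written
```

Under these assumptions, calling `duplicate_dags(Path("my_dag.template").read_text(), "dags", 500)` would fill a `dags` directory with 500 DAG files, matching the scale described in the report.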


