Posted to commits@airflow.apache.org by "Sergio Kef (Jira)" <ji...@apache.org> on 2019/10/11 13:06:00 UTC

[jira] [Commented] (AIRFLOW-5214) Airflow leaves too many TIME_WAIT TCP connections

    [ https://issues.apache.org/jira/browse/AIRFLOW-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949426#comment-16949426 ] 

Sergio Kef commented on AIRFLOW-5214:
-------------------------------------

Hey [~oricken], thanks for this ticket. Can you provide an example DAG and some instructions to reproduce it? Where do you host Airflow and the database?

> Airflow leaves too many TIME_WAIT TCP connections
> -------------------------------------------------
>
>                 Key: AIRFLOW-5214
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5214
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DagRun, database
>    Affects Versions: 1.10.2, 1.10.4
>         Environment: CentOS 7, Airflow 1.10.4, MariaDB
>            Reporter: Oliver Ricken
>            Priority: Critical
>
> Dear experts,
> in Airflow version 1.10.2 as well as 1.10.4, we experience a severe problem that limits the number of concurrent tasks.
> We observe that when more than 8 tasks are started and executed in parallel, the majority of those tasks fail with the error "Can't connect to MySQL server" and error code 2006(99). This error code boils down to "Cannot bind socket to resource", which is why we started looking into the TCP connections of our Airflow host (a single node that hosts the webserver, scheduler and worker).
> When the 8 tasks are running simultaneously, we observe more than 15,000 TIME_WAIT connections while fewer than 50 are established. Given that the number of available ports is somewhat smaller than 30,000, this large number of blocked but unused TCP connections would explain the failure of further task executions.
> Can anyone explain how so many open connections blocking ports/sockets come about? Given that we have connection pooling enabled, we do not see any explanation yet.
> Your help is very much appreciated; this issue strongly limits our current performance!
> Cheers
> Oliver
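
For anyone trying to reproduce the numbers quoted above, here is a minimal sketch of how the TIME_WAIT count could be compared against the ephemeral port range on a Linux host such as the CentOS 7 node described. It is not part of Airflow; the function names count_tcp_states and ephemeral_port_count are hypothetical helpers made up for this illustration. It assumes the standard Linux /proc/net/tcp layout (state in the fourth column as a hex code, 06 = TIME_WAIT) and only covers IPv4 sockets; /proc/net/tcp6 would need the same treatment for IPv6.

```python
from collections import Counter

# Hex codes used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED",
    "02": "SYN_SENT",
    "03": "SYN_RECV",
    "04": "FIN_WAIT1",
    "05": "FIN_WAIT2",
    "06": "TIME_WAIT",
    "07": "CLOSE",
    "08": "CLOSE_WAIT",
    "09": "LAST_ACK",
    "0A": "LISTEN",
    "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from /proc/net/tcp-formatted text.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def ephemeral_port_count(port_range_text):
    """Number of ports in a 'low high' range, as found in
    /proc/sys/net/ipv4/ip_local_port_range (both ends inclusive)."""
    low, high = (int(part) for part in port_range_text.split())
    return high - low + 1


if __name__ == "__main__":
    with open("/proc/net/tcp") as f:
        states = count_tcp_states(f.read())
    with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
        ports = ephemeral_port_count(f.read())
    print("TIME_WAIT:  ", states["TIME_WAIT"])
    print("ESTABLISHED:", states["ESTABLISHED"])
    print("ephemeral ports available:", ports)
```

Running this while the parallel tasks are active should show whether the TIME_WAIT count approaches the size of the ephemeral port range, which is the condition the description above suspects.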



--
This message was sent by Atlassian Jira
(v8.3.4#803005)