Posted to commits@airflow.apache.org by "Brian Nutt (JIRA)" <ji...@apache.org> on 2019/04/28 05:12:00 UTC

[jira] [Updated] (AIRFLOW-4424) Scheduler does not terminate after num_runs when executor is KubernetesExecutor

     [ https://issues.apache.org/jira/browse/AIRFLOW-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Nutt updated AIRFLOW-4424:
--------------------------------
    Priority: Blocker  (was: Major)

> Scheduler does not terminate after num_runs when executor is KubernetesExecutor
> -------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-4424
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4424
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: kubernetes, scheduler
>    Affects Versions: 1.10.3
>         Environment: EKS, deployed with stable airflow helm chart
>            Reporter: Brian Nutt
>            Priority: Blocker
>             Fix For: 1.10.3, 1.10.4
>
>
> When using an executor such as CeleryExecutor with num_runs set on the scheduler, the scheduler pod restarts after num_runs runs have completed. After switching to KubernetesExecutor, the scheduler logs:
> [2019-04-26 19:20:43,562] {kubernetes_executor.py:770} INFO - Shutting down Kubernetes executor
> However, the scheduler process does not exit, so the scheduler pod never restarts to begin another num_runs cycle. This forced a rollback to CeleryExecutor: with num_runs set to -1, the scheduler accumulates a large number of defunct processes, and tasks eventually cannot be scheduled because the underlying nodes run out of file descriptors.
>  
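
One way to confirm the defunct-process buildup described above is to count zombie children per parent PID on the affected node. The following is a minimal sketch, assuming a Linux host with procps `ps` available; the helper name `count_defunct` is hypothetical and not part of Airflow.

```python
import subprocess
from collections import Counter


def count_defunct(ps_lines):
    """Count zombie (defunct) processes per parent PID.

    Expects lines in the format printed by `ps -eo ppid,stat,comm`,
    where a process state starting with "Z" marks a zombie.
    """
    counts = Counter()
    for line in ps_lines:
        parts = line.split(None, 2)
        # Skip the header row and any malformed lines.
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        ppid, stat = int(parts[0]), parts[1]
        if stat.startswith("Z"):
            counts[ppid] += 1
    return counts


if __name__ == "__main__":
    # Assumption: run inside the scheduler pod or on the node itself.
    output = subprocess.run(
        ["ps", "-eo", "ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for ppid, n in count_defunct(output.splitlines()).most_common():
        print(f"parent pid {ppid}: {n} defunct children")
```

If the scheduler's PID shows a steadily growing count between runs, that would be consistent with the behavior reported here, though it does not by itself identify the cause.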



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)