Posted to commits@airflow.apache.org by "Dan Davydov (JIRA)" <ji...@apache.org> on 2017/04/25 03:27:04 UTC

[jira] [Updated] (AIRFLOW-1143) Tasks rejected by workers get stuck in QUEUED

     [ https://issues.apache.org/jira/browse/AIRFLOW-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Davydov updated AIRFLOW-1143:
---------------------------------
    Description: 
If the scheduler sends a task to a worker and the worker then rejects it (e.g. because one of the task's dependencies is no longer met, such as its pool having become full), the task gets stuck in the QUEUED state. We hit this when trying to switch from invoking the scheduler as "airflow scheduler -n 5" to just "airflow scheduler".

Restarting the scheduler fixes this because it cleans up orphans, but we shouldn't have to restart the scheduler to fix these problems (the missing job heartbeats should make the scheduler requeue the task).

  was:
If the scheduler sends a task to a worker and the worker then rejects it (e.g. because one of the task's dependencies is no longer met, such as its pool having become full), the task gets stuck in the QUEUED state.

Restarting the scheduler fixes this because it cleans up orphans, but we shouldn't have to restart the scheduler to fix these problems (the missing job heartbeats should make the scheduler requeue the task).


> Tasks rejected by workers get stuck in QUEUED
> ---------------------------------------------
>
>                 Key: AIRFLOW-1143
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1143
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Dan Davydov
>
> If the scheduler sends a task to a worker and the worker then rejects it (e.g. because one of the task's dependencies is no longer met, such as its pool having become full), the task gets stuck in the QUEUED state. We hit this when trying to switch from invoking the scheduler as "airflow scheduler -n 5" to just "airflow scheduler".
> Restarting the scheduler fixes this because it cleans up orphans, but we shouldn't have to restart the scheduler to fix these problems (the missing job heartbeats should make the scheduler requeue the task).
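
The behaviour the description asks for (requeueing a QUEUED task once its job heartbeats go missing, without a scheduler restart) could look roughly like the sketch below. It is only an illustration and is not Airflow code: TaskRecord, requeue_stale_queued, the state strings, and the timeout value are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional

# Hypothetical state names for this sketch; not imported from Airflow.
QUEUED = "queued"
SCHEDULED = "scheduled"
RUNNING = "running"


@dataclass
class TaskRecord:
    """Hypothetical stand-in for a task instance row (not Airflow's model)."""
    task_id: str
    state: str
    queued_at: datetime
    # None means no worker job has ever sent a heartbeat for this task.
    last_heartbeat: Optional[datetime] = None


def requeue_stale_queued(
    tasks: Iterable[TaskRecord], now: datetime, timeout: timedelta
) -> List[str]:
    """Move QUEUED tasks with no recent job heartbeat back to SCHEDULED.

    A task counts as stale when the time since its last heartbeat (or since
    it was queued, if it never had one) exceeds `timeout`.
    """
    requeued = []
    for task in tasks:
        if task.state != QUEUED:
            continue  # only QUEUED tasks can be stuck in the way described
        last_seen = task.last_heartbeat or task.queued_at
        if now - last_seen > timeout:
            task.state = SCHEDULED  # let the scheduler pick it up again
            task.last_heartbeat = None
            requeued.append(task.task_id)
    return requeued


if __name__ == "__main__":
    now = datetime(2017, 4, 25, 3, 27)
    tasks = [
        TaskRecord("rejected_by_worker", QUEUED, now - timedelta(minutes=10)),
        TaskRecord("just_queued", QUEUED, now - timedelta(seconds=5)),
    ]
    print(requeue_stale_queued(tasks, now, timedelta(minutes=5)))
```

Running a check like this on every scheduler loop, rather than only at startup, is the design choice the report argues for. The five-minute timeout in the usage example is an arbitrary assumption.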


