Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/04/20 14:32:08 UTC
[GitHub] [airflow] bensonnd commented on issue #13542: Task stuck in "scheduled" or "queued" state, pool has all slots queued, nothing is executing
bensonnd commented on issue #13542:
URL: https://github.com/apache/airflow/issues/13542#issuecomment-823322869
Following on what @pelaprat mentioned, we are not running with the CeleryExecutor or KubernetesExecutor, but with the LocalExecutor in a Docker container. Tasks get stuck in the scheduled or queued state, and the DAG run is marked as running even though nothing is actually executing. It seems like the scheduler falls asleep or misses queued tasks.
Either clearing the queued tasks or restarting the scheduler with `airflow scheduler` inside the container gets it moving again.
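For anyone wanting to see which tasks are stuck before clearing them, this is a sketch of the kind of metadata-database query that surfaces them. Note this is not Airflow's own code: the table here is a minimal mock of Airflow's `task_instance` schema built in an in-memory SQLite database, and the one-hour cutoff is an arbitrary choice for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Mock metadata DB with a minimal version of Airflow's task_instance table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance "
    "(dag_id TEXT, task_id TEXT, state TEXT, queued_dttm TEXT)"
)

now = datetime.now(timezone.utc)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [
        # One task stuck in "queued" for two hours, one that finished normally.
        ("ingest_dag", "load", "queued", (now - timedelta(hours=2)).isoformat()),
        ("ingest_dag", "extract", "success", (now - timedelta(hours=3)).isoformat()),
    ],
)

# Find tasks that have sat in "queued" or "scheduled" longer than an hour;
# these are the ones to clear by hand to get the scheduler moving again.
cutoff = (now - timedelta(hours=1)).isoformat()
stuck = conn.execute(
    "SELECT dag_id, task_id FROM task_instance "
    "WHERE state IN ('queued', 'scheduled') AND queued_dttm < ?",
    (cutoff,),
).fetchall()
print(stuck)  # -> [('ingest_dag', 'load')]
```

In practice we do the clearing through the UI (or `airflow tasks clear`) rather than touching the database directly; the query is just a way to spot the stuck instances.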
We've observed two recurring log patterns when it gets into this stuck state: one repeatedly detecting zombie jobs, and the other just running the regular heartbeat check.
```
# DAGs # Errors Last Runtime Last Run
------------------------------------------- ------ --------- -------- ---------- -------------- -------------------
/opt/ingest/batch_ingest/dags/ingest_dag.py 120318 4.02s 1 0 5.43s 2021-04-08T16:37:43
================================================================================
[2021-04-08 16:37:58,444] {dag_processing.py:1071} INFO - Finding 'running' jobs without a recent heartbeat
[2021-04-08 16:37:58,445] {dag_processing.py:1075} INFO - Failing jobs without heartbeat after 2021-04-08 16:32:58.445055+00:00
[2021-04-08 16:37:58,455] {dag_processing.py:1098} INFO - Detected zombie job: {'full_filepath': '/opt/ingest/batch_ingest/dags/ingest_dag.py', 'msg': 'Detected as zombie', 'simple_task_instance': <airflow.models.taskinstance.Si>
[2021-04-08 16:38:08,595] {dag_processing.py:1071} INFO - Finding 'running' jobs without a recent heartbeat
[2021-04-08 16:38:08,596] {dag_processing.py:1075} INFO - Failing jobs without heartbeat after 2021-04-08 16:33:08.596291+00:00
[2021-04-08 16:38:08,607] {dag_processing.py:1098} INFO - Detected zombie job: {'full_filepath': '/opt/ingest/batch_ingest/dags/ingest_dag.py', 'msg': 'Detected as zombie', 'simple_task_instance': <airflow.models.taskinstance.Si>
[2021-04-08 16:38:18,650] {dag_processing.py:1071} INFO - Finding 'running' jobs without a recent heartbeat
[2021-04-08 16:38:18,651] {dag_processing.py:1075} INFO - Failing jobs without heartbeat after 2021-04-08 16:33:18.651308+00:00
[2021-04-08 16:38:18,661] {dag_processing.py:1098} INFO - Detected zombie job: {'full_filepath': '/opt/ingest/batch_ingest/dags/ingest_dag.py', 'msg': 'Detected as zombie', 'simple_task_instance': <airflow.models.taskinstance.Si>
[2021-04-08 16:38:22,690] {dag_processing.py:838} INFO -
================================================================================
DAG File Processing Stats
```
or
```
File Path PID Runtime # DAGs # Errors Last Runtime Last Run
------------------------------------------- ----- --------- -------- ---------- -------------- -------------------
/opt/ingest/batch_ingest/dags/ingest_dag.py 1 0 1.52s 2021-04-08T18:29:22
================================================================================
[2021-04-08 18:29:33,015] {dag_processing.py:1071} INFO - Finding 'running' jobs without a recent heartbeat
[2021-04-08 18:29:33,016] {dag_processing.py:1075} INFO - Failing jobs without heartbeat after 2021-04-08 18:24:33.016077+00:00
[2021-04-08 18:29:43,036] {dag_processing.py:1071} INFO - Finding 'running' jobs without a recent heartbeat
[2021-04-08 18:29:43,037] {dag_processing.py:1075} INFO - Failing jobs without heartbeat after 2021-04-08 18:24:43.037136+00:00
[2021-04-08 18:29:53,072] {dag_processing.py:1071} INFO - Finding 'running' jobs without a recent heartbeat
[2021-04-08 18:29:53,072] {dag_processing.py:1075} INFO - Failing jobs without heartbeat after 2021-04-08 18:24:53.072257+00:00
[2021-04-08 18:29:53,080] {dag_processing.py:838} INFO -
================================================================================
DAG File Processing Stats
```
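As an aside, the five-minute gap between each "Finding 'running' jobs" timestamp and the "Failing jobs without heartbeat after ..." cutoff in the logs above matches Airflow's default zombie threshold; we are running with the stock setting, i.e. roughly this in airflow.cfg (setting name as we understand it for Airflow 2.0.x, worth double-checking against your version's config reference):

```ini
[scheduler]
# Local task jobs that miss a heartbeat for longer than this many
# seconds are considered zombies (default shown).
scheduler_zombie_task_threshold = 300
```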
We are in the process of upgrading to 2.0.2, as @kaxil suggested, to see if that resolves the issue.