Posted to commits@airflow.apache.org by "Abdul Wahab (Jira)" <ji...@apache.org> on 2020/06/25 09:13:00 UTC

[jira] [Comment Edited] (AIRFLOW-5858) Airflow celery worker missing heartbeat

    [ https://issues.apache.org/jira/browse/AIRFLOW-5858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17144777#comment-17144777 ] 

Abdul Wahab edited comment on AIRFLOW-5858 at 6/25/20, 9:12 AM:
----------------------------------------------------------------

I am using 1.10.10 and still have the same problem. After a worker misses a heartbeat, it stops taking new tasks. Manually restarting the worker fixes it.

P.S. Is there a workaround for now? For example, could a worker be restarted automatically when it misses a heartbeat, perhaps by a script as a temporary solution?
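
A minimal sketch of what such a temporary watchdog script might look like, not a confirmed fix for this issue. It assumes that a stuck worker also stops answering Celery's remote-control ping, which this thread does not verify. The names watch_worker and celery_ping, the worker name, and the restart command are hypothetical placeholders. The Celery app is assumed to be importable from airflow.executors.celery_executor, as the worker banner in the issue suggests.

```python
import subprocess
import time
from typing import Callable, Optional


def watch_worker(
    is_alive: Callable[[], bool],
    restart: Callable[[], None],
    max_failures: int = 3,
    interval_s: float = 60.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll is_alive() and call restart() after max_failures consecutive
    failed checks. Returns the number of restarts that were triggered."""
    failures = 0
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if is_alive():
            failures = 0  # any successful check resets the streak
        else:
            failures += 1
            if failures >= max_failures:
                restart()
                restarts += 1
                failures = 0
        sleep(interval_s)
    return restarts


def celery_ping(worker_name: str, timeout_s: float = 10.0) -> bool:
    """Return True if the named worker answers a remote-control ping."""
    # Imported lazily so the polling logic above has no Celery dependency.
    # Assumption: Airflow 1.10.x exposes its Celery app at this path.
    from airflow.executors.celery_executor import app

    replies = app.control.ping(destination=[worker_name], timeout=timeout_s)
    return bool(replies)


if __name__ == "__main__":
    # Hypothetical values: replace with the real worker name and with
    # whatever command restarts the worker in your deployment.
    WORKER = "celery@my-airflow-worker"
    RESTART_CMD = ["systemctl", "restart", "airflow-worker"]

    watch_worker(
        is_alive=lambda: celery_ping(WORKER),
        restart=lambda: subprocess.run(RESTART_CMD, check=False),
    )
```

Requiring several consecutive failed checks before restarting is a deliberate choice here, so that a single slow reply from the broker does not kill a worker that is still running tasks.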


was (Author: wahab.icp):
I am having the same problem. After a worker misses a heartbeat, it stops taking new tasks. Manually restarting the worker fixes it.

P.S. Is there a workaround for now? For example, could a worker be restarted automatically when it misses a heartbeat, perhaps by a script as a temporary solution?

> Airflow celery worker missing heartbeat
> ---------------------------------------
>
>                 Key: AIRFLOW-5858
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5858
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: celery
>    Affects Versions: 1.10.2
>            Reporter: Nikhil SInghal
>            Priority: Major
>
> Our Airflow setup uses the Celery executor with Redis as the broker. We are facing an issue of missed heartbeats from Celery workers. Once a heartbeat is missed, the worker stops taking any new tasks. It also appears as offline in the Celery Flower UI. Manually restarting the workers fixes the problem.
> We would like to know whether this is a known issue or one that other users in the community are also facing.
> These are the logs from the failure:
> [2019-11-06 02:30:36,368: INFO/MainProcess] missed heartbeat from celery@dp-airflow-worker-6cb4b596f8-nzdrt
> worker: Warm shutdown (MainProcess)
> -------------- celery@dp-airflow-worker-6cb4b596f8-4qkfg v4.1.1 (latentcall)
> ---- **** -----
> --- * *** * -- Linux-4.9.0-9-amd64-x86_64-with-debian-10.1 2019-11-05 15:24:34
> -- * - **** ---
> - ** ---------- [config]
> - ** ---------- .> app: airflow.executors.celery_executor:0x7f2a01250cf8
> - ** ---------- .> transport: redis://:**@redis-11313.internal.c3160.ap-southeast-1-mz.ec2.cloud.rlrcp.com:11313//
> - ** ---------- .> results: postgresql://airflow:**@airflowdbprod.ckvce9fjaook.ap-southeast-1.rds.amazonaws.com:5432/airflowdb
> - *** --- * --- .> concurrency: 64 (prefork)
> -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
> --- ***** -----
>  -------------- [queues]
>  .> default exchange=default(direct) key=default
> [tasks]
>  . airflow.executors.celery_executor.execute_command



--
This message was sent by Atlassian Jira
(v8.3.4#803005)