You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "Jacob Ward (Jira)" <ji...@apache.org> on 2019/11/06 17:06:00 UTC

[jira] [Created] (AIRFLOW-5859) Tasks locking and heartbeat warnings

Jacob Ward created AIRFLOW-5859:
-----------------------------------

             Summary: Tasks locking and heartbeat warnings
                 Key: AIRFLOW-5859
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5859
             Project: Apache Airflow
          Issue Type: Bug
          Components: DagRun
    Affects Versions: 1.10.6
         Environment: Airflow using LocalExecutor and Postgres
            Reporter: Jacob Ward


Having two potentially related issues.

Issue 1:

Some of my tasks (has only been PythonOperators so far) are starting and then doing nothing. I had a task that usually executes in 10-30 minutes running for over 24hrs without any error messages in the logs (other than the heartbeat warnings shown below) and without failing.

So far this has only happened to tasks inside sub-dags, not sure if that's to do with it?

The logs for the sub-dag show: 


{code:java}
[2019-11-05 16:41:34,364] {logging_mixin.py:112} INFO - [2019-11-05 16:41:34,364] {backfill_job.py:363} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 7 | succeeded: 5 | running: 1 | failed: 0 | skipped: 1 | deadlocked: 0 | not ready: 7
[2019-11-05 16:41:34,831] {logging_mixin.py:112} INFO - [2019-11-05 16:41:34,830] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.986693 s
[2019-11-05 16:41:39,376] {logging_mixin.py:112} INFO - [2019-11-05 16:41:39,376] {backfill_job.py:363} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 7 | succeeded: 5 | running: 1 | failed: 0 | skipped: 1 | deadlocked: 0 | not ready: 7
[2019-11-05 16:41:39,859] {logging_mixin.py:112} INFO - [2019-11-05 16:41:39,859] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.986141 s
{code}
repeated during that 24hr period.


Issue 2:

In all of the logs this warning message is printed every 5 seconds:

{{}}
{code:java}
[2019-11-06 14:25:29,466] {logging_mixin.py:112} INFO - [2019-11-06 14:25:29,465] {local_task_job.py:124} WARNING - Time since last heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.987354 s {code}
{{}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)