Posted to commits@airflow.apache.org by "Alessio Palma (JIRA)" <ji...@apache.org> on 2017/03/23 08:53:41 UTC
[jira] [Created] (AIRFLOW-1029)
https://issues.apache.org/jira/browse/AIRFLOW
Alessio Palma created AIRFLOW-1029:
--------------------------------------
Summary: https://issues.apache.org/jira/browse/AIRFLOW
Key: AIRFLOW-1029
URL: https://issues.apache.org/jira/browse/AIRFLOW-1029
Project: Apache Airflow
Issue Type: Bug
Components: scheduler
Affects Versions: 1.8.0rc5
Reporter: Alessio Palma
Attachments: image (1).png, PannelloAIRFLOW 2.png
I'm using:
AIRFLOW 1.8.0RC5
ERLANG 19.2
RABBIT 3.6.7
PYTHON 2.7
When I start a DAG from the panel (see picture), the scheduler stops working.
After some investigation, I found that the problem arises here:
    def sync(self):

        self.logger.debug(
            "Inquiring about {} celery task(s)".format(len(self.tasks)))

        for key, async in list(self.tasks.items()):
            state = async.state  # <---- HERE
The Python stack trace says that the connection is closed. Capturing some TCP traffic, I can see that the connection to RabbitMQ is closed (TCP FIN) before a STOMP frame is sent, so RabbitMQ replies with TCP RST (see picture 2: 172.1.0.2 -> RabbitMQ node, 172.1.0.1 -> Airflow node).
This exception stops the scheduler.
If you are using airflow-scheduler-failover-controller, the scheduler is restarted, but this is just a workaround and does not fix the problem at its root.
Is it safe to trap the exception?
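
To make the question concrete, here is a minimal sketch of what trapping the exception around the state lookup could look like. It is not the Airflow implementation: FlakyResult, GoodResult and poll_states are hypothetical names used only for illustration, and the exception type raised by a closed broker connection is assumed, so the sketch catches Exception broadly.

```python
import logging

logger = logging.getLogger("executor_sketch")


class FlakyResult(object):
    """Hypothetical stand-in for a result object whose broker connection is closed."""

    @property
    def state(self):
        # Assumption: the real failure surfaces as some exception on attribute access.
        raise IOError("connection closed")


class GoodResult(object):
    """Hypothetical stand-in for a result object with a healthy connection."""

    state = "SUCCESS"


def poll_states(tasks):
    """Return {key: state}, skipping tasks whose state lookup raised."""
    states = {}
    for key, result in list(tasks.items()):
        try:
            states[key] = result.state
        except Exception:
            # Log and continue so one failed lookup does not stop the loop.
            # The task is left in `tasks`, so the caller can retry it later.
            logger.exception("Could not fetch state for %s", key)
    return states


if __name__ == "__main__":
    print(poll_states({"a": FlakyResult(), "b": GoodResult()}))
```

With this shape a single failed lookup is logged and the remaining tasks are still polled. Whether silently retrying is actually safe for the scheduler is exactly the open question.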
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)