Posted to users@airflow.apache.org by "Shaw, Damian P. " <da...@credit-suisse.com> on 2020/02/24 16:17:30 UTC
Airflow Worker settings for retrying to connect to Metadata DB?
Hi all,
Is there a way to get the Airflow worker (started by Celery) to retry connecting to the metadata DB when the connection times out?
And, in case it's related, what does the worker precheck setting do when set to True? Will it retry connections if they fail?
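For context, here is the kind of behaviour I'm hoping for, sketched with plain SQLAlchemy rather than Airflow's own config (an in-memory SQLite engine stands in for MySQL): the pool_pre_ping option pings each pooled connection before handing it out and transparently replaces stale ones. I don't know whether Airflow 1.10.6 exposes an equivalent knob for the worker, hence the question.

```python
from sqlalchemy import create_engine, text

# Sketch only: SQLite stands in for the MySQL metadata DB.
# pool_pre_ping issues a lightweight ping on checkout and
# reconnects transparently if the pooled connection has gone stale.
engine = create_engine("sqlite://", pool_pre_ping=True)

with engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()

print(result)  # 1
```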
Details of the issue:
I'm currently on Airflow 1.10.6 using the CeleryExecutor with Redis and a MySQL metadata DB. Recently a few tasks have been failing before they start; Airflow sends out an email that says:
Executor reports task instance finished (failed) although the task says its queued. Was the task killed externally?
Digging into the Airflow worker stderr, I see this exception:
[2020-02-24 06:08:14,718: INFO/ForkPoolWorker-9] Executing command in Celery: ['airflow', 'run', 'my_dag_id', 'my_task_id', '2020-02-23T10:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '.../dag_creator.py']
[2020-02-24 06:08:26,986: ERROR/ForkPoolWorker-9] execute_command encountered a CalledProcessError
Traceback (most recent call last):
File ".../lib/python3.7/site-packages/airflow/executors/celery_executor.py", line 67, in execute_command
close_fds=True, env=env)
File ".../lib/python3.7/subprocess.py", line 363, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['airflow', 'run', 'my_dag_id', 'my_task_id', '2020-02-23T10:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '.../dag_creator.py']' returned non-zero exit status 1.
And digging into the Airflow worker stdout at the same time, I see a dropped connection to the metadata DB:
[2020-02-24 06:08:16,620] {cli.py:545} INFO - Running <TaskInstance: my_dag_id.my_task_id 2020-02-23T10:00:00+00:00 [queued]> on host my_app_host.my.internal_domain.net
Traceback (most recent call last):
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/pymysql/connections.py", line 583, in connect
**kwargs)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/socket.py", line 727, in create_connection
raise err
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/socket.py", line 716, in create_connection
sock.connect(sa)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2276, in _wrap_pool_connect
return fn()
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 363, in connect
return _ConnectionFairy._checkout(self)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 760, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 492, in checkout
rec = pool._do_get()
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/pool/impl.py", line 238, in _do_get
return self._create_connection()
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 308, in _create_connection
return _ConnectionRecord(self)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 437, in __init__
self.__connect(first_connect_check=True)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 639, in __connect
connection = pool._invoke_creator(self)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
return dialect.connect(*cargs, **cparams)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 482, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/pymysql/__init__.py", line 94, in Connect
return Connection(*args, **kwargs)
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/pymysql/connections.py", line 325, in __init__
self.connect()
File "/home/qthft/.conda/envs/qt_data_airflow_106/lib/python3.7/site-packages/pymysql/connections.py", line 630, in connect
raise exc
pymysql.err.OperationalError: (2003, "Can't connect to MySQL server on 'my_db_host.my.internal_domain.net' (timed out)")
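If there's no built-in setting, is wrapping the connect attempt in a retry loop the way people handle this? A minimal stdlib-only sketch of what I mean (the function names here are hypothetical, not Airflow API):

```python
import socket
import time

# Hypothetical workaround sketch: retry a connect callable with
# linear backoff before giving up, which is the behaviour I'd like
# the worker to have when the metadata DB times out.
def connect_with_retry(connect, retries=3, delay=0.1):
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except (socket.timeout, OSError):
            if attempt == retries:
                raise  # out of attempts, surface the original error
            time.sleep(delay * attempt)

# Simulate a DB that times out twice, then accepts the connection.
calls = []
def flaky_connect():
    calls.append(1)
    if len(calls) < 3:
        raise socket.timeout("timed out")
    return "connection"

print(connect_with_retry(flaky_connect))  # connection
```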
Any help is appreciated.
Regards
Damian