Posted to commits@airflow.apache.org by "Avik Aggarwal (JIRA)" <ji...@apache.org> on 2018/08/23 14:04:00 UTC

[jira] [Updated] (AIRFLOW-2946) Connection times out on airflow worker

     [ https://issues.apache.org/jira/browse/AIRFLOW-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avik Aggarwal updated AIRFLOW-2946:
-----------------------------------
    Description: 
Hi 

I have an Airflow cluster running the Celery executor, with PostgreSQL installed on the same machine as the webserver and scheduler.

After some time, the remote worker reports 'Connection timed out'. Airflow then queues as many tasks as the pool is configured to allow, and the flow hangs there until the queued tasks are deleted manually after stopping the scheduler service.

 

Logs:

[2018-08-23 13:44:03,954: ERROR/MainProcess] Pool callback raised exception: OperationalError('(psycopg2.OperationalError) could not connect to server: Connection timed out\n\tIs the server running on host "<host>" and accepting\n\tTCP/IP connections on port 5432?\n',)
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python2.7/site-packages/billiard/pool.py", line 1747, in safe_apply_callback
    fun(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/worker/request.py", line 367, in on_failure
    self.id, exc, request=self, store_result=self.store_errors,
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/base.py", line 157, in mark_as_failure
    traceback=traceback, request=request)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/base.py", line 322, in store_result
    request=request, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py", line 53, in _inner
    return fun(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py", line 105, in _store_result
    session = self.ResultSession()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py", line 99, in ResultSession
    **self.engine_options)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/session.py", line 60, in session_factory
    self.prepare_models(engine)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/session.py", line 55, in prepare_models
    ResultModelBase.metadata.create_all(engine)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/sql/schema.py", line 4005, in create_all
    tables=tables)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1939, in _run_visitor
    with self._optional_conn_ctx_manager(connection) as conn:
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1932, in _optional_conn_ctx_manager
    with self.contextual_connect() as conn:
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 2123, in contextual_connect
    self._wrap_pool_connect(self.pool.connect, None),
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 2162, in _wrap_pool_connect
    e, dialect, self)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception_noconnection
    exc_info
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 265, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 2158, in _wrap_pool_connect
    return fn()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 403, in connect
    return _ConnectionFairy._checkout(self)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 791, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 532, in checkout
    rec = pool._do_get()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 1287, in _do_get
    return self._create_connection()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 350, in _create_connection
    return _ConnectionRecord(self)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 477, in __init__
    self.__connect(first_connect_check=True)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 674, in __connect
    connection = pool._invoke_creator(self)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 106, in connect
    return dialect.connect(*cargs, **cparams)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 412, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/psycopg2/__init__.py", line 130, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
OperationalError: (psycopg2.OperationalError) could not connect to server: Connection timed out
    Is the server running on host "<host>" and accepting
    TCP/IP connections on port 5432?

  was:
Hi 

I have an Airflow cluster running the Celery executor, with PostgreSQL installed on the same machine as the webserver and scheduler.

After some time, the remote worker reports 'Connection timed out'. Airflow then queues as many tasks as the pool is configured to allow, and the flow hangs there until the queued tasks are deleted manually after stopping the scheduler service.

 

Logs:

[2018-08-23 13:44:03,954: ERROR/MainProcess] Pool callback raised exception: OperationalError('(psycopg2.OperationalError) could not connect to server: Connection timed out\n\tIs the server running on host "34.232.109.233" and accepting\n\tTCP/IP connections on port 5432?\n',)
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python2.7/site-packages/billiard/pool.py", line 1747, in safe_apply_callback
    fun(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/worker/request.py", line 367, in on_failure
    self.id, exc, request=self, store_result=self.store_errors,
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/base.py", line 157, in mark_as_failure
    traceback=traceback, request=request)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/base.py", line 322, in store_result
    request=request, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py", line 53, in _inner
    return fun(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py", line 105, in _store_result
    session = self.ResultSession()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/__init__.py", line 99, in ResultSession
    **self.engine_options)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/session.py", line 60, in session_factory
    self.prepare_models(engine)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/celery/backends/database/session.py", line 55, in prepare_models
    ResultModelBase.metadata.create_all(engine)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/sql/schema.py", line 4005, in create_all
    tables=tables)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1939, in _run_visitor
    with self._optional_conn_ctx_manager(connection) as conn:
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1932, in _optional_conn_ctx_manager
    with self.contextual_connect() as conn:
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 2123, in contextual_connect
    self._wrap_pool_connect(self.pool.connect, None),
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 2162, in _wrap_pool_connect
    e, dialect, self)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception_noconnection
    exc_info
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 265, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 2158, in _wrap_pool_connect
    return fn()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 403, in connect
    return _ConnectionFairy._checkout(self)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 791, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 532, in checkout
    rec = pool._do_get()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 1287, in _do_get
    return self._create_connection()
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 350, in _create_connection
    return _ConnectionRecord(self)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 477, in __init__
    self.__connect(first_connect_check=True)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/pool.py", line 674, in __connect
    connection = pool._invoke_creator(self)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 106, in connect
    return dialect.connect(*cargs, **cparams)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 412, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/home/ubuntu/.local/lib/python2.7/site-packages/psycopg2/__init__.py", line 130, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
OperationalError: (psycopg2.OperationalError) could not connect to server: Connection timed out
    Is the server running on host "34.232.109.233" and accepting
    TCP/IP connections on port 5432?


> Connection times out on airflow worker
> --------------------------------------
>
>                 Key: AIRFLOW-2946
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2946
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: celery, executor, worker
>    Affects Versions: 1.10.0
>         Environment: ubuntu 16.04, AWS EC2 
>            Reporter: Avik Aggarwal
>            Priority: Critical


