You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "Ian Davison (JIRA)" <ji...@apache.org> on 2019/05/08 16:23:00 UTC

[jira] [Created] (AIRFLOW-4483) sqlalchemy.orm.exc.ObjectDeletedError on webserver startup

Ian Davison created AIRFLOW-4483:
------------------------------------

             Summary: sqlalchemy.orm.exc.ObjectDeletedError on webserver startup
                 Key: AIRFLOW-4483
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4483
             Project: Apache Airflow
          Issue Type: Bug
          Components: webserver
    Affects Versions: 1.10.3
            Reporter: Ian Davison


While running on Flask AppBuilder GUI, occasionally I've been seeing errors like this one on webserver startup, that causes the webserver to crash and become unresponsive:
{code:java}
| [2019-05-08 16:07:07,087] {security.py:307} INFO - Cleaning faulty perms
webserver_1 | [2019-05-08 16:07:07 +0000] [54] [ERROR] Exception in worker process
webserver_1 | Traceback (most recent call last):
webserver_1 | File "/usr/local/lib/python3.6/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
webserver_1 | worker.init_process()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/base.py", line 129, in init_process
webserver_1 | self.load_wsgi()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
webserver_1 | self.wsgi = self.app.wsgi()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/gunicorn/app/base.py", line 67, in wsgi
webserver_1 | self.callable = self.load()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
webserver_1 | return self.load_wsgiapp()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
webserver_1 | return util.import_app(self.app_uri)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/gunicorn/util.py", line 362, in import_app
webserver_1 | app = eval(obj, vars(mod))
webserver_1 | File "<string>", line 1, in <module>
webserver_1 | File "/usr/local/lib/python3.6/site-packages/airflow/www_rbac/app.py", line 222, in cached_app
webserver_1 | app, _ = create_app(config, session, testing)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/airflow/www_rbac/app.py", line 182, in create_app
webserver_1 | security_manager.sync_roles()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/airflow/www_rbac/security.py", line 460, in sync_roles
webserver_1 | self.clean_perms()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/airflow/www_rbac/security.py", line 316, in clean_perms
webserver_1 | deleted_count = pvms.delete()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3440, in delete
webserver_1 | delete_op.exec_()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1635, in exec_
webserver_1 | self._do_pre_synchronize()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1726, in _do_pre_synchronize
webserver_1 | ), obj in query.session.identity_map.items()
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1727, in <listcomp>
webserver_1 | if issubclass(cls, target_cls) and eval_condition(obj)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/evaluator.py", line 114, in evaluate
webserver_1 | value = sub_evaluate(obj)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/evaluator.py", line 148, in evaluate
webserver_1 | return eval_left(obj) == eval_right(obj)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/evaluator.py", line 105, in <lambda>
webserver_1 | return lambda obj: get_corresponding_attr(obj)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/attributes.py", line 275, in __get__
webserver_1 | return self.impl.get(instance_state(instance), dict_)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/attributes.py", line 669, in get
webserver_1 | value = state._load_expired(state, passive)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/state.py", line 632, in _load_expired
webserver_1 | self.manager.deferred_scalar_loader(self, toload)
webserver_1 | File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 985, in load_scalar_attributes
webserver_1 | raise orm_exc.ObjectDeletedError(state)
webserver_1 | sqlalchemy.orm.exc.ObjectDeletedError: Instance '<PermissionView at 0x7fc66822d550>' has been deleted, or its row is otherwise not present.
webserver_1 | [2019-05-08 16:07:07 +0000] [54] [INFO] Worker exiting (pid: 54)
webserver_1 | [2019-05-08 16:07:14,775] {security.py:307} INFO - Cleaning faulty perms
webserver_1 | [2019-05-08 16:07:14 +0000] [53] [INFO] Worker exiting (pid: 53)
webserver_1 | [2019-05-08 16:07:16,867] {security.py:307} INFO - Cleaning faulty perms
webserver_1 | [2019-05-08 16:07:17 +0000] [51] [INFO] Worker exiting (pid: 51)
webserver_1 | [2019-05-08 16:07:18,178] {security.py:307} INFO - Cleaning faulty perms
webserver_1 | [2019-05-08 16:07:18 +0000] [52] [INFO] Worker exiting (pid: 52)
webserver_1 | [2019-05-08 16:07:18,943] {cli.py:826} ERROR - [2 / 2] some workers seem to have died and gunicorndid not restart them as expected
webserver_1 | [2019-05-08 16:07:19 +0000] [44] [INFO] Shutting down: Master
webserver_1 | [2019-05-08 16:07:19 +0000] [44] [INFO] Reason: Worker failed to boot.
{code}
This only happen sometimes, and I've seen it happen while running locally via docker-compose and when I deploy onto k8s. It looks like maybe there's a race condition going on here: looking at the code for clean_perms
{code:java}
def clean_perms(self):
    """
    FAB leaves faulty permissions that need to be cleaned up
    """
    self.log.info('Cleaning faulty perms')
    sesh = self.get_session
    pvms = (
        sesh.query(sqla_models.PermissionView)
        .filter(or_(
            sqla_models.PermissionView.permission == None,  # NOQA
            sqla_models.PermissionView.view_menu == None,  # NOQA
        ))
    )
    deleted_count = pvms.delete()
    sesh.commit()
    if deleted_count:
        self.log.info('Deleted %s faulty permissions', deleted_count)
{code}
I wonder if all the workers are running if one could beat the other worker to pvms.delete()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)