You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/14 07:06:00 UTC

[GitHub] [airflow] doowhtron opened a new issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

doowhtron opened a new issue #13668:
URL: https://github.com/apache/airflow/issues/13668


   **Apache Airflow version**: 2.0.0
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: tencent cloud
   - **OS** (e.g. from /etc/os-release): centos7
   - **Kernel** (e.g. `uname -a`): 3.10
   - **Install tools**:
   - **Others**: mysql8
   
   **What happened**:
   
   Scheduler dies when I try to restart it. And the logs are as follows:
   ```
   {2021-01-14 13:29:05,424} {{scheduler_job.py:1293}} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
       self.dialect.do_execute(
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 609, in do_execute
       cursor.execute(statement, parameters)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/cursors.py", line 209, in execute
       res = self._query(query)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/cursors.py", line 315, in _query
       db.query(q)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/connections.py", line 239, in query
       _mysql.connection.query(self, query)
   MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
       self._run_scheduler_loop()
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1349, in _run_scheduler_loop
       self.adopt_or_reset_orphaned_tasks()
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/utils/session.py", line 65, in wrapper
       return func(*args, session=session, **kwargs)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1758, in adopt_or_reset_orphaned_tasks
       session.query(SchedulerJob)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 4063, in update
       update_op.exec_()
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1697, in exec_
       self._do_exec()
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1895, in _do_exec
       self._execute_stmt(update_stmt)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1702, in _execute_stmt
       self.result = self.query._execute_crud(stmt, self.mapper)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3568, in _execute_crud
       return conn.execute(stmt, self._params)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
       return meth(self, multiparams, params)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
       ret = self._execute_context(
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
       self._handle_dbapi_exception(
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
       util.raise_(
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
       self.dialect.do_execute(
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 609, in do_execute
       cursor.execute(statement, parameters)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/cursors.py", line 209, in execute
       res = self._query(query)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/cursors.py", line 315, in _query
       db.query(q)
     File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/connections.py", line 239, in query
       _mysql.connection.query(self, query)
   sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
   [SQL: UPDATE job SET state=%s WHERE job.state = %s AND job.latest_heartbeat < %s]
   [parameters: ('failed', 'running', datetime.datetime(2021, 1, 14, 5, 28, 35, 157941))]
   (Background on this error at: http://sqlalche.me/e/13/e3q8)
   {2021-01-14 13:29:06,435} {{process_utils.py:95}} INFO - Sending Signals.SIGTERM to GPID 6293
   {2021-01-14 13:29:06,677} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6318, status='terminated') (6318) terminated with exit code None
   {2021-01-14 13:29:06,767} {{process_utils.py:201}} INFO - Waiting up to 5 seconds for processes to exit...
   {2021-01-14 13:29:06,850} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6320, status='terminated') (6320) terminated with exit code None
   {2021-01-14 13:29:06,850} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6319, status='terminated') (6319) terminated with exit code None
   {2021-01-14 13:29:06,858} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6321, status='terminated') (6321) terminated with exit code None
   {2021-01-14 13:29:06,864} {{process_utils.py:201}} INFO - Waiting up to 5 seconds for processes to exit...
   {2021-01-14 13:29:06,876} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6293, status='terminated') (6293) terminated with exit code 0
   {2021-01-14 13:29:06,876} {{scheduler_job.py:1296}} INFO - Exited execute loop
   ```
   
   **What you expected to happen**:
   
   Schdeduler should not die.
   
   **How to reproduce it**:
   
   I don't know how to reproduce it
   
   **Anything else we need to know**:
   
   I just upgrade airflow from 1.10.14. Now I try to fix it temporarily by catching the exception in scheduler_job.py
   
   ```python
   for dag_run in dag_runs:
       try:
           self._schedule_dag_run(dag_run, active_runs_by_dag_id.get(dag_run.dag_id, set()), session)
       except Exception as e:
           self.log.exception(e)
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13668:
URL: https://github.com/apache/airflow/issues/13668#issuecomment-763196141


   @doowhtron Can you please let us know how to reproduce this? Does this happen consistently for you or randomly


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13668:
URL: https://github.com/apache/airflow/issues/13668#issuecomment-760515498


   Is it MySql or Mariadb version of MySql


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil edited a comment on issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

Posted by GitBox <gi...@apache.org>.
kaxil edited a comment on issue #13668:
URL: https://github.com/apache/airflow/issues/13668#issuecomment-763196141


   @doowhtron Can you please let us know how to reproduce this? Does this happen consistently for you or randomly
   
   Same question as https://github.com/apache/airflow/issues/13667#issuecomment-763199559


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] doowhtron commented on issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

Posted by GitBox <gi...@apache.org>.
doowhtron commented on issue #13668:
URL: https://github.com/apache/airflow/issues/13668#issuecomment-760756570


   > Is it MySql or Mariadb version of MySql
   
   mysql


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil closed issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #13668:
URL: https://github.com/apache/airflow/issues/13668


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] doowhtron commented on issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

Posted by GitBox <gi...@apache.org>.
doowhtron commented on issue #13668:
URL: https://github.com/apache/airflow/issues/13668#issuecomment-759987099


   > Which MySQL version do you have? @doowhtron ?
   
   Server version: 8.0.22 MySQL Community Server - GPL
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #13668: scheduler dies with "MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')"

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #13668:
URL: https://github.com/apache/airflow/issues/13668#issuecomment-759983887


   Which MySQL version do you have? @doowhtron ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org