You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/09/24 21:40:28 UTC

[GitHub] [airflow] jbarnettfreejazz opened a new issue #18512: airflow deadlock trying to update rendered_task_instance_fields table (mysql)

jbarnettfreejazz opened a new issue #18512:
URL: https://github.com/apache/airflow/issues/18512


   ### Apache Airflow version
   
   2.1.4 (latest released)
   
   ### Operating System
   
   linux
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   We have been unable to reproduce this in our testing. Our dags will utilize the S3KeySensor task. Sometimes we will have up to 100 dag_runs (running the same DAG) waiting for the S3KeySensor to poke the expected S3 documents.
   
   We are regularly seeing this deadlock with mysql:
   
   ```
   Exception: (MySQLdb._exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
   [SQL: DELETE FROM rendered_task_instance_fields WHERE rendered_task_instance_fields.dag_id = %s AND rendered_task_instance_fields.task_id = %s AND (rendered_task_instance_fields.dag_id, rendered_task_instance_fields.task_id, rendered_task_instance_fields.execution_date) NOT IN (SELECT subq1.dag_id, subq1.task_id, subq1.execution_date 
   FROM (SELECT rendered_task_instance_fields.dag_id AS dag_id, rendered_task_instance_fields.task_id AS task_id, rendered_task_instance_fields.execution_date AS execution_date 
   FROM rendered_task_instance_fields 
   WHERE rendered_task_instance_fields.dag_id = %s AND rendered_task_instance_fields.task_id = %s ORDER BY rendered_task_instance_fields.execution_date DESC 
    LIMIT %s) AS subq1)]
   [parameters: ('refresh_hub', 'scorecard_wait', 'refresh_hub', 'scorecard_wait', 1)] Exception trying create activation error: 400:
   ```
   
   ### What you expected to happen
   
   _No response_
   
   ### How to reproduce
   
   _No response_
   
   ### Anything else
   
   Sometimes we will wait multiple days for our S3 documents to appear, This deadlock is occurring for 30%-40% of the dag_runs for an individual dags.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jbarnettfreejazz commented on issue #18512: airflow deadlock trying to update rendered_task_instance_fields table (mysql)

Posted by GitBox <gi...@apache.org>.
jbarnettfreejazz commented on issue #18512:
URL: https://github.com/apache/airflow/issues/18512#issuecomment-928004168


   Hi -- thanks for jumping on this -- This seems to primarily occur in the DELETE case. We're running MySql version 8.0.23
   
   Here's stack trace info 
   
   [2021-09-27 08:02:02,953] {taskinstance.py:1463} ERROR - Task failed with exception
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
       self.dialect.do_execute(
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
       cursor.execute(statement, parameters)
     File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 206, in execute
       res = self._query(query)
     File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 319, in _query
       db.query(q)
     File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 259, in query
       _mysql.connection.query(self, query)
   MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1165, in _run_raw_task
       self._prepare_and_execute_task_with_callbacks(context, task)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1248, in _prepare_and_execute_task_with_callbacks
       RenderedTaskInstanceFields.delete_old_records(self.task_id, self.dag_id)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 70, in wrapper
       return func(*args, session=session, **kwargs)
     File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/renderedtifields.py", line 173, in delete_old_records
       session.query(cls).filter(
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3926, in delete
       delete_op.exec_()
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1697, in exec_
       self._do_exec()
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1930, in _do_exec
       self._execute_stmt(delete_stmt)
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1702, in _execute_stmt
       self.result = self.query._execute_crud(stmt, self.mapper)
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3568, in _execute_crud
       return conn.execute(stmt, self._params)
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
       return meth(self, multiparams, params)
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
       ret = self._execute_context(
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
       self._handle_dbapi_exception(
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
       util.raise_(
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
       raise exception
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
       self.dialect.do_execute(
     File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
       cursor.execute(statement, parameters)
     File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 206, in execute
       res = self._query(query)
     File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 319, in _query
       db.query(q)
     File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 259, in query
       _mysql.connection.query(self, query)
   sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
   [SQL: DELETE FROM rendered_task_instance_fields WHERE rendered_task_instance_fields.dag_id = %s AND rendered_task_instance_fields.task_id = %s AND (rendered_task_instance_fields.dag_id, rendered_task_instance_fields.task_id, rendered_task_instance_fields.execution_date) NOT IN (SELECT subq1.dag_id, subq1.task_id, subq1.execution_date 
   FROM (SELECT rendered_task_instance_fields.dag_id AS dag_id, rendered_task_instance_fields.task_id AS task_id, rendered_task_instance_fields.execution_date AS execution_date 
   FROM rendered_task_instance_fields 
   WHERE rendered_task_instance_fields.dag_id = %s AND rendered_task_instance_fields.task_id = %s ORDER BY rendered_task_instance_fields.execution_date DESC 
    LIMIT %s) AS subq1)]
   [parameters: ('refresh_hub', 'scorecard_wait', 'refresh_hub', 'scorecard_wait', 1)]
   (Background on this error at: http://sqlalche.me/e/13/e3q8)
   [2021-09-27 08:02:02,955] {taskinstance.py:1506} INFO - Marking task as FAILED. dag_id=refresh_hub, task_id=scorecard_wait, execution_date=20210921T153200, start_date=20210927T080202, end_date=20210927T080202
   [2021-09-27 08:02:03,024] {local_task_job.py:151} INFO - Task exited with return code 1
   [2021-09-27 08:02:03,064] {common.py:230} INFO - Exception JSON: (MySQLdb._exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
   [SQL: DELETE FROM rendered_task_instance_fields WHERE rendered_task_instance_fields.dag_id = %s AND rendered_task_instance_fields.task_id = %s AND (rendered_task_instance_fields.dag_id, rendered_task_instance_fields.task_id, rendered_task_instance_fields.execution_date) NOT IN (SELECT subq1.dag_id, subq1.task_id, subq1.execution_date 
   FROM (SELECT rendered_task_instance_fields.dag_id AS dag_id, rendered_task_instance_fields.task_id AS task_id, rendered_task_instance_fields.execution_date AS execution_date 
   FROM rendered_task_instance_fields 
   WHERE rendered_task_instance_fields.dag_id = %s AND rendered_task_instance_fields.task_id = %s ORDER BY rendered_task_instance_fields.execution_date DESC 
    LIMIT %s) AS subq1)]
   [parameters: ('refresh_hub', 'scorecard_wait', 'refresh_hub', 'scorecard_wait', 1)]


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #18512: airflow deadlock trying to update rendered_task_instance_fields table (mysql)

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #18512:
URL: https://github.com/apache/airflow/issues/18512#issuecomment-927107655


   Is it always the same DELETE FROM ? Could you please provide a bit more detailed stack trace from the logs - including line numbers etc. ? Also can you add more information about your MySQL deployment - which version of MySQL you have and what is your configuration of MySQL backend if possible (I am especially interested in any configuratoin that is different than default) ? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil closed issue #18512: airflow deadlock trying to update rendered_task_instance_fields table (mysql)

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #18512:
URL: https://github.com/apache/airflow/issues/18512


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #18512: airflow deadlock trying to update rendered_task_instance_fields table (mysql)

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #18512:
URL: https://github.com/apache/airflow/issues/18512#issuecomment-926934263


   Thanks for opening your first issue here! Be sure to follow the issue template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org