Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/09 02:36:13 UTC

[GitHub] [airflow] bhavaniravi opened a new issue, #24339: airflow db cleanup - psycopg2.errors.InFailedSqlTransaction celery_taskmeta

bhavaniravi opened a new issue, #24339:
URL: https://github.com/apache/airflow/issues/24339

   ### Apache Airflow version
   
   2.3.0
   
   ### What happened
   
   Running `airflow db clean` raises the exception below. Part of the fix was released in #23698, but it does not roll back the transaction, so the queries against the next tables fail.
   
   
   ```
   Traceback (most recent call last):
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
       self.dialect.do_execute(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
       cursor.execute(statement, parameters)
   psycopg2.errors.InFailedSqlTransaction: current transaction is aborted, commands ignored until end of transaction block
   
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File "/usr/local/bin/airflow", line 8, in <module>
       sys.exit(main())
     File "/usr/local/lib/python3.9/site-packages/airflow/__main__.py", line 38, in main
       args.func(args)
     File "/usr/local/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 51, in command
       return func(*args, **kwargs)
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/cli.py", line 99, in wrapper
       return f(*args, **kwargs)
     File "/usr/local/lib/python3.9/site-packages/airflow/cli/commands/db_command.py", line 195, in cleanup_tables
       run_cleanup(
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/session.py", line 71, in wrapper
       return func(*args, session=session, **kwargs)
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/db_cleanup.py", line 311, in run_cleanup
       _cleanup_table(
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/db_cleanup.py", line 228, in _cleanup_table
       _print_entities(query=query, print_rows=False)
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/db_cleanup.py", line 137, in _print_entities
       num_entities = query.count()
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3062, in count
       return self._from_self(col).enable_eagerloads(False).scalar()
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2803, in scalar
       ret = self.one()
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2780, in one
       return self._iter().one()
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2818, in _iter
       result = self.session.execute(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1670, in execute
       result = conn._execute_20(statement, params or {}, execution_options)
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1520, in _execute_20
       return meth(self, args_10style, kwargs_10style, execution_options)
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 313, in _execute_on_connection
       return connection._execute_clauseelement(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1389, in _execute_clauseelement
       ret = self._execute_context(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1748, in _execute_context
       self._handle_dbapi_exception(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1929, in _handle_dbapi_exception
       util.raise_(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
       raise exception
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
       self.dialect.do_execute(
     File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
       cursor.execute(statement, parameters)
   sqlalchemy.exc.InternalError: (psycopg2.errors.InFailedSqlTransaction) current transaction is aborted, commands ignored until end of transaction block
   ```
   
   ### What you think should happen instead
   
   The transaction for the current table should be rolled back, and the cleanup should proceed to the other tables.
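
A minimal sketch of the expected control flow, using the standard-library `sqlite3` module rather than Airflow's SQLAlchemy session. SQLite does not enter PostgreSQL's "transaction is aborted" state, so this only illustrates the skip-and-continue loop, not the original failure. The function name `count_rows_per_table` and the table names `dag_run` and `log` are assumptions chosen for illustration; `celery_taskmeta` is the missing table named in the issue title.

```python
import sqlite3


def count_rows_per_table(conn, tables):
    """Count rows in each table, skipping any table whose query fails."""
    counts = {}
    for table in tables:
        try:
            # Table names come from a fixed list, so inline quoting is fine for a sketch.
            cur = conn.execute(f'SELECT COUNT(*) FROM "{table}"')
            counts[table] = cur.fetchone()[0]
        except sqlite3.OperationalError as exc:
            # Reset the transaction so this failure cannot affect the next table.
            conn.rollback()
            print(f"Skipping {table!r}: {exc}")
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (id INTEGER)")
conn.execute("CREATE TABLE log (id INTEGER)")
conn.execute("INSERT INTO log VALUES (1), (2)")
conn.commit()

# "celery_taskmeta" does not exist, yet "log" is still processed afterwards.
counts = count_rows_per_table(conn, ["dag_run", "celery_taskmeta", "log"])
```

The point is only that the rollback happens per table, inside the loop, so one failing table does not stop the rest of the run.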
   
   ### How to reproduce
   
   `airflow db clean -v --clean-before-timestamp 2022-06-06 -y`
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   As a fix, I'm thinking of adding a `session` parameter to `_warn_if_missing` and rolling back when the error is caught. How does that look?
   
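   ```
   from contextlib import AbstractContextManager

   from sqlalchemy.exc import OperationalError, ProgrammingError


   class _warn_if_missing(AbstractContextManager):
       def __init__(self, table, suppress, session=None):
           ...
           self.session = session
   
       def __exit__(self, exctype, excinst, exctb):
           caught_error = exctype is not None and issubclass(exctype, (OperationalError, ProgrammingError))
           if caught_error:
               logger.warning("Table %r not found.  Skipping.", self.table)
               # Roll back so the aborted transaction does not break queries on later tables.
               # `session` defaults to None, so guard before calling rollback().
               if self.session is not None:
                   self.session.rollback()
           return caught_error
   ```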
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal closed issue #24339: airflow db cleanup - psycopg2.errors.InFailedSqlTransaction celery_taskmeta

Posted by GitBox <gi...@apache.org>.
eladkal closed issue #24339: airflow db cleanup - psycopg2.errors.InFailedSqlTransaction celery_taskmeta
URL: https://github.com/apache/airflow/issues/24339




[GitHub] [airflow] dstandish commented on issue #24339: airflow db cleanup - psycopg2.errors.InFailedSqlTransaction celery_taskmeta

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #24339:
URL: https://github.com/apache/airflow/issues/24339#issuecomment-1151256010

   This gets improved in https://github.com/apache/airflow/pull/23574  ([here](https://github.com/apache/airflow/pull/23574/files#diff-01a5acd2cb6573d557ab03270ea483704cd8e0139f2aed98e71c3aebed43f665R357-R362)).
   
   In that PR I check for table existence and warn if any table is missing.  Then, if there's an actual error, I suppress it but log the traceback at DEBUG level.
   
   It's more precise this way.  Previously, some unrelated errors could be misreported as missing-table errors.
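
A rough sketch of that two-step approach, written against the standard-library `sqlite3` module instead of Airflow's actual code, so none of the names below are taken from the PR. `table_exists`, `cleanup_tables`, `delete_before`, and the `dttm` column are hypothetical; the existence check queries SQLite's `sqlite_master` catalog, which is an SQLite-specific stand-in for whatever inspection the PR uses.

```python
import logging
import sqlite3

logger = logging.getLogger("db_cleanup_sketch")


def table_exists(conn, table):
    """Return True if the table is listed in SQLite's catalog."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row is not None


def cleanup_tables(conn, tables, cleanup_one):
    """Run cleanup_one(conn, table) per table; return (cleaned, missing, failed)."""
    cleaned, missing, failed = [], [], []
    for table in tables:
        # Step 1: an explicit existence check, so "missing" is never guessed from an error.
        if not table_exists(conn, table):
            logger.warning("Table %r not found.  Skipping.", table)
            missing.append(table)
            continue
        # Step 2: any real error is suppressed, with the traceback logged at DEBUG.
        try:
            cleanup_one(conn, table)
            conn.commit()
            cleaned.append(table)
        except sqlite3.Error:
            conn.rollback()
            logger.debug("Cleanup of %r failed", table, exc_info=True)
            failed.append(table)
    return cleaned, missing, failed


def delete_before(conn, table):
    # Hypothetical cleanup step: delete rows older than a cutoff date.
    conn.execute(f'DELETE FROM "{table}" WHERE dttm < ?', ("2022-06-06",))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER, dttm TEXT)")
conn.execute("CREATE TABLE job (id INTEGER)")  # no dttm column, so cleanup fails
conn.execute("INSERT INTO log VALUES (1, '2022-06-01'), (2, '2022-06-08')")
conn.commit()

cleaned, missing, failed = cleanup_tables(
    conn, ["log", "celery_taskmeta", "job"], delete_before
)
remaining = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
```

Here the missing table and the table with a genuine error end up in separate lists, which is the distinction the earlier approach could blur.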
   




[GitHub] [airflow] uranusjr commented on issue #24339: airflow db cleanup - psycopg2.errors.InFailedSqlTransaction celery_taskmeta

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #24339:
URL: https://github.com/apache/airflow/issues/24339#issuecomment-1150618252

   I think we need to check [`session.is_active`](https://docs.sqlalchemy.org/en/14/orm/session_api.html#sqlalchemy.orm.Session.is_active), but otherwise the fix makes sense.
   
   Actually, the context manager can be simplified to
   
   ```python
   @contextlib.contextmanager
   def _warn_if_missing(table, session):
       try:
           yield
       except (OperationalError, ProgrammingError):
           logger.warning("Table %r not found.  Skipping.", table)
           if session.is_active:
               session.rollback()
           raise
   ```
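
To see why the rollback matters, here is a toy sketch that needs neither PostgreSQL nor SQLAlchemy. `FakeSession`, `TableMissingError`, `AbortedTransactionError`, `warn_if_missing`, and `count_all` are all hypothetical stand-ins, and the `is_active` property is simplified to always return `True`; the real `Session.is_active` has more nuanced semantics. Unlike the snippet above, this variant does not re-raise, so the loop can continue to the next table.

```python
import contextlib
import logging

logger = logging.getLogger("db_cleanup_sketch")


class TableMissingError(Exception):
    """Stand-in for sqlalchemy.exc.ProgrammingError (hypothetical)."""


class AbortedTransactionError(Exception):
    """Stand-in for psycopg2.errors.InFailedSqlTransaction (hypothetical)."""


class FakeSession:
    """Toy session imitating PostgreSQL's aborted-transaction state."""

    def __init__(self, tables):
        self.tables = dict(tables)  # table name -> row count
        self.aborted = False

    @property
    def is_active(self):
        # Simplification: always report a transaction that can be rolled back.
        return True

    def count(self, table):
        if self.aborted:
            raise AbortedTransactionError("current transaction is aborted")
        if table not in self.tables:
            self.aborted = True  # every later command fails until rollback()
            raise TableMissingError(table)
        return self.tables[table]

    def rollback(self):
        self.aborted = False


@contextlib.contextmanager
def warn_if_missing(table, session, rollback=True):
    try:
        yield
    except TableMissingError:
        logger.warning("Table %r not found.  Skipping.", table)
        if rollback and session.is_active:
            session.rollback()
        # No re-raise here: the error is swallowed so the caller can move on.


def count_all(session, tables, rollback):
    counts = {}
    for table in tables:
        with warn_if_missing(table, session, rollback=rollback):
            counts[table] = session.count(table)
    return counts


tables = ["celery_taskmeta", "log"]
with_rollback = count_all(FakeSession({"log": 2}), tables, rollback=True)
```

Calling `count_all(..., rollback=False)` on the same input raises `AbortedTransactionError` on the second table, which mirrors the traceback in the original report.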




[GitHub] [airflow] eladkal commented on issue #24339: airflow db cleanup - psycopg2.errors.InFailedSqlTransaction celery_taskmeta

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #24339:
URL: https://github.com/apache/airflow/issues/24339#issuecomment-1162874857

   According to https://github.com/apache/airflow/pull/24340#issuecomment-1162799303, PR https://github.com/apache/airflow/pull/23574 solved the issue.

