You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/02/02 02:26:08 UTC

[GitHub] [airflow] kaxil opened a new pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

kaxil opened a new pull request #14016:
URL: https://github.com/apache/airflow/pull/14016


   closes https://github.com/apache/airflow/issues/13811
   
   Previously, because we use `create_session` context manager with or
   without `@provide_session`: we always used to commit the session if
   the session is not explicitly passed to a function example:
   
   ```
   dag_run = task_instance.get_dagrun()
     File "/usr/local/lib/python3.7/site-packages/airflow/utils/session.py", line 65, in wrapper
       return func(*args, session=session, **kwargs)
     File "/usr/local/lib/python3.7/contextlib.py", line 119, in __exit__
       next(self.gen)
     File "/usr/local/lib/python3.7/site-packages/airflow/utils/session.py", line 32, in create_session
       session.commit()
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
       self.transaction.commit()
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 504, in commit
       self._prepare_impl()
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 472, in _prepare_impl
       self.session.dispatch.before_commit(self.session)
     File "/usr/local/lib/python3.7/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
       fn(*args, **kw)
     File "/usr/local/lib/python3.7/site-packages/airflow/utils/sqlalchemy.py", line 217, in _validate_commit
       raise RuntimeError("UNEXPECTED COMMIT - THIS WILL BREAK HA LOCKS!")
   ```
   
   The same happens when using `Variable.get()` because of
   https://github.com/apache/airflow/blob/594069ee061e9839b2b12aa43aa3a23e05beed86/airflow/secrets/metastore.py#L55-L70
   
   This commit makes sure that we only commit the session if we add a new object or modify an existing object in the session
   by using `session._is_clean`:
   
   https://github.com/sqlalchemy/sqlalchemy/blob/25ee5a05df0daeb7dc7ba432172d6abc76ffab56/lib/sqlalchemy/orm/session.py#L3236-L3241
   
   ```python
       def _is_clean(self):
           return (
               not self.identity_map.check_modified()
               and not self._deleted
               and not self._new
           )
   
   ```
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #14016:
URL: https://github.com/apache/airflow/pull/14016#discussion_r568280948



##########
File path: airflow/utils/session.py
##########
@@ -29,7 +29,9 @@ def create_session():
     session = settings.Session()
     try:
         yield session
-        session.commit()
+        # Only Commit is a new or a modified object exists in the session
+        if not session._is_clean():  # pylint: disable=protected-access

Review comment:
       Both the file changes are not strictly needed. Either of the two file changes are sufficient.
   
   In one cases we commit an empty session and in other case we don't commit a session if it is empty. 

##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -214,6 +214,9 @@ def _validate_commit(self, _):
         if self.expected_commit:
             self.expected_commit = False
             return
+        # Don't raise error if session is clean (useful when just querying from DB)
+        if self.session._is_clean():  # pylint: disable=protected-access

Review comment:
       Yeah, I tried with `session.execute("update dag set is_active=False where dag_id='example_branch_operator'")` where `session._is_clean()` is `True` -- so this logic does not work -- going to close the PR




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Dr-Denzy commented on a change in pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
Dr-Denzy commented on a change in pull request #14016:
URL: https://github.com/apache/airflow/pull/14016#discussion_r568446336



##########
File path: airflow/utils/session.py
##########
@@ -29,7 +29,9 @@ def create_session():
     session = settings.Session()
     try:
         yield session
-        session.commit()
+        # Only Commit is a new or a modified object exists in the session

Review comment:
       I believe the intended comment on this line is: 
   `# Only Commit if a new or a modified object exists in the session`
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #14016:
URL: https://github.com/apache/airflow/pull/14016#discussion_r568280948



##########
File path: airflow/utils/session.py
##########
@@ -29,7 +29,9 @@ def create_session():
     session = settings.Session()
     try:
         yield session
-        session.commit()
+        # Only Commit is a new or a modified object exists in the session
+        if not session._is_clean():  # pylint: disable=protected-access

Review comment:
       Both the file changes are not strictly needed. Either of the two file changes are sufficient.
   
   In one cases we commit an empty session and in other case we don't commit a session if it is empty. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil closed pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
kaxil closed pull request #14016:
URL: https://github.com/apache/airflow/pull/14016


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Dr-Denzy commented on a change in pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
Dr-Denzy commented on a change in pull request #14016:
URL: https://github.com/apache/airflow/pull/14016#discussion_r568446336



##########
File path: airflow/utils/session.py
##########
@@ -29,7 +29,9 @@ def create_session():
     session = settings.Session()
     try:
         yield session
-        session.commit()
+        # Only Commit is a new or a modified object exists in the session

Review comment:
       I believe the intended comment on this line is: 
   `# Only Commit if a new or a modified object exists in the session`
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil closed pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
kaxil closed pull request #14016:
URL: https://github.com/apache/airflow/pull/14016


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #14016:
URL: https://github.com/apache/airflow/pull/14016#discussion_r568443705



##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -229,6 +232,8 @@ def commit(self):
 
         This is the required way to commit when the guard is in scope
         """
+        if not self.session._is_clean():  # pylint: disable=protected-access

Review comment:
       Not sure about this either

##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -214,6 +214,9 @@ def _validate_commit(self, _):
         if self.expected_commit:
             self.expected_commit = False
             return
+        # Don't raise error if session is clean (useful when just querying from DB)
+        if self.session._is_clean():  # pylint: disable=protected-access

Review comment:
       This can't be here - even if the session is clean, if we allow a COMMIT the lock will be released




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #14016:
URL: https://github.com/apache/airflow/pull/14016#discussion_r568599400



##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -214,6 +214,9 @@ def _validate_commit(self, _):
         if self.expected_commit:
             self.expected_commit = False
             return
+        # Don't raise error if session is clean (useful when just querying from DB)
+        if self.session._is_clean():  # pylint: disable=protected-access

Review comment:
       Yeah, I tried with `session.execute("update dag set is_active=False where dag_id='example_branch_operator'")` where `session._is_clean()` is `True` -- so this logic does not work -- going to close the PR




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #14016: Allow retrieving Variables, Connections, XCom without breaking HA Locks

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #14016:
URL: https://github.com/apache/airflow/pull/14016#discussion_r568443705



##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -229,6 +232,8 @@ def commit(self):
 
         This is the required way to commit when the guard is in scope
         """
+        if not self.session._is_clean():  # pylint: disable=protected-access

Review comment:
       Not sure about this either

##########
File path: airflow/utils/sqlalchemy.py
##########
@@ -214,6 +214,9 @@ def _validate_commit(self, _):
         if self.expected_commit:
             self.expected_commit = False
             return
+        # Don't raise error if session is clean (useful when just querying from DB)
+        if self.session._is_clean():  # pylint: disable=protected-access

Review comment:
       This can't be here - even if the session is clean, if we allow a COMMIT the lock will be released




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org