You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/31 15:53:46 UTC

[GitHub] [airflow] FanatoniQ opened a new pull request, #25430: fixed fetch_all_handler

FanatoniQ opened a new pull request, #25430:
URL: https://github.com/apache/airflow/pull/25430

   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of an existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   `fetch_all_handler` mentioned in issues #25388 and possibly linked to #25412
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r937765188


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   What I particularly do not like about rowcount is this noe about `-1`:
   
   
   ```
   The attribute is -1 in case no [.execute*()](https://peps.python.org/pep-0249/#id14) 
   has been performed on the cursor or the rowcount of the last operation 
   is cannot be determined by the interface. [[7]](https://peps.python.org/pep-0249/#id46)
   
   Note
   Future versions of the DB API specification could redefine the latter case to have the object return None instead of -1.
   ```
   
   I think just having the note indicate that we should avoid it, and there is absolutely no more guarantees rowcount gives us than description:
   
   ```
   This attribute will be None for operations that do not return rows
   or if the cursor has not had an operation
   invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."
   ```
   
   This is the same, only less ambiguous IMHO.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on PR #25430:
URL: https://github.com/apache/airflow/pull/25430#issuecomment-1206190068

   @potiuk @uranusjr @kazanzhy I pushed force so that the cursor values in the test are not missleading:
   https://github.com/apache/airflow/compare/1ad0e8b4180777b2e24b6b0f75fee9a901251a72..fb1c513a454620b3e336edefaed20afa1184bd6b


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r937751711


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   ```
   about cursor.description https://peps.python.org/pep-0249/#description "This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."
   ```
   
   This is indeed part of the standardm, so I do no see why we should not base the decision on that @uranusjr ? It's quite explicitly stated in the PEP that description is only present when there are some rows potentially to be returned (and it can be 0 rows as well).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934229151


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   @uranusjr Totally agree on the all standard-compliant part. However then the sqlalchemy's documentation is wrong since for `returns_rows` it only mentions `description`. Do you have an example where description is not None and no rows where returned ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r937765188


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   What I 'particularly` do not like about rowcount is this not about `-1`:
   
   
   ```
   The attribute is -1 in case no [.execute*()](https://peps.python.org/pep-0249/#id14) 
   has been performed on the cursor or the rowcount of the last operation 
   is cannot be determined by the interface. [[7]](https://peps.python.org/pep-0249/#id46)
   
   Note
   Future versions of the DB API specification could redefine the latter case to have the object return None instead of -1.
   ```
   
   I think just having the note indicate that we should avoid it, and there is absolutely no more guarantees rowcount gives us than description:
   
   ``
   This attribute will be None for operations that do not return rows
   or if the cursor has not had an operation
   invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."
   ```
   
   This is the same, only less ambiguous IMHO.



##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   What I 'particularly` do not like about rowcount is this not about `-1`:
   
   
   ```
   The attribute is -1 in case no [.execute*()](https://peps.python.org/pep-0249/#id14) 
   has been performed on the cursor or the rowcount of the last operation 
   is cannot be determined by the interface. [[7]](https://peps.python.org/pep-0249/#id46)
   
   Note
   Future versions of the DB API specification could redefine the latter case to have the object return None instead of -1.
   ```
   
   I think just having the note indicate that we should avoid it, and there is absolutely no more guarantees rowcount gives us than description:
   
   ```
   This attribute will be None for operations that do not return rows
   or if the cursor has not had an operation
   invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."
   ```
   
   This is the same, only less ambiguous IMHO.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] kazanzhy commented on pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on PR #25430:
URL: https://github.com/apache/airflow/pull/25430#issuecomment-1201785617

   First of all, I made one more error here.
   There is `cursor.execute()` almost everywhere except ExasolHook where `conn.execute()` is called.
   And only last one is returning `CursorResult`. for other cases there are different cursors for different databases.
   
   So if I correctly understand we have to determine how to figure out if `.fetchall()` could be called in the cursor.
   I see the next solutions:
   1. `try ... except` straight but can slow down the process.
   2. Use one of the cursor attributes (https://peps.python.org/pep-0249/#cursor-attributes):
   * `description is not None`
     ... will be None for operations that do not return rows or if the cursor has not had an operation invoked via the .execute*() method yet.
     We could guarantee that the handler will be called only after the `.execute` in DbApiHook. But in DbApiHook we're calling `.execute` many times in the same cursor.
   * `cursor.rowcount > 0`
     ... specifies the number of rows that the last .execute*() produced (for DQL statements like SELECT) or affected (for DML statements like UPDATE or INSERT).
     We have to fetch results so there is no guarantee that DML statements will return some results
   * `rownumber is not None`
     ... should provide the current 0-based index of the cursor in the result set or None if the index cannot be determined.
     The index can be seen as the index of the cursor in a sequence (the result set). The next fetch operation will fetch the row indexed by .rownumber in that sequence.
     It probably couldn't be used
   
   Here's the implementation for Postgres and seems we could use description.
   https://github.com/psycopg/psycopg2/search?q=notuples
   
   
   And here are some experiments:
   ```
   from sqlalchemy import create_engine
   engine = create_engine('postgresql://reader:NWDMCE5xdipIjRrp@hh-pgsql-public.ebi.ac.uk:5432/pfmegrnargs')
   connection = engine.connect().execution_options(autocommit=True).connection
   
   cursor = connection.cursor()
   cursor.execute('SELECT 1;') 
   print(cursor.description) # (Column(name='?column?', type_code=23),)
   print(cursor.rowcount) # 1
   print(cursor.rownumber) # 0
   cursor.fetchall() # [(1,)]
   
   cursor = connection.cursor()
   query = """
   CREATE TEMP TABLE IF NOT EXISTS tmp (field TEXT);
   """
   cursor.execute(query) 
   print(cursor.description) # None
   print(cursor.rowcount) # -1
   print(cursor.rownumber) # 0
   cursor.fetchall() # psycopg2.ProgrammingError: no results to fetch
   
   cursor = connection.cursor()
   query = """
   CREATE TEMP TABLE IF NOT EXISTS tmp (field TEXT); INSERT INTO tmp (field) VALUES ('test');
   """
   cursor.execute(query) 
   print(cursor.description) # None
   print(cursor.rowcount) # -1
   print(cursor.rownumber) # 0
   cursor.fetchall() # psycopg2.ProgrammingError: no results to fetch
   
   cursor = connection.cursor()
   query = """
   CREATE TEMP TABLE IF NOT EXISTS tmp (field TEXT); INSERT INTO tmp (field) VALUES ('test') RETURNING *;
   """
   cursor.execute(query) 
   print(cursor.description) # (Column(name='field', type_code=25),)
   print(cursor.rowcount) # 1
   print(cursor.rownumber) # 0
   cursor.fetchall() # psycopg2.ProgrammingError: no results to fetch
   
   cursor = connection.cursor()
   query = """
   CREATE TEMP TABLE IF NOT EXISTS tmp (field TEXT); INSERT INTO tmp (field) VALUES ('test'); SELECT 1;
   """
   cursor.execute(query) 
   print(cursor.description) # (Column(name='?column?', type_code=23),)
   print(cursor.rowcount) # 1
   print(cursor.rownumber) # 0
   cursor.fetchall() # [(1,)]
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r937751711


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   ```
   about cursor.description https://peps.python.org/pep-0249/#description "This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."
   ```
   
   This is indeed part of the standardm, so I do no see why we should not base the decision on that @uranusjr ? It's quite explicitly stated in the PEP that description is only present when there are some rows potentially to be returend.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r937765188


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   What I particularly do not like about rowcount is this not about `-1`:
   
   
   ```
   The attribute is -1 in case no [.execute*()](https://peps.python.org/pep-0249/#id14) 
   has been performed on the cursor or the rowcount of the last operation 
   is cannot be determined by the interface. [[7]](https://peps.python.org/pep-0249/#id46)
   
   Note
   Future versions of the DB API specification could redefine the latter case to have the object return None instead of -1.
   ```
   
   I think just having the note indicate that we should avoid it, and there is absolutely no more guarantees rowcount gives us than description:
   
   ```
   This attribute will be None for operations that do not return rows
   or if the cursor has not had an operation
   invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."
   ```
   
   This is the same, only less ambiguous IMHO.



##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   ```
   about cursor.description https://peps.python.org/pep-0249/#description 
   "This attribute will be None for operations that do not return rows
   or if the cursor has not had an operation
   invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."
   ```
   
   This is indeed part of the standardm, so I do no see why we should not base the decision on that @uranusjr ? It's quite explicitly stated in the PEP that description is only present when there are some rows potentially to be returned (and it can be 0 rows as well).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #25430:
URL: https://github.com/apache/airflow/pull/25430#issuecomment-1201959797

   I think the problem with `try .. except` is less about the performance (the difference should be minimal compared to the actual SQL database connection time), but we don’t know what to catch in the first place :p Otherwise I’d use that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934229151


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   @uranusjr Totally agree on the all standard-compliant part. However then the sqlalchemy's documentation is wrong then since for returns_rows it only mentions description. Do you have an example where description is not None and no rows where returned ?



##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   @uranusjr I find sqlalchemy notes on rowcount: https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=returns_#sqlalchemy.engine.CursorResult.rowcount very interesting



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934229151


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   @uranusjr Totally agree on the all standard-compliant part. However then this would mean the sqlalchemy's documentation is wrong since for `returns_rows` it only mentions `description`. Do you have an example where description is not None and no rows where returned ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934213608


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   @uranusjr  This doesn't look true to me, I am using the following as reference:
   
   - https://peps.python.org/pep-0249/#description
   - https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=returns_#sqlalchemy.engine.CursorResult.returns_rows
   
   Also:
   
   ```python
   >>> import pymssql
   >>> c = pymssql.connect(host, login, password)
   >>> cur = c.cursor()
   >>> cur.execute("SELECT SUSER_SNAME();")
   >>> cur.rowcount
   -1
   >>> cur.description
   (('', 1, None, None, None, None, None),)
   >>> cur.execute("PRINT('1');")
   >>> cur.rowcount
   -1
   >>> cur.description
   ```
   
   Edit: I have the same behaviour with `sqlite3` and `jaydebeapi`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934285590


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   Quoting sqlalchemy:
   
   - about `CursorResult.return_rows` "Overall, the value of [CursorResult.returns_rows](https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=returns_#sqlalchemy.engine.CursorResult.returns_rows) should always be synonymous with whether or not the DBAPI cursor had a .description attribute, indicating the presence of result columns, noting that a cursor that returns zero rows still has a .description if a row-returning statement was emitted."
   - about `row_count` (aforementioned link) "Statements that use RETURNING may not return a correct rowcount."
   - about `row_count` (aforementioned link) "Contrary to what the Python DBAPI says, it does not return the number of rows available from the results of a SELECT statement as DBAPIs cannot support this functionality when rows are unbuffered."
   
   Quoting PEP-249:
   
   - about `cursor.rowcount` https://peps.python.org/pep-0249/#id48 "The term number of affected rows generally refers to the number of rows deleted, updated or inserted by the last statement run on the database cursor."
   - about `cursor.description` https://peps.python.org/pep-0249/#description "This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25430:
URL: https://github.com/apache/airflow/pull/25430#issuecomment-1206354417

   :tada: 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25430:
URL: https://github.com/apache/airflow/pull/25430#issuecomment-1206320674

   Yep. I definitely want to merge that one before the next provider's wave :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #25430:
URL: https://github.com/apache/airflow/pull/25430


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934095422


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   This doesn’t make sense. `description` only returns some information of the cursor and has nothing to do to whether the cursor returns data or not.
   
   According to PEP 249, whether a cursor returns information can be checked by
   
   ```python
   if cursor.rowcount is not None and cursor.rowcount >= 0
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934213608


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   @uranusjr  This doesn't look true to me, I am using the following as reference:
   
   - https://peps.python.org/pep-0249/#description
   - https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=returns_#sqlalchemy.engine.CursorResult.returns_rows
   
   Also:
   
   ```python
   >>> import pymssql
   >>> c = pymssql.connect(host, login, password)
   >>> cur = c.cursor()
   >>> cur.execute("SELECT SUSER_SNAME();")
   >>> cur.rowcount
   -1
   >>> cur.description
   (('', 1, None, None, None, None, None),)
   >>> cur.execute("PRINT('1');")
   >>> cur.rowcount
   -1
   >>> cur.description
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934285590


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   Quoting sqlalchemy (aforementioned link):
   
   - "Statements that use RETURNING may not return a correct rowcount."
   - "Contrary to what the Python DBAPI says, it does not return the number of rows available from the results of a SELECT statement as DBAPIs cannot support this functionality when rows are unbuffered."
   
   Quoting PEP-249:
   
   - about `cursor.rowcount` https://peps.python.org/pep-0249/#id48 "The term number of affected rows generally refers to the number of rows deleted, updated or inserted by the last statement run on the database cursor."
   - about `cursor.description` https://peps.python.org/pep-0249/#description "This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the [.execute*()](https://peps.python.org/pep-0249/#id14) method yet."



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934216768


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   I am not sure how pymssql does things, but according to PEP 249, `description` does not offer the same functionality as SQLAlchemy’s `return_rows`. If `rowcount` does not either, you need to find another way that actually has a backing standard. Since DbApiHook should work for _all_ standard-compliant databases, we can’t rely on individual database behaviours, but must refer to the standard.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934213608


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   This doesn't look true to me, I am using the following as reference:
   
   - https://peps.python.org/pep-0249/#description
   - https://docs.sqlalchemy.org/en/14/core/connections.html?highlight=returns_#sqlalchemy.engine.CursorResult.returns_rows
   
   Also:
   
   ```python
   >>> import pymssql
   >>> c = pymssql.connect(host, login, password)
   >>> cur = c.cursor()
   >>> cur.execute("SELECT SUSER_SNAME();")
   >>> cur.rowcount
   -1
   >>> cur.description
   (('', 1, None, None, None, None, None),)
   >>> cur.execute("PRINT('1');")
   >>> cur.rowcount
   -1
   >>> cur.description
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on a diff in pull request #25430: fixed common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on code in PR #25430:
URL: https://github.com/apache/airflow/pull/25430#discussion_r934285590


##########
airflow/providers/common/sql/hooks/sql.py:
##########
@@ -30,13 +30,10 @@
 from airflow.utils.module_loading import import_string
 from airflow.version import version
 
-if TYPE_CHECKING:
-    from sqlalchemy.engine import CursorResult
 
-
-def fetch_all_handler(cursor: 'CursorResult') -> Optional[List[Tuple]]:
+def fetch_all_handler(cursor) -> Optional[List[Tuple]]:
     """Handler for DbApiHook.run() to return results"""
-    if cursor.returns_rows:
+    if cursor.description is not None:
         return cursor.fetchall()

Review Comment:
   Quoting sqlalchemy (aforementioned link):
   
   - "Statements that use RETURNING may not return a correct rowcount."
   - "Contrary to what the Python DBAPI says, it does not return the number of rows available from the results of a SELECT statement as DBAPIs cannot support this functionality when rows are unbuffered."



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on PR #25430:
URL: https://github.com/apache/airflow/pull/25430#issuecomment-1202101011

   Pep and sqlalchemy states `rowcount` may be missleading and is only usable with UPDATE DELETE...
   
   Regarding `rownumber` it's listed in optional dbapi2 extensions...
   
   
   I don't think that the fact that we are using the same cursor in the run for loop causes an issue.
   
   Sqlalchemy's implementation of `returns_rows` is simply like they say on the documentation : `cursor.description is not None`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] kazanzhy commented on pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on PR #25430:
URL: https://github.com/apache/airflow/pull/25430#issuecomment-1202402758

   @FanatoniQ I don't get it.
   You're saying that  
   
   > Sqlalchemy's implementation of returns_rows is simply like they say on the documentation : cursor.description is not None.
   
   But in this PR you're changing `if cursor.returns_rows:` to `if cursor.description is not None:`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] FanatoniQ commented on pull request #25430: Fix common sql DbApiHook fetch_all_handler

Posted by GitBox <gi...@apache.org>.
FanatoniQ commented on PR #25430:
URL: https://github.com/apache/airflow/pull/25430#issuecomment-1202526975

   > @FanatoniQ I don't get it. You're saying that  
   > 
   > > Sqlalchemy's implementation of returns_rows is simply like they say on the documentation : cursor.description is not None.
   > 
   > But in this PR you're changing `if cursor.returns_rows:` to `if cursor.description is not None:`
   
   The sqlalchemy's implementation of `returns_rows` is correct, like the documentation says sqlalchemy's underlying implementation is `cursor.description is not None`.
   
   The fact is that in `DbApiHook.run` uses stock driver connections and not sqlalchemy. To be clearer if you change `fetch_all_handler` to add a strong runtime type check like so: `if not isinstance(cursor, sqlalchemy.engine.CursorResult): raise AttributeError("the cursor isn't an  sqlalchemy cursor")` it will fail with the error.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org