Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/08 13:50:59 UTC

[GitHub] [airflow] JavierLopezT opened a new pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

JavierLopezT opened a new pull request #11350:
URL: https://github.com/apache/airflow/pull/11350


   The Snowflake connector doesn't natively allow executing multiple SQL statements in a single call. So, to allow passing files or strings containing several queries, a new method is added to the hook.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582169587



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
    After you execute a statement, Snowflake returns information:
   for example it might say rows affected, or after a COPY statement, the result of the load for all files.
   The log should print it.
   I also recommend using DictCursor if you're gonna print it out (it's a snowflake class)
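That pattern can be sketched with sqlite3 standing in for the Snowflake connection (a hedged illustration only: `row_factory = sqlite3.Row` plays roughly the role DictCursor plays for Snowflake, and `run_and_log` is a made-up helper name):

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)

def run_and_log(conn, queries):
    """Execute each statement on one cursor and log whatever the database returns."""
    conn.row_factory = sqlite3.Row  # dict-like rows, analogous to Snowflake's DictCursor
    cur = conn.cursor()
    results = []
    for query in queries:
        cur.execute(query)
        rows = [dict(row) for row in cur.fetchall()]
        logging.info("statement %r returned: %s", query, rows)
        results.append(rows)
    return results

conn = sqlite3.connect(":memory:")
out = run_and_log(conn, [
    "CREATE TABLE t (x INTEGER)",
    "INSERT INTO t VALUES (1), (2)",
    "SELECT count(*) AS n FROM t",
])
```

With the real connector, each statement's result metadata (rows affected, COPY load results, etc.) would surface the same way through `fetchall()` on the cursor.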
   







[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582108801



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
    the problem with this @JavierLopezT is that it will reconnect for every statement.
   
   this can be a nightmare when you use okta :) (though it's possible to cache the creds)
   
   but more critically, as you have implemented it, it's not possible to use temp tables.
   
   instead you should connect only once.
   
   e.g. 
   
   ```
   with closing(hook.get_conn()) as cnx:
       cur = cnx.cursor()
       for query in queries:
           cur.execute(query)
           for row in cur.fetchall():
               print(row)
   ```
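The temp-table point is easy to demonstrate with any DB-API driver; here sqlite3 stands in for the Snowflake connector (an assumption for illustration: both scope `TEMPORARY` tables to a single session, so reconnecting between statements loses them):

```python
import os
import sqlite3
import tempfile

# Use a real file so reconnecting genuinely opens a second session on the same database.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn1 = sqlite3.connect(path)
conn1.execute("CREATE TEMP TABLE staging (x INTEGER)")  # session-scoped table
conn1.execute("INSERT INTO staging VALUES (1)")
visible_same_conn = conn1.execute("SELECT count(*) FROM staging").fetchone()[0]
conn1.close()

conn2 = sqlite3.connect(path)  # simulates the per-statement reconnect
try:
    conn2.execute("SELECT count(*) FROM staging")
    temp_survived = True
except sqlite3.OperationalError:  # "no such table: staging" after reconnect
    temp_survived = False
conn2.close()
```

The table is visible within the first connection but gone after reconnecting, which is why a multi-statement `run` should reuse one connection for the whole series.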
   







[GitHub] [airflow] dstandish edited a comment on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-812038431


   Any reason why this one needs to be abandoned?
   
   I can imagine circumstances where you just want to start over and go in a different direction.
   
   I'm not sure what the conventions are and maybe it doesn't matter.
   
   But I think in general it's good to have the history for a PR contained in one place.  Anyway just a thought





[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582108801



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
    the problem with this @JavierLopezT is that it will reconnect for every statement.
   
   instead you should connect only once.  this way you can reuse temp tables, for example.  but also it will be faster







[GitHub] [airflow] github-actions[bot] commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-705606554


   [The Workflow run](https://github.com/apache/airflow/actions/runs/295639340) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582285327



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -144,3 +145,17 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,

Review comment:
    @JavierLopezT does this actually support running a file? It looks like it must be a string here.
   
   I think we absolutely should make it so `sql` can be a path _or_ sql, though it doesn't have to be this PR -- I'd just suggest making sure the docstring is consistent here.
   
    Sorry missed this before.







[GitHub] [airflow] dstandish commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-812038431


   Hi @JavierLopezT I think it would be better to keep the same PR rather than creating a new one.
   
   Any reason why this one needs to be abandoned?





[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582108801



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
    the problem with this @JavierLopezT is that it will reconnect for every statement.
   
   this can be a nightmare when you use okta :) (though it's possible to cache the creds)
   
   but more importantly, the way you've implemented it, it's not possible to use temp tables (which is pretty important).  after a reconnect the table will be gone.
   
   instead you should connect only once and use same connection for the whole series of statements.
   
   e.g. 
   
   ```
   with closing(hook.get_conn()) as cnx:
       cur = cnx.cursor()
       for query in queries:
           cur.execute(query)
           for row in cur.fetchall():
               print(row)
   ```
   
   (you want to print fetchall because this is how you get statement results, and the operator is not meant to be used for a bare "select" statement, so there's no worry about writing a billion rows to the log :) )
   







[GitHub] [airflow] JavierLopezT commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r506189681



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -139,3 +139,13 @@ def _get_aws_credentials(self) -> Tuple[Optional[Any], Optional[Any]]:
 
     def set_autocommit(self, conn, autocommit: Any) -> None:
         conn.autocommit(autocommit)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same call. So for
+        allowing to pass files or strings with several queries this method is coded, that relies on run from DBApiHook"""
+        if isinstance(sql, str):
+            queries = sql.split(';')

Review comment:
    A query that searches for ';' has been a concern of mine. I don't know how often that happens, though, but I think I could add a parameter or something to provide different behavior in that case (based on re, for instance)
   
   I'll take a look at execute_string







[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582169587



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
       After you execute a statement snowflake returns information
   
   Log should print it







[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r550017727



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):

Review comment:
       ok so you want to run with execute_string if `str` and use dbapi.run otherwise  (e.g. if it is `List[str]`)
   
   it seems reasonable. but another option would be to use snowflake's `split_statements` function
   
   https://github.com/snowflakedb/snowflake-connector-python/blob/master/src/snowflake/connector/util_text.py#L32
   
   what you could do is always split and then you can always dbapi run.  this would save you the type checking and autocommit logic.
   
   but i think you can reasonably go either way.
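To make the suggestion concrete, here is a toy quote-aware splitter in the spirit of the connector's `split_statements` (purely illustrative, not the connector's implementation: the real function also handles comments, escaped quotes, and `$$`-delimited blocks):

```python
def split_sql(text):
    """Split a SQL string on ';', but not inside quoted literals (toy sketch)."""
    statements, buf, quote = [], [], None
    for ch in text:
        if quote:
            buf.append(ch)
            if ch == quote:  # closing quote ends the literal
                quote = None
        elif ch in ("'", '"'):
            quote = ch  # entering a quoted literal
            buf.append(ch)
        elif ch == ";":
            stmt = "".join(buf).strip()
            if stmt:
                statements.append(stmt)
            buf = []
        else:
            buf.append(ch)
    tail = "".join(buf).strip()
    if tail:
        statements.append(tail)
    return statements
```

Splitting up front like this means each piece can be handed to the DB-API `run` without any per-type branching, which is the simplification being suggested.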
   







[GitHub] [airflow] eladkal commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r501772392



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -139,3 +139,13 @@ def _get_aws_credentials(self) -> Tuple[Optional[Any], Optional[Any]]:
 
     def set_autocommit(self, conn, autocommit: Any) -> None:
         conn.autocommit(autocommit)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same call. So for
+        allowing to pass files or strings with several queries this method is coded, that relies on run from DBApiHook"""
+        if isinstance(sql, str):
+            queries = sql.split(';')

Review comment:
       Is it backward compatible?
   What happens when a user has a single query that searches a text column for the substring `;`? I think this requires adding a note about it.
   
   What do you think about exposing `execute_string` which can handle multiple SQLs in single string in addition to the run method of dbapi hook.
   
   https://docs.snowflake.com/en/user-guide/python-connector-api.html#execute_string
   
   https://github.com/snowflakedb/snowflake-connector-python/blob/master/src/snowflake/connector/connection.py#L493
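The failure mode described above is easy to reproduce: a naive `split(';')` breaks a single query that merely contains ';' inside a string literal into two invalid fragments:

```python
# A single logical query that searches a text column for ';'.
sql = "SELECT * FROM logs WHERE message LIKE '%;%'"

# Naive splitting breaks it at the ';' inside the string literal,
# producing two fragments, neither of which is valid SQL on its own.
pieces = sql.split(';')
```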
   







[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582285327



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -144,3 +145,17 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,

Review comment:
       @JavierLopezT does this actually support running a file?   It looks like it must be string here.
   
   I think we def should make it so `sql` can be path _or_ sql _(i.e. Union[Path,str], and if str, check if it's a path to a file that exists)_, though it doesn't have to be _this_ PR -- I just want to suggest here you make sure the docstring is consistent here with the behavior.
   
    Sorry missed this before.







[GitHub] [airflow] kaxil commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582783816



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -15,6 +15,8 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from contextlib import closing

Review comment:
       Static check fails with: 
   
   ```
   airflow/providers/snowflake/hooks/snowflake.py:18:1: F401 'contextlib.closing' imported but unused
   ```







[GitHub] [airflow] kaxil commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582331074



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -15,14 +15,17 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from contextlib import closing
 from typing import Any, Dict, Optional, Tuple
+from io import StringIO

Review comment:
       This should fix the failing static test: https://github.com/apache/airflow/pull/11350/checks?check_run_id=1973389981#step:8:244







[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582285327



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -144,3 +145,17 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,

Review comment:
       @JavierLopezT does this actually support running a file?   It looks like it must be string here.
   
   I think we absolutely should make it so `sql` can be path _or_ sql (i.e. Union[Path,str], and if str, check if it's a path to a file that exists), though it doesn't have to be _this_ PR -- I just want to suggest here you make sure the docstring is consistent here with the behavior.
   
    Sorry missed this before.







[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582288499



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -144,3 +145,17 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,

Review comment:
       and small nit, `that relies on run from DBApiHook` no longer true







[GitHub] [airflow] github-actions[bot] commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-785780475


   [The Workflow run](https://github.com/apache/airflow/actions/runs/599056879) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] eladkal commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r501772392



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -139,3 +139,13 @@ def _get_aws_credentials(self) -> Tuple[Optional[Any], Optional[Any]]:
 
     def set_autocommit(self, conn, autocommit: Any) -> None:
         conn.autocommit(autocommit)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same call. So for
+        allowing to pass files or strings with several queries this method is coded, that relies on run from DBApiHook"""
+        if isinstance(sql, str):
+            queries = sql.split(';')

Review comment:
       Is it backward compatible?
   What happens when a user has a single query that searches a text column for the substring `;`? I think this requires adding a note about it.
   
   What do you think about exposing `execute_string` which can handle multiple SQLs in single string in addition to the run method of dbapi hook.
   
   https://docs.snowflake.com/en/user-guide/python-connector-api.html#execute_string
   
   https://github.com/snowflakedb/snowflake-connector-python/blob/master/src/snowflake/connector/connection.py#L493
   
   
   **Edit:** thinking about it, I'm not sure it's possible to preserve backward compatibility, as even if the hook behavior is unchanged there will still be an open question regarding which of the hook's functions the SnowflakeOperator should use.







[GitHub] [airflow] dstandish edited a comment on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-812038431









[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582108801



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
    the problem with this @JavierLopezT is that it will reconnect for every statement.
   
   this can be a nightmare when you use okta :) (though it's possible to cache the creds)
   
   instead you should connect only once.  this way you can reuse temp tables, for example.  but also it will be faster
   
   e.g. 
   
   ```
   cnx = hook.get_conn()
   cur = cnx.cursor()
   for query in queries:
       cur.execute(query)
       for row in cur.fetchall():
           print(row)
   ```
   







[GitHub] [airflow] kaxil commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r533770600



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):
+            with closing(self.get_conn()) as conn:
+                if self.supports_autocommit:
+                    self.set_autocommit(conn, autocommit)
+
+                    conn.execute_string(sql, parameters)
+        else:
+            super().run(sql, autocommit, parameters)

Review comment:
       Can you add a test for this @JavierLopezT 







[GitHub] [airflow] kaxil commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r546984392



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):
+            with closing(self.get_conn()) as conn:
+                if self.supports_autocommit:
+                    self.set_autocommit(conn, autocommit)
+
+                    conn.execute_string(sql, parameters)
+        else:
+            super().run(sql, autocommit, parameters)

Review comment:
       Hey @JavierLopezT -- Have a look at https://github.com/apache/airflow/blob/master/tests/providers/exasol/hooks/test_exasol.py for some examples.
   
   Let me know if you need help though







[GitHub] [airflow] github-actions[bot] commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-734299961


   [The Workflow run](https://github.com/apache/airflow/actions/runs/385308851) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] JavierLopezT commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r581979283



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):

Review comment:
       I have modified the code to implement the function split_statements, thanks for the insight







[GitHub] [airflow] JavierLopezT commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r530517569



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -139,3 +139,13 @@ def _get_aws_credentials(self) -> Tuple[Optional[Any], Optional[Any]]:
 
     def set_autocommit(self, conn, autocommit: Any) -> None:
         conn.autocommit(autocommit)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same call. So for
+        allowing to pass files or strings with several queries this method is coded, that relies on run from DBApiHook"""
+        if isinstance(sql, str):
+            queries = sql.split(';')

Review comment:
       I have updated the code using execute_string and adding the previous method for not breaking backwards compatibility in case you were using a list of queries







[GitHub] [airflow] eladkal commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r501772392



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -139,3 +139,13 @@ def _get_aws_credentials(self) -> Tuple[Optional[Any], Optional[Any]]:
 
     def set_autocommit(self, conn, autocommit: Any) -> None:
         conn.autocommit(autocommit)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same call. So for
+        allowing to pass files or strings with several queries this method is coded, that relies on run from DBApiHook"""
+        if isinstance(sql, str):
+            queries = sql.split(';')

Review comment:
       Is it backward compatible?
   What happens when a user has a single query that searches a text column for the substring `;`? I think this requires adding a note about it.
   
   What do you think about exposing `execute_string`, which can handle multiple SQL statements in a single string, in addition to the run method of the dbapi hook?
   
   https://docs.snowflake.com/en/user-guide/python-connector-api.html#execute_string
   
   https://github.com/snowflakedb/snowflake-connector-python/blob/master/src/snowflake/connector/connection.py#L493
   
   
   **Edit:** thinking about it, I'm not sure it's possible to preserve backward compatibility, as even if the hook behavior is unchanged there will still be an open question regarding which one of the hook functions the SnowflakeOperator should use.
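   The substring concern above can be shown concretely. This is a plain-Python sketch with a made-up table and query (no Snowflake connector involved):
   
   ```python
   # A single statement whose string literal happens to contain ';'.
   # (Hypothetical table/column names, purely for illustration.)
   sql = "SELECT * FROM logs WHERE message LIKE '%error;timeout%'"
   
   # Naive splitting on ';' cuts the statement in the middle of the literal.
   fragments = sql.split(';')
   
   # The one valid query becomes two broken fragments:
   #   "SELECT * FROM logs WHERE message LIKE '%error"
   #   "timeout%'"
   print(fragments)
   ```
   
   A statement-aware splitter (like the connector's `split_statements`) avoids this because it tracks quoting while scanning.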







[GitHub] [airflow] JavierLopezT commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582163390



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
       Thanks for the example. However, I don't get why we would want to print fetchall







[GitHub] [airflow] JavierLopezT commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r545812706



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):
+            with closing(self.get_conn()) as conn:
+                if self.supports_autocommit:
+                    self.set_autocommit(conn, autocommit)
+
+                    conn.execute_string(sql, parameters)
+        else:
+            super().run(sql, autocommit, parameters)

Review comment:
       Hi @kaxil. I think I have to test that, depending on whether sql is a list or a string, it calls conn.execute_string or super().run, but I have no idea how to do that. I have started defining two functions and two different SQL statements, but I can't get any further. Could you help me, please? Thank you very much in advance







[GitHub] [airflow] dstandish edited a comment on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-812038431


   Hi @JavierLopezT I think it would be better to keep the same PR rather than creating a new one.
   
   Any reason why this one needs to be abandoned?





[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582108801



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
       the problem with this @JavierLopezT is that it will reconnect for every statement.
   
   this can be a nightmare when you use okta :) (though it's possible to cache the creds)
   
   but more critically, as you have implemented it, it's not possible to use temp tables.
   
   instead you should connect only once.
   
   e.g. 
   
   ```
   with closing(hook.get_conn()) as cnx:
       cur = cnx.cursor()
       for query in queries:
           cur.execute(query)
           for row in cur.fetchall():
               print(row)
   ```
   
   (you want to print fetchall because this is how you get statement results, and operator is not meant to be used for a bare "select" statement so no worry about writing a billion rows to the log :) )
   







[GitHub] [airflow] github-actions[bot] commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-785385503


   [The Workflow run](https://github.com/apache/airflow/actions/runs/597352712) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] JavierLopezT commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582163390



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
       Thanks for the example. However, I don't get the print fetchall part







[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582108801



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
       the problem with this @JavierLopezT is that it will reconnect for every statement.
   
   this can be a nightmare when you use okta :) (though it's possible to cache the creds)
   
   instead you should connect only once.  that way you can reuse temp tables, for example, and it will also be faster







[GitHub] [airflow] JavierLopezT edited a comment on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT edited a comment on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-785119481


   @potiuk Hello! Is it worth continuing with this PR or shall we wait for a new version of the Snowflake Python connector?





[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r550005187



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):
+            with closing(self.get_conn()) as conn:

Review comment:
       you do not need to wrap `SnowflakeConnection` with `closing` because it already has a good `__exit__`

##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):

Review comment:
       ok so you want to run with execute_string if `str` and use dbapi.run otherwise  (e.g. if it is `List[str]`)
   
   it seems reasonable. but another option would be to use snowflake's `split_statements` function
   
   https://github.com/snowflakedb/snowflake-connector-python/blob/master/src/snowflake/connector/util_text.py#L32
   
   what you could do is always split and then you can always dbapi run.  this would save you the type checking and autocommit logic.
   
   but i think you can reasonably go either way.
   
   one thought though... at my prev company we liked to execute sql files.  what we did was first check if `sql` was a file that exists with `Path(sql).exists()` and if so then `sql = Path(sql).read_text()`.  then execute as normal.  you can do this with templating but i find this easier.  anyway it's an idea for your consideration if you want to make this a little more generic --- `Union[Path, List[str], str]` where str can be either a single statement, multiple statements, or a path to a sql file.
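
The path-or-SQL idea above could be sketched roughly like this (`resolve_sql` is a hypothetical helper name, not something in the hook or the connector):

```python
from pathlib import Path

def resolve_sql(sql: str) -> str:
    """Hypothetical helper: if `sql` points at an existing file, read and
    return its contents; otherwise assume it is already SQL text."""
    try:
        path = Path(sql)
        if path.is_file():
            return path.read_text()
    except (OSError, ValueError):
        # A long multi-statement string (or one with odd characters) is
        # not a valid path on some platforms; just treat it as SQL.
        pass
    return sql

# A plain SQL string passes through unchanged.
print(resolve_sql("SELECT 1; SELECT 2;"))
```

From there the resulting text could go through `split_statements` (or `execute_string`) as usual.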

##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):
+            with closing(self.get_conn()) as conn:
+                if self.supports_autocommit:

Review comment:
       this generic `if self.supports_autocommit` logic makes sense in DbApiHook because it's a generic base class that is meant to support many different databases -- some which support autocommit and some which don't.  in your case here, you are implementing for one specific database and you _know_ whether it supports autocommit or not -- so you don't need to check --- just do the right thing.

##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):
+            with closing(self.get_conn()) as conn:
+                if self.supports_autocommit:
+                    self.set_autocommit(conn, autocommit)
+
+                    conn.execute_string(sql, parameters)
+        else:
+            super().run(sql, autocommit, parameters)

Review comment:
       @JavierLopezT what you want to do is use mock.patch to verify that in the `run` method, the right underlying method is called given the right input.  google for basic examples of mock patch.
   
   in this case you would want to mock the `get_conn` method so that its `return_value` is a mock object.  then after `run` finishes, you can verify that the mock object was called with method `execute_string` and your value for `sql`
   
   there are lots of examples in the repo just try to find something similar
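
As a concrete illustration of that pattern, here is a self-contained sketch using a stand-in class instead of the real SnowflakeHook (all names here are illustrative, not Airflow's actual code):

```python
from unittest import mock

class FakeSnowflakeHook:
    """Stand-in with the same run() dispatch as the PR, for illustration."""

    def get_conn(self):
        raise RuntimeError("would open a real Snowflake connection")

    def run(self, sql, autocommit=False, parameters=None):
        if isinstance(sql, str):
            conn = self.get_conn()
            conn.execute_string(sql, parameters)
        # a list of statements would be delegated to DbApiHook.run instead

def test_run_string_uses_execute_string():
    hook = FakeSnowflakeHook()
    # Patch get_conn so no real connection is attempted; its return_value
    # is a MagicMock that records every call made on it.
    with mock.patch.object(hook, "get_conn") as mock_get_conn:
        hook.run("select 1; select 2;")
    mock_get_conn.return_value.execute_string.assert_called_once_with(
        "select 1; select 2;", None
    )

test_run_string_uses_execute_string()
```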







[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582108801



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
       the problem with this @JavierLopezT is that it will reconnect for every statement.
   
   this can be a nightmare when you use okta :) (though it's possible to cache the creds)
   
   instead you should connect only once.  that way you can reuse temp tables, for example, and it will also be faster
   
   e.g. 
   
   ```
   with closing(hook.get_conn()) as cnx:
       cur = cnx.cursor()
       for query in queries:
           cur.execute(query)
           for row in cur.fetchall():
               print(row)
   ```
   







[GitHub] [airflow] github-actions[bot] commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-705606554


   [The Workflow run](https://github.com/apache/airflow/actions/runs/295639340) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] JavierLopezT closed pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT closed pull request #11350:
URL: https://github.com/apache/airflow/pull/11350


   





[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582285327



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -144,3 +145,17 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,

Review comment:
       @JavierLopezT does this actually support running a file?   It looks like it must be a string here.
   
   I think we absolutely should make it so `sql` can be a path _or_ sql _(i.e. Union[Path, str], and if str, check if it's a path to a file that exists)_, though it doesn't have to be _this_ PR -- I just want to suggest you make sure the docstring is consistent with the behavior.
   
    Sorry, I missed this before.







[GitHub] [airflow] github-actions[bot] commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-785385892


   [The Workflow run](https://github.com/apache/airflow/actions/runs/597356912) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------



[GitHub] [airflow] JavierLopezT commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r545812706



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +141,18 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        if isinstance(sql, str):
+            with closing(self.get_conn()) as conn:
+                if self.supports_autocommit:
+                    self.set_autocommit(conn, autocommit)
+
+                    conn.execute_string(sql, parameters)
+        else:
+            super().run(sql, autocommit, parameters)

Review comment:
       Hi @kaxil. I think I have to test that, depending on whether `sql` is a list or a string, it calls `conn.execute_string` or `super().run`, but I have no idea how to do that. Could you help me, please? Thank you very much in advance.
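   One way to test that dispatch is with `unittest.mock.patch.object`. The sketch below uses hypothetical stand-in classes (`BaseHook` for `DbApiHook`, `FakeSnowflakeHook` for `SnowflakeHook`) so the pattern is self-contained; in the real test you would patch `SnowflakeHook.get_conn` and `DbApiHook.run` the same way:
   
   ```python
   from unittest import mock
   
   
   class BaseHook:
       """Stand-in for DbApiHook; only here to make the pattern self-contained."""
   
       def run(self, sql, autocommit=False, parameters=None):
           raise NotImplementedError
   
   
   class FakeSnowflakeHook(BaseHook):
       """Stand-in for SnowflakeHook with the string/list dispatch under test."""
   
       def get_conn(self):
           raise NotImplementedError  # patched in the tests below
   
       def run(self, sql, autocommit=False, parameters=None):
           if isinstance(sql, str):
               conn = self.get_conn()
               conn.execute_string(sql, parameters)
           else:
               super().run(sql, autocommit, parameters)
   
   
   def test_string_sql_uses_execute_string():
       hook = FakeSnowflakeHook()
       with mock.patch.object(hook, 'get_conn') as get_conn, \
               mock.patch.object(BaseHook, 'run') as super_run:
           hook.run('select 1; select 2;')
       # The string path should go through execute_string on one connection
       # and never fall back to the parent's run().
       get_conn.return_value.execute_string.assert_called_once_with('select 1; select 2;', None)
       super_run.assert_not_called()
   
   
   def test_list_sql_falls_back_to_parent_run():
       hook = FakeSnowflakeHook()
       with mock.patch.object(BaseHook, 'run') as super_run:
           hook.run(['select 1', 'select 2'])
       # A list of statements should be delegated to the parent hook's run().
       super_run.assert_called_once_with(['select 1', 'select 2'], False, None)
   ```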




----------------------------------------------------------------



[GitHub] [airflow] JavierLopezT commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-785119481


   @potiuk Is it worth continuing with this PR, or shall we wait for a new version of the Snowflake Python connector?


----------------------------------------------------------------



[GitHub] [airflow] kaxil commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582330851



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -15,14 +15,17 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from contextlib import closing
 from typing import Any, Dict, Optional, Tuple
+from io import StringIO

Review comment:
       ```suggestion
   from io import StringIO
   from typing import Any, Dict, Optional, Tuple
   ```




----------------------------------------------------------------



[GitHub] [airflow] JavierLopezT commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-811748397


   Closing this to open a new PR with more features, as suggested here.


-- 



[GitHub] [airflow] kaxil commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-721446180


   Can you please rebase your PR on the latest master, since we have applied [Black](https://github.com/apache/airflow/commit/4e8f9cc8d02b29c325b8a5a76b4837671bdf5f68) and [PyUpgrade](https://github.com/apache/airflow/commit/8c42cf1b00c90f0d7f11b8a3a455381de8e003c5) on master.
   
   It will help if you squash your commits into a single commit first so that there are fewer conflicts.
   


----------------------------------------------------------------



[GitHub] [airflow] github-actions[bot] commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-785295353


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.


----------------------------------------------------------------



[GitHub] [airflow] potiuk commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-785133537


   > @potiuk Is it worth continuing with this PR, or shall we wait for a new version of the Snowflake Python connector?
   
   Certainly. I don't own the connector :). Not sure when/if the new version will come out, and when it does, I think it will be backwards compatible, so there is no reason for this to block anyone.


----------------------------------------------------------------



[GitHub] [airflow] github-actions[bot] commented on pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#issuecomment-748083948


   [The Workflow run](https://github.com/apache/airflow/actions/runs/430540854) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


----------------------------------------------------------------



[GitHub] [airflow] dstandish commented on a change in pull request #11350: Allow execution of multiple sql statements in SnowflakeHook

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #11350:
URL: https://github.com/apache/airflow/pull/11350#discussion_r582108801



##########
File path: airflow/providers/snowflake/hooks/snowflake.py
##########
@@ -142,3 +143,16 @@ def set_autocommit(self, conn, autocommit: Any) -> None:
 
     def get_autocommit(self, conn):
         return getattr(conn, 'autocommit_mode', False)
+
+    def run(self, sql, autocommit=False, parameters=None):
+        """
+        Snowflake-connector doesn't allow natively the execution of multiple SQL statements in the same
+        call. So for allowing to pass files or strings with several queries this method is coded,
+        that relies on run from DBApiHook
+        """
+        conn = self.get_conn()
+        self.set_autocommit(conn, autocommit)
+
+        queries = [item[0] for item in split_statements(StringIO(sql))]
+        for query in queries:
+            super().run(query, autocommit, parameters)

Review comment:
       The problem with this, @JavierLopezT, is that it will reauthenticate for every statement.
   
   Instead you should connect only once. That way you can reuse temp tables, for example, and it will also be faster.
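   A rough sketch of that suggestion: authenticate once, then execute every statement on the same connection. This is an illustration under assumptions, not the merged implementation; the naive `';'` split stands in for the connector's `split_statements`, and `hook` is any object exposing `get_conn()` and `set_autocommit()`:
   
   ```python
   from contextlib import closing
   
   
   def run_all(hook, sql, autocommit=False):
       """Open one connection and execute every statement on it."""
       # Naive stand-in for snowflake-connector's split_statements; the real
       # hook should keep using that helper, which understands quoting etc.
       statements = [s.strip() for s in sql.split(';') if s.strip()]
       with closing(hook.get_conn()) as conn:
           hook.set_autocommit(conn, autocommit)
           with closing(conn.cursor()) as cur:
               for statement in statements:
                   # Session state (temp tables, session variables) persists
                   # across statements because they share one connection.
                   cur.execute(statement)
   ```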




----------------------------------------------------------------