You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/29 19:59:00 UTC

[GitHub] [airflow] denimalpaca opened a new issue, #26046: `isinstance()` check in `_hook()` breaking provider hook usage

denimalpaca opened a new issue, #26046:
URL: https://github.com/apache/airflow/issues/26046

   ### Apache Airflow Provider(s)
   
   common-sql
   
   ### Versions of Apache Airflow Providers
   
   Using `apache-airflow-providers-common-sql==1.1.0`
   
   ### Apache Airflow version
   
   2.3.2
   
   ### Operating System
   
   Debian GNU/Linux 11 bullseye
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   astro-runtime:5.0.5
   
   ### What happened
   
   The `isinstance()` method to check that the hook is a `DbApiHook` is breaking when a snowflake connection is passed to an operator's `conn_id` parameter, as the check finds an instance of `SnowflakeHook` and not `DbApiHook`.
   
   ### What you think should happen instead
   
   There should not be an error when subclasses of `DbApiHook` are used. This can be fixed by replacing `isinstance()` with something that checks the inheritance hierarchy.
   
   ### How to reproduce
   
   Run an operator from the common-sql provider with a Snowflake connection passed to `conn_id`.
   
   ### Anything else
   
   Occurs every time.
   Log:
   ```
   [2022-08-29, 19:10:42 UTC] {manager.py:49} ERROR - Failed to extract metadata The connection type is not supported by SQLColumnCheckOperator. The associated hook should be a subclass of `DbApiHook`. Got SnowflakeHook task_type=SQLColumnCheckOperator airflow_dag_id=complex_snowflake_transform task_id=quality_check_group_forestfire.forestfire_column_checks airflow_run_id=manual__2022-08-29T19:04:54.998289+00:00 
   Traceback (most recent call last):
     File "/usr/local/airflow/include/openlineage/airflow/extractors/manager.py", line 38, in extract_metadata
       task_metadata = extractor.extract_on_complete(task_instance)
     File "/usr/local/airflow/include/openlineage/airflow/extractors/sql_check_extractors.py", line 26, in extract_on_complete
       return super().extract()
     File "/usr/local/airflow/include/openlineage/airflow/extractors/sql_extractor.py", line 50, in extract
       authority=self._get_authority(),
     File "/usr/local/airflow/include/openlineage/airflow/extractors/snowflake_extractor.py", line 57, in _get_authority
       return self.conn.extra_dejson.get(
     File "/usr/local/airflow/include/openlineage/airflow/extractors/sql_extractor.py", line 102, in conn
       self._conn = get_connection(self._conn_id())
     File "/usr/local/airflow/include/openlineage/airflow/extractors/sql_extractor.py", line 91, in _conn_id
       return getattr(self.hook, self.hook.conn_name_attr)
     File "/usr/local/airflow/include/openlineage/airflow/extractors/sql_extractor.py", line 96, in hook
       self._hook = self._get_hook()
     File "/usr/local/airflow/include/openlineage/airflow/extractors/snowflake_extractor.py", line 63, in _get_hook
       return self.operator.get_db_hook()
     File "/usr/local/lib/python3.9/site-packages/airflow/providers/common/sql/operators/sql.py", line 112, in get_db_hook
       return self._hook
     File "/usr/local/lib/python3.9/functools.py", line 969, in __get__
       val = self.func(instance)
     File "/usr/local/lib/python3.9/site-packages/airflow/providers/common/sql/operators/sql.py", line 95, in _hook
       raise AirflowException(
   airflow.exceptions.AirflowException: The connection type is not supported by SQLColumnCheckOperator. The associated hook should be a subclass of `DbApiHook`. Got SnowflakeHook
   ```
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1231796093

   And the generic message is also clearer in my PR. It adds fully-quallified name to the expected class and prints the whole MRO hierarchy of the class, so it will be much easier to diagnose someone's problem in the future.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1230902359

   I believe this error **could** occur if you had snowflake provider `3.1.0` or below AND common-sql provider installed. I think we cannot really do much about this if this is the case. the provider prior 3.2.0 did not  depend on common-sql and we are not able to "force" upgrading snowflake provider.
   
   My hypotheis is that you are really trying to use "common-sql" "SQLColumnCheck" operator with "pre-common-sql" provider. And this is not going to work. 
   
   Quick fix to that is to install 3.2.0 version of the snowflake provider. If that works (IMHO it should) then the most we can do is to warn the users what to do in such case.
   
   Please try it out and see if that's the case. If it is, then we can likely think of better handling it in common-sql provider (I have an idea, PR is coming).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1230924680

   Likely fix (well - it's not a "fix" it just provides a clearer error message in this case if you confirm my hypothesis @denimalpaca ) in #26051 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1231794238

   Now, the error shoudl be crystal clear in 1.2.0 when we relese. It shoudl even tell you which version of which provider you should upgrade to :D


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1230946448

   FYI: I reproduced it with `snowflake-provider` == 3.0.0, upgrading to 3.1.0 fixed the problem. After applying the #26051 the error message will be similar to:
   
   ```
   airflow.exceptions.AirflowException: You are trying to use common-sql provider functionality with SnowflakeHook, but the Hook class comes from provider that does not support common-sql-provider. Please make sure that you upgrade the provider it comes from to a version that supports common-sql provider (look at the changelog of the provider).Got SnowflakeHook Hook with class hierarchy: [<class 'airflow.providers.snowflake.hooks.snowflake.SnowflakeHook'>, <class 'airflow.hooks.dbapi.DbApiHook'>, <class 'airflow.hooks.base.BaseHook'>, <class 'airflow.utils.log.logging_mixin.Loggin
   gMixin'>, <class 'object'>]
   ``` 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1230892287

   What is the combination of snowflake-provider and common-sql provider you have? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] denimalpaca commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
denimalpaca commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1231801576

   > And the generic message is also clearer in my PR. It adds fully-quallified name to the expected class and prints the whole MRO hierarchy of the class, so it will be much easier to diagnose someone's problem in the future.
   
   That's awesome


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk closed issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage 
URL: https://github.com/apache/airflow/issues/26046


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1230893199

   BTW. isinstance is the right check - it checks for the hierarchy. The problem is different I think but I need to know the versions you have


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1231323724

   I improved the message in #26501 - in the upcoming common-sql, the error message will be much more specific in case of the community providers:
   
   ![image](https://user-images.githubusercontent.com/595491/187387386-634ebc48-c528-499b-a454-0020445dfe70.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] denimalpaca commented on issue #26046: `isinstance()` check in `_hook()` breaking provider hook usage

Posted by GitBox <gi...@apache.org>.
denimalpaca commented on issue #26046:
URL: https://github.com/apache/airflow/issues/26046#issuecomment-1231703153

   Ah, gotcha. I was under the impression that `isinstance()` only checked one level in the hierarchy. Upgrading to a higher Snowflake version now, I was using `>=2.5.0` which I guess was not upgrading to the latest version! Thanks for clarifying here 💯 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org