Posted to commits@airflow.apache.org by "scnerd (via GitHub)" <gi...@apache.org> on 2023/11/22 20:41:50 UTC

[I] apache-airflow-providers-amazon [airflow]

scnerd opened a new issue, #35805:
URL: https://github.com/apache/airflow/issues/35805

   ### Apache Airflow version
   
   2.7.3
   
   ### What happened
   
   With `iam=True`, `RedshiftSQLHook` fetches credentials itself using a cluster-specific API call (`get_cluster_credentials`, per the traceback below), which fails for Redshift Serverless.
   
   ```
   Traceback (most recent call last):
     File "/usr/local/lib/python3.11/site-packages/airflow/providers/common/sql/operators/sql.py", line 280, in execute
       output = hook.run(
                ^^^^^^^^^
     File "/usr/local/lib/python3.11/site-packages/airflow/providers/common/sql/hooks/sql.py", line 385, in run
       with closing(self.get_conn()) as conn:
                    ^^^^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 173, in get_conn
       conn_params = self._get_conn_params()
                     ^^^^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 84, in _get_conn_params
       conn.login, conn.password, conn.port = self.get_iam_token(conn)
                                              ^^^^^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 115, in get_iam_token
       cluster_creds = redshift_client.get_cluster_credentials(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/site-packages/botocore/client.py", line 535, in _api_call
       return self._make_api_call(operation_name, kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File "/usr/local/lib/python3.11/site-packages/botocore/client.py", line 980, in _make_api_call
       raise error_class(parsed_response, operation_name)
   botocore.errorfactory.ClusterNotFoundFault: An error occurred (ClusterNotFound) when calling the GetClusterCredentials operation: Cluster *** not found.
   ```
   
   ### What you think should happen instead
   
   The operator should connect to the serverless workgroup with IAM-obtained credentials via `redshift_connector`.
   
   ### How to reproduce
   
   Create a direct SQL connection to Redshift using IAM authentication, something like:
   
   ```
   {"conn_type":"redshift","extra":"{\"db_user\":\"USER\",\"iam\":true,\"user\":\"USER\"}","host":"WORKGROUP_NAME.ACCOUNT.REGION.redshift-serverless.amazonaws.com","login":"USER","port":5439,"schema":"DATABASE"}
   ```
   
   Then use this connection for any `SQLExecuteQueryOperator`. The crash should occur when establishing the connection.
   
   ### Operating System
   
   Docker, `amazonlinux:2023` base
   
   ### Versions of Apache Airflow Providers
   
   This report applies to apache-airflow-providers-amazon==8.7.1, and the relevant code appears unchanged in the master branch. The same code worked with Airflow 2.5.2 and version 7.1.0 of the provider.
   
   ### Deployment
   
   Amazon (AWS) MWAA
   
   ### Deployment details
   
   Local MWAA runner
   
   ### Anything else
   
   The break seems to occur because the RedshiftSQLHook integrates the IAM -> credential conversion, which used to occur inside `redshift_connector.connect`. The logic is not as robust and assumes that the connection refers to a Redshift cluster rather than a serverless workgroup. It's not clear to me why this logic was pulled up and out of `redshift_connector`, but it seems like the easiest solution is just to let `redshift_connector` handle IAM authentication and not attempt to duplicate that logic in the airflow provider.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] `RedshiftSQLHook` does not work with `iam=True` [airflow]

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala closed issue #35805: `RedshiftSQLHook` does not work with `iam=True`
URL: https://github.com/apache/airflow/issues/35805




Re: [I] apache-airflow-providers-amazon [airflow]

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on issue #35805:
URL: https://github.com/apache/airflow/issues/35805#issuecomment-1824329980

   @scnerd could you change the issue title to something less broad that briefly describes your problem?
   
   ---
   
   In addition, I had a look at the `boto3` documentation for [`RedshiftServerless.Client`](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/redshift-serverless.html), and I guess nothing can be done until a method equivalent to [`Redshift.Client.get_cluster_credentials`](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/redshift/client/get_cluster_credentials.html) is added to `RedshiftServerless.Client`.
   




Re: [I] `RedshiftSQLHook` does not work with `iam=True` [airflow]

Posted by "hussein-awala (via GitHub)" <gi...@apache.org>.
hussein-awala commented on issue #35805:
URL: https://github.com/apache/airflow/issues/35805#issuecomment-1828365011

   @scnerd I just created https://github.com/apache/airflow/pull/35897 to fix this issue; it would be great if you could test it (if you have a created cluster).
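
For readers following the thread, the cluster-versus-workgroup distinction behind this issue can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the provider or in the linked PR. The helper names `parse_redshift_host` and `fetch_iam_credentials`, and the `client_factory` parameter, are hypothetical. The host layout is assumed to match the one in the reproduction steps. The sketch relies on the boto3 `redshift-serverless` client's `get_credentials` call and the `redshift` client's `get_cluster_credentials` call.

```python
from typing import Callable, NamedTuple, Optional, Tuple


class RedshiftTarget(NamedTuple):
    is_serverless: bool
    identifier: str  # workgroup name (serverless) or cluster identifier
    region: str


def parse_redshift_host(host: str) -> RedshiftTarget:
    """Split a Redshift endpoint into its parts (hypothetical helper).

    Assumes one of these layouts:
      WORKGROUP.ACCOUNT.REGION.redshift-serverless.amazonaws.com
      CLUSTER.SUFFIX.REGION.redshift.amazonaws.com
    """
    parts = host.split(".")
    if len(parts) < 6:
        raise ValueError(f"unrecognized Redshift host: {host!r}")
    identifier, region, service = parts[0], parts[2], parts[3]
    return RedshiftTarget(service == "redshift-serverless", identifier, region)


def fetch_iam_credentials(
    host: str,
    database: str,
    db_user: str,
    client_factory: Optional[Callable] = None,
) -> Tuple[str, str]:
    """Return (user, password) using the API that matches the endpoint type."""
    if client_factory is None:
        import boto3  # imported lazily so the parsing helper works without boto3

        client_factory = boto3.client

    target = parse_redshift_host(host)
    if target.is_serverless:
        # Serverless workgroups have their own service and credential call.
        client = client_factory("redshift-serverless", region_name=target.region)
        creds = client.get_credentials(
            dbName=database, workgroupName=target.identifier
        )
        return creds["dbUser"], creds["dbPassword"]

    # Provisioned clusters use the call seen in the traceback.
    client = client_factory("redshift", region_name=target.region)
    creds = client.get_cluster_credentials(
        DbUser=db_user,
        DbName=database,
        ClusterIdentifier=target.identifier,
        AutoCreate=False,
    )
    return creds["DbUser"], creds["DbPassword"]
```

Calling `fetch_iam_credentials(host, "DATABASE", "USER")` with the serverless host from the reproduction steps would take the first branch. Real use needs AWS credentials with permission for the corresponding call, and the host parsing would need adjusting for custom domain names.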




Re: [I] apache-airflow-providers-amazon [airflow]

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #35805:
URL: https://github.com/apache/airflow/issues/35805#issuecomment-1823478430

   Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.
   

