You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/04 23:33:39 UTC
[GitHub] [airflow] yehoshuadimarsky opened a new issue #20660: Exception unfairly raised in SQL Operator when using Application Default Credentials with Workload Identity for Cloud SQL Auth proxy
yehoshuadimarsky opened a new issue #20660:
URL: https://github.com/apache/airflow/issues/20660
### Apache Airflow Provider(s)
google
### Versions of Apache Airflow Providers
```
apache-airflow-providers-amazon==2.3.0
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-cncf-kubernetes==2.0.0
apache-airflow-providers-docker==2.2.0
apache-airflow-providers-elasticsearch==2.0.3
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-google==6.0.0
apache-airflow-providers-grpc==2.0.1
apache-airflow-providers-hashicorp==2.1.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-microsoft-azure==3.2.0
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-postgres==2.3.0
apache-airflow-providers-redis==2.0.1
apache-airflow-providers-sendgrid==2.0.1
apache-airflow-providers-sftp==2.1.1
apache-airflow-providers-slack==4.1.0
apache-airflow-providers-sqlite==2.0.1
apache-airflow-providers-ssh==2.2.0
```
### Apache Airflow version
2.2.1
### Operating System
Debian Buster
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
Running the official Airflow Helm chart on GKE. Have Workload Identity set up and working, linked the Google and Kubernetes service accounts.
Created the an Airflow connection for GCP ADC, per the [instructions](https://airflow.apache.org/docs/apache-airflow-providers-google/6.2.0/connections/gcp.html#note-on-application-default-credentials)
```bash
airflow connections add \
--conn-type google_cloud_platform \
'gcp-airflow-svc-acct-dev'
```
The SQL connection - note, no password!
```bash
airflow connections add \
--conn-type gcpcloudsql \
--conn-host $SQL_INSTANCE_PUBLIC_IP \
--conn-login {GSA_NAME}@{PROJECT_ID}.iam \ # linked to KSA via WI
--conn-extra '{"instance": "INSTANCE_NAME", "location": "us-east1", "database_type": "postgres", "project_id": "PROJECT_ID", "use_proxy": true, "sql_proxy_use_tcp": true}' \
'gcp-sql-ods-dev'
```
### What happened
Getting this error when trying to run a [CloudSQLExecuteQueryOperator](https://airflow.apache.org/docs/apache-airflow-providers-google/6.0.0/_api/airflow/providers/google/cloud/operators/cloud_sql/index.html#airflow.providers.google.cloud.operators.cloud_sql.CloudSQLExecuteQueryOperator)
```
[2022-01-04, 18:03:49 EST] {taskinstance.py:1703} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1332, in _run_raw_task
self._execute_task_with_callbacks(context)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1458, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1514, in _execute_task
result = execute_callable(context=context)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/google/cloud/operators/cloud_sql.py", line 1076, in execute
connection = hook.create_connection()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/google/cloud/hooks/cloud_sql.py", line 932, in create_connection
uri = self._generate_connection_uri()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/google/cloud/hooks/cloud_sql.py", line 893, in _generate_connection_uri
raise AirflowException("The password parameter needs to be set in connection")
airflow.exceptions.AirflowException: The password parameter needs to be set in connection
```
### What you expected to happen
I expect to be able to use the Cloud SQL Auth proxy from GKE using the IAM database authentication of the Workload Identity-provided GSA and not needing any passwords at all. However, the Airflow Hook is hardcoded to raise an error if there is no password in the SQL connection. Why? Outside of Airflow it can work without a password via Cloud SQL Auth proxy and WI etc, can we allow that in Airflow too?
### How to reproduce
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] yehoshuadimarsky commented on issue #20660: Google SQL shouldn't require password when using Application Default Credentials and Cloud SQL Auth proxy
Posted by GitBox <gi...@apache.org>.
yehoshuadimarsky commented on issue #20660:
URL: https://github.com/apache/airflow/issues/20660#issuecomment-1006152110
So I think I know how to fix it, but I'm at a loss how to test it. I've read all the Breeze stuff and the relevant test code at
https://github.com/apache/airflow/blob/68fa358c85d0ebe1aa22ace30b33867ad4e622a8/tests/providers/google/cloud/hooks/test_cloud_sql.py#L989-L1038
and
https://github.com/apache/airflow/blob/68fa358c85d0ebe1aa22ace30b33867ad4e622a8/tests/providers/google/cloud/hooks/test_cloud_sql.py#L1122-L1135
But I can't figure out how to test this kind of login within old-style `unittest` patching - any suggestions?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] yehoshuadimarsky edited a comment on issue #20660: Google SQL shouldn't require password when using Application Default Credentials and Cloud SQL Auth proxy
Posted by GitBox <gi...@apache.org>.
yehoshuadimarsky edited a comment on issue #20660:
URL: https://github.com/apache/airflow/issues/20660#issuecomment-1006152110
So I think I know how to fix it, but I'm at a loss how to test it. I've read all the Breeze stuff and the relevant test code.
The fix:
```diff
diff --git a/airflow/providers/google/cloud/hooks/cloud_sql.py b/airflow/providers/google/cloud/hooks/cloud_sql.py
index 1b6766390..5275a46c3 100644
--- a/airflow/providers/google/cloud/hooks/cloud_sql.py
+++ b/airflow/providers/google/cloud/hooks/cloud_sql.py
@@ -536,10 +536,10 @@ class CloudSqlProxyRunner(LoggingMixin):
self.log.info(
"The credentials are not supplied by neither key_path nor "
"keyfile_dict of the gcp connection %s. Falling back to "
- "default activated account",
+ "default activated account and using automatic IAM login",
self.gcp_conn_id,
)
- credential_params = []
+ credential_params = ["-enable_iam_login"]
if not self.instance_specification:
project_id = connection.extra_dejson.get('extra__google_cloud_platform__project')
@@ -890,8 +890,6 @@ class CloudSQLDatabaseHook(BaseHook):
raise AirflowException("The login parameter needs to be set in connection")
if not self.public_ip:
raise AirflowException("The location parameter needs to be set in connection")
- if not self.password:
- raise AirflowException("The password parameter needs to be set in connection")
if not self.database:
raise AirflowException("The database parameter needs to be set in connection")
```
Ref: https://cloud.google.com/sql/docs/postgres/iam-logins#cloud-sql-auth-proxy
Relevant parts of existing test code:
https://github.com/apache/airflow/blob/68fa358c85d0ebe1aa22ace30b33867ad4e622a8/tests/providers/google/cloud/hooks/test_cloud_sql.py#L989-L1038
and
https://github.com/apache/airflow/blob/68fa358c85d0ebe1aa22ace30b33867ad4e622a8/tests/providers/google/cloud/hooks/test_cloud_sql.py#L1122-L1135
But I can't figure out how to test this kind of login within old-style `unittest` patching - any suggestions?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on issue #20660: Google SQL shouldn't require password when using Application Default Credentials and Cloud SQL Auth proxy
Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #20660:
URL: https://github.com/apache/airflow/issues/20660#issuecomment-1005997313
Feel Free to add pr @yehoshuadimarsky . Cloud SQL operator was added when there was no Workload Identity (at all I tihnk not only for Cloud SQL).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] yehoshuadimarsky commented on issue #20660: Google SQL shouldn't require password when using Application Default Credentials and Cloud SQL Auth proxy
Posted by GitBox <gi...@apache.org>.
yehoshuadimarsky commented on issue #20660:
URL: https://github.com/apache/airflow/issues/20660#issuecomment-1005997602
Happy to give it a shot, I'll try!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org