Posted to commits@airflow.apache.org by "mai-nakagawa (via GitHub)" <gi...@apache.org> on 2023/08/15 07:11:19 UTC
[GitHub] [airflow] mai-nakagawa opened a new issue, #33400: BigQuery with impersonation_chain does not accept custom scopes
mai-nakagawa opened a new issue, #33400:
URL: https://github.com/apache/airflow/issues/33400
### Apache Airflow version
main (development)
### What happened
I always get the following error when I run a BigQuery query that accesses [connected sheets](https://cloud.google.com/bigquery/docs/connected-sheets) while using `impersonation_chain`.
```
  File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 2203, in run_query
    job = self.insert_job(configuration=configuration, project_id=self.project_id)
  File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/common/hooks/base_google.py", line 439, in inner_wrapper
    return func(self, *args, **kwargs)
  File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 1571, in insert_job
    job.result(timeout=timeout, retry=retry)
  File "/opt/python3.8/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1499, in result
    do_get_result()
  File "/opt/python3.8/lib/python3.8/site-packages/google/cloud/bigquery/job/query.py", line 1489, in do_get_result
    super(QueryJob, self).result(retry=retry, timeout=timeout)
  File "/opt/python3.8/lib/python3.8/site-packages/google/cloud/bigquery/job/base.py", line 728, in result
    return super(_AsyncJob, self).result(timeout=timeout, **kwargs)
  File "/opt/python3.8/lib/python3.8/site-packages/google/api_core/future/polling.py", line 137, in result
    raise self._exception
google.api_core.exceptions.Forbidden: 403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
```
I think this happens because the credentials always use the default scope, `https://www.googleapis.com/auth/cloud-platform`. We can set scopes on an Airflow connection, but there is no way to set scopes for `impersonation_chain`.
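The scope fallback described above can be sketched roughly as follows. This is a standalone illustration, not the provider's code: the helper name `resolve_scopes` is hypothetical, and it assumes scopes arrive as a single comma-separated string.

```python
from __future__ import annotations

from typing import Sequence

# Default scope named in the issue description.
DEFAULT_SCOPES: Sequence[str] = ("https://www.googleapis.com/auth/cloud-platform",)


def resolve_scopes(raw_scopes: str | None) -> Sequence[str]:
    """Hypothetical helper: split a comma-separated scope string, or fall back to the default."""
    if not raw_scopes:
        return DEFAULT_SCOPES
    return [scope.strip() for scope in raw_scopes.split(",") if scope.strip()]


# Nothing configured: only the default scope is requested.
print(resolve_scopes(None))

# Drive scope listed explicitly, as a query against a connected sheet would need.
print(resolve_scopes(
    "https://www.googleapis.com/auth/cloud-platform, https://www.googleapis.com/auth/drive"
))
```

With no scopes configured, only the default scope is requested, which would explain the Drive permission error in the traceback.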
### What you think should happen instead
I would like the operators and hooks to accept custom scopes, such as `https://www.googleapis.com/auth/drive` in this case.
### How to reproduce
1. Prepare a [connected sheet](https://cloud.google.com/bigquery/docs/connected-sheets).
2. Run a task with BigQueryInsertJobOperator (or the like) to run a BigQuery query against the connected sheet, using `impersonation_chain`.
3. You'll face the error:
```
403 Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials.
```
### Operating System
Linux
### Versions of Apache Airflow Providers
_No response_
### Deployment
Google Cloud Composer
### Deployment details
_No response_
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] nathadfield commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1678562158
No problem. I'll ask around to see if other people have some thoughts on this too.
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "mai-nakagawa (via GitHub)" <gi...@apache.org>.
mai-nakagawa commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1849156253
Yes, it picks up scopes from the Airflow connection's field. The problem is that we cannot set scopes for `impersonation_chain`, as written in the description of this GitHub issue.
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1748918041
@aritra24 No problem, I've been there myself.
[GitHub] [airflow] nathadfield commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1678522826
@mai-nakagawa Thanks for logging this. I also know this is a problem so I'm keen to see if this can be addressed. Are you aware of what the solution to this might be?
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1748904434
@aritra24 Are you planning to implement this, or should we un-assign you?
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "aritra24 (via GitHub)" <gi...@apache.org>.
aritra24 commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1774979003
@nathadfield This might need to be assigned to someone else. I have been trying for a few days now, and unfortunately my lack of experience with GCP is really slowing down progress. It might be better handled by someone with a better grasp of this.
[GitHub] [airflow] nathadfield commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1680213843
@aritra24 I think that would be most welcomed.
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1845598109
@pierre-comalada No. This is currently looking for someone with enough time and desire to take it on.
[GitHub] [airflow] mai-nakagawa commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "mai-nakagawa (via GitHub)" <gi...@apache.org>.
mai-nakagawa commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1678530406
@nathadfield Thanks for addressing the matter.
Airflow connections already have a way to set scopes, so only `impersonation_chain` needs scope support. My quick idea is to create a class for `impersonation_chain` that holds the service account email address and scope, as follows. What do you think?
```
from __future__ import annotations
from dataclasses import dataclass
from typing import Sequence

@dataclass
class ImpersonationChain:
    chain: ImpersonationServiceAccountWithScope | Sequence[ImpersonationServiceAccountWithScope]

@dataclass
class ImpersonationServiceAccountWithScope:
    email_address: str
    scope: str | None = None
```
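To make the proposal a bit more concrete, here is a minimal sketch of how such a chain might be resolved into the pieces an impersonation call needs. It assumes the last entry is the account to impersonate and earlier entries are delegates. The function `resolve_chain`, the example email addresses, and the rule that the target's scope is added on top of the default scope are all hypothetical, not part of Airflow.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Sequence

DEFAULT_SCOPE = "https://www.googleapis.com/auth/cloud-platform"


@dataclass
class ImpersonationServiceAccountWithScope:
    email_address: str
    scope: str | None = None


def resolve_chain(
    chain: ImpersonationServiceAccountWithScope | Sequence[ImpersonationServiceAccountWithScope],
) -> tuple[str, list[str], list[str]]:
    """Hypothetical helper: return (target principal, delegates, target scopes)."""
    accounts = [chain] if isinstance(chain, ImpersonationServiceAccountWithScope) else list(chain)
    if not accounts:
        raise ValueError("impersonation chain must contain at least one account")
    # Assumption: the last account is impersonated; the ones before it are delegates.
    *delegates, target = accounts
    # Assumption: a custom scope extends the default scope rather than replacing it.
    scopes = [DEFAULT_SCOPE] + ([target.scope] if target.scope else [])
    return target.email_address, [d.email_address for d in delegates], scopes


# Placeholder service accounts, for illustration only.
target, delegates, scopes = resolve_chain([
    ImpersonationServiceAccountWithScope("step-1@example.iam.gserviceaccount.com"),
    ImpersonationServiceAccountWithScope(
        "sheets-reader@example.iam.gserviceaccount.com",
        scope="https://www.googleapis.com/auth/drive",
    ),
])
print(target, delegates, scopes)
```

Keeping the scope on each account entry follows the dataclass layout proposed above; a single scope list for the whole chain would be a simpler alternative.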
[GitHub] [airflow] nathadfield commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1678536544
@mai-nakagawa Ah, ok! Well, I would suggest that you make a change and raise a PR then it will no doubt come up for discussion with the committers. I will assign the issue to you and look forward to this improvement!
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "buu-nguyen (via GitHub)" <gi...@apache.org>.
buu-nguyen commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1848596780
> Possible workaround:
>
> * Step1: Extend a BigQueryHook class and overwrite [`GoogleBaseHook#scopes`](https://github.com/apache/airflow/blob/0f73647bdab79ac6c30961222924f6166f75b55a/airflow/providers/google/common/hooks/base_google.py#L395-L404) method as follows:
> ```python
> class BigQueryHookWithScopes(BigQueryHook):
>     def __init__(self, scopes: Sequence[str], *args, **kwargs):
>         super().__init__(*args, **kwargs)
>         self._scopes = scopes
>
>     @property
>     def scopes(self) -> Sequence[str]:
>         return self._scopes
> ```
>
> * Step2: Extend a BigQuery related Operators to use the above hook as follows:
> ```python
> class BigQueryExecuteQueryOperatorWithScope(BigQueryExecuteQueryOperator):
>     def __init__(self, scopes, *args, **kwargs):
>         super().__init__(*args, **kwargs)
>         self.scopes = scopes
>
>     def execute(self, context):
>         self.hook = BigQueryHookWithScopes(
>             scopes=self.scopes,
>             gcp_conn_id=self.gcp_conn_id,
>             use_legacy_sql=self.use_legacy_sql,
>             delegate_to=self.delegate_to,
>             location=self.location,
>             impersonation_chain=self.impersonation_chain,
>         )
>         super().execute(context)
> ```
Hey, thanks for the workaround. I just noticed that `GoogleBaseHook` already seems to pick up scopes from the connection's field (https://github.com/apache/airflow/blob/0f73647bdab79ac6c30961222924f6166f75b55a/airflow/providers/google/common/hooks/base_google.py#L402C44-L402C44).
Is there a specific reason to override it? Just curious about this approach.
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #33400: BigQuery with impersonation_chain does not accept custom scopes
URL: https://github.com/apache/airflow/issues/33400
[GitHub] [airflow] phanikumv commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "phanikumv (via GitHub)" <gi...@apache.org>.
phanikumv commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1680386737
Assigned to you @aritra24
[GitHub] [airflow] mai-nakagawa commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "mai-nakagawa (via GitHub)" <gi...@apache.org>.
mai-nakagawa commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1678561047
@nathadfield Can you please un-assign me then? I might try to fix it myself when I have time, but I can't guarantee it. Sorry for the confusion.
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "aritra24 (via GitHub)" <gi...@apache.org>.
aritra24 commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1748909965
@nathadfield I've been a bit occupied with work and lost track of this. I can probably start working on it by early to mid next week.
[GitHub] [airflow] aritra24 commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "aritra24 (via GitHub)" <gi...@apache.org>.
aritra24 commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1680208118
I can try taking this up if it's available
[GitHub] [airflow] nathadfield commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "nathadfield (via GitHub)" <gi...@apache.org>.
nathadfield commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1678555836
@mai-nakagawa Sorry, I didn't want to assume that you would implement this. It just seemed like you were already working on it. I can un-assign you if you'd prefer.
[GitHub] [airflow] mai-nakagawa commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "mai-nakagawa (via GitHub)" <gi...@apache.org>.
mai-nakagawa commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1678606598
Possible workaround:
- Step 1: Extend the `BigQueryHook` class and override the [`GoogleBaseHook#scopes`](https://github.com/apache/airflow/blob/0f73647bdab79ac6c30961222924f6166f75b55a/airflow/providers/google/common/hooks/base_google.py#L395-L404) property as follows:
```python
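from typing import Sequence
from airflow.providers.google.cloud.hooks.bigquery import BigQueryHook

class BigQueryHookWithScopes(BigQueryHook):
    def __init__(self, scopes: Sequence[str], *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._scopes = scopes

    @property
    def scopes(self) -> Sequence[str]:
        # Override GoogleBaseHook.scopes to return the caller-supplied scopes.
        return self._scopes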
```
- Step 2: Extend a BigQuery-related operator to use the above hook as follows:
```python
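from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

class BigQueryInsertJobOperatorWithScope(BigQueryInsertJobOperator):
    def __init__(self, scopes, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.scopes = scopes

    def execute(self, context):
        # Build the hook with custom scopes before delegating to the parent.
        # Note: this only takes effect if the parent's execute() reuses an
        # existing self.hook; check this against your provider version.
        self.hook = BigQueryHookWithScopes(
            scopes=self.scopes,
            gcp_conn_id=self.gcp_conn_id,
            use_legacy_sql=self.use_legacy_sql,
            delegate_to=self.delegate_to,
            location=self.location,
            impersonation_chain=self.impersonation_chain,
        )
        return super().execute(context)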
```
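The hook override in the first step relies on ordinary Python property overriding. The sketch below shows that pattern with a stand-in base class so it runs without Airflow installed. `FakeBaseHook`, `HookWithScopes`, and `describe_credentials` are placeholders for illustration, not the real `GoogleBaseHook` API.

```python
from __future__ import annotations

from typing import Sequence


class FakeBaseHook:
    """Stand-in for a hook base class that exposes a read-only `scopes` property."""

    @property
    def scopes(self) -> Sequence[str]:
        return ("https://www.googleapis.com/auth/cloud-platform",)

    def describe_credentials(self) -> str:
        # Base-class code reads self.scopes, so a subclass override is picked up here.
        return "scopes=" + ",".join(self.scopes)


class HookWithScopes(FakeBaseHook):
    def __init__(self, scopes: Sequence[str]) -> None:
        super().__init__()
        self._scopes = tuple(scopes)

    @property
    def scopes(self) -> Sequence[str]:
        # Replaces the base property with the caller-supplied scopes.
        return self._scopes


hook = HookWithScopes(["https://www.googleapis.com/auth/drive"])
print(hook.describe_credentials())  # prints "scopes=https://www.googleapis.com/auth/drive"
```

Because the base class looks up `self.scopes` at call time, any code path in the parent that builds credentials from that property would see the overridden value, provided nothing reads the scopes before `_scopes` is set.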
[GitHub] [airflow] mai-nakagawa commented on issue #33400: BigQuery with impersonation_chain does not accept custom scopes
Posted by "mai-nakagawa (via GitHub)" <gi...@apache.org>.
mai-nakagawa commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1678545996
@nathadfield Oh, ok. I'll try when I have time.
Re: [I] BigQuery with impersonation_chain does not accept custom scopes [airflow]
Posted by "pierre-comalada (via GitHub)" <gi...@apache.org>.
pierre-comalada commented on issue #33400:
URL: https://github.com/apache/airflow/issues/33400#issuecomment-1845592742
Any updates on this issue?