You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/21 07:38:16 UTC

[GitHub] [airflow] mik-laj opened a new issue #9461: Unclear documentation for the delegate_to parameter

mik-laj opened a new issue #9461:
URL: https://github.com/apache/airflow/issues/9461


   Hello,
   
   Most GCP operators accept the `delegate_to` parameter. It is not super well documented and it can be confusing to users exactly when they would use this as opposed to another conn id.
   
   A little more documentation on the following page would be helpful.
   https://airflow.readthedocs.io/en/latest/howto/connection/gcp.html
   
   Best regards,
   Kamil


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jaketf commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
jaketf commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-670757178


   I agree remove `delegate_to`, make sure it is an ignored kwarg rather than breaking peoples DAGs if passed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jaketf commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
jaketf commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-670188547


   Taking a look at the code now it seems we have this common [GoogleBaseHook](https://github.com/apache/airflow/blob/d79e7221de76f01b5cd36c15224b59e8bb451c90/airflow/providers/google/common/hooks/base_google.py#L125) used by hooks for gsuite and cloud. This `delegate_to` seems not really not useful for cloud, and I don't think the scenario 2 of delegating to a human user to impersonate a service account is an advisable pattern / one worth supporting in airflow core. I think `delegate_to` should be removed / deprecated from the Google Cloud Hooks / Operators to avoid confusion.
   
   To play devil's advocate: There may be use cases where users expect `delegate_to` to attribute API calls (e.g. a BQ Query) to the delegated human user. Again, I don't think I'd recommend this as an auditing posture as anyone could throw jake@foo.com into the `delegate_to` and bootstrap my IAM permissions. IMO This seems like something we shouldn't support.  


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olchas commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
olchas commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-662015775


   @mik-laj I am still looking into it. The name suggests that domain-wide delegation only makes sense for G Suite applications (so in terms of Airflow it should only be applied to `GoogleDriveHook` and `GSheetsHook`), but [this article](https://medium.com/google-cloud/impersonating-users-with-google-cloud-platform-service-accounts-ba762db09092) calls it a legacy branding and tells that it applies to Cloud Identity as well.
   
   I am also still uncertain about how the two impersonation mechanisms can/should work together. As far as I can tell, domain-wide delegation is supposed to be used to impersonate **user account** using service account, while direct impersonation can be used to impersonate **service account** using **either** another service account **or** user account.
   
   So, I can see two scenarios:
   1. You start with a service/user account that you use to directly impersonate some service account, that is then used to perform domain-wide delegation on some user.
   1. You start by performing domain-wide delegation on some user, and then use this user to impersonate some service account.
   
   However, `Credentials` class from [google.auth.impersonated_credentials module](https://google-auth.readthedocs.io/en/latest/reference/google.auth.impersonated_credentials.html) does not have `with_subject` method, so apparently it is impossible to use directly impersonated account to perform domain-wide delegation of authority, which renders first scenario impossible. On the other hand, it seems you can specify the delegate for source credentials and then use these credentials for direct impersonation as in scenario 2, but I did not have a chance to test it.
   
   @jaketf, @amithmathew, do you perhaps have more insight on the topic?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olchas commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
olchas commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-659411579


   @mik-laj sure. Could you assign me to this issue, please?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] amithmathew commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
amithmathew commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-670193295


   `delegate_to` would be used to impersonate a user account using a (specific) service account - It does look like `delegate_to` may be [used](https://developers.google.com/admin-sdk/reports/v1/guides/delegation) on the GSuite side of things.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] turbaszek commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
turbaszek commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-670481476


   @olchas I'm in favour of 2nd approach. Regarding the deprecation - first I would check if current implementation works. If it does not then I see no reason for deprecation - let us just remove it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-670453386


   Sounds like great plan.  


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-659320364


   @olchas] Do you want to work on it? It looks like this is a task for you.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-660258187


   @olchas We first need to ask a basic question. Shouldn't we abandon support for this feature? In what cases does it work? In what cases does it not work?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-670453674


   One small problem though - we will have to deprecate the delegate_to parameter rather than remove them initially. In 2.1 we might be able to remove them,


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-670486761


   @olchas I'm in favour of 2nd approach. Regarding the deprecation - first I would check if current implementation works. If it does not then I see no reason for deprecation - let us just remove it.
   
   Agree with @turbaszek. I thought about deprecation in the sense that it will not fail when someone actually adds the "delegate_to" parameter. I am afraid there might be cases that someone already added this parameter and we do not want their DAGs to fail (even if this parameter was superfluous). 
   
   I believe we have **kwargs in most such operators but it's worth checking.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] olchas commented on issue #9461: Unclear documentation for the delegate_to parameter

Posted by GitBox <gi...@apache.org>.
olchas commented on issue #9461:
URL: https://github.com/apache/airflow/issues/9461#issuecomment-670384662


   @jaketf, @amithmathew so it would seem that the approach could be to:
   
   1. disallow simultaneous use of `delegate_to` and `impersonation_chain` (new argument used for direct impersonation) in `GoogleBaseHook` by raising an exception if both arguments are provided
   1. remove `delegate_to` from most Google operators and hooks, leaving it only in G Suite operators and hooks
   
   WDYT?
   CC: @turbaszek 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org