Posted to github@arrow.apache.org by "Samrose-Ahmed (via GitHub)" <gi...@apache.org> on 2023/03/03 09:35:26 UTC

[GitHub] [arrow-rs] Samrose-Ahmed opened a new issue, #3797: Support GCP Workload Identity Federation

Samrose-Ahmed opened a new issue, #3797:
URL: https://github.com/apache/arrow-rs/issues/3797

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   I am accessing GCP resources from AWS using GCP Workload Identity Federation.
   
   **Describe the solution you'd like**
   
   Be able to access GCP resources from AWS using GCP Workload Identity Federation through object_store.
   
   https://cloud.google.com/docs/authentication/provide-credentials-adc#wlif
   
   **Describe alternatives you've considered**
   
   Is there a way to export my workload identity credentials to a form object_store can understand, similar to AWS STS GetSessionToken? (My knowledge of GCP is more limited.)
   
   **Additional context**
   
   - Currently errors with `GCP credential error: A configuration file was passed in but was not used` at https://github.com/apache/arrow-rs/blob/master/object_store/src/gcp/credential.rs#L431
   - There are types of Application Default Credentials files other than the one at https://github.com/apache/arrow-rs/blob/master/object_store/src/gcp/credential.rs#L405-L411; see https://cloud.google.com/docs/authentication/provide-credentials-adc#wlif
   The one for workload identity federation looks like:
   
   ```json
   {
       "audience": "//iam.googleapis.com/projects/111111534588/locations/global/workloadIdentityPools/abc",
       "credential_source": {
         "environment_id": "id123",
         "regional_cred_verification_url": "https://sts.{region}.amazonaws.com?Action=GetCallerIdentity&Version=2011-06-15"
       },
       "service_account_impersonation": {
           "token_lifetime_seconds": 3600
       },
       "service_account_impersonation_url": "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/acct@acb123.iam.gserviceaccount.com:generateAccessToken",
       "subject_token_type": "urn:ietf:params:aws:token-type:aws4_request",
       "token_url": "https://sts.googleapis.com/v1/token",
       "type": "external_account"
   }
   ```
   - I believe this library doesn't use any GCP client (if one exists). The process for exchanging credentials over the REST API is documented here: https://cloud.google.com/iam/docs/workload-identity-federation-with-other-clouds#generate-automatic
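
As a rough illustration of the exchange step mentioned above, the sketch below builds the form-encoded body that could be posted to the `token_url` from the configuration file. The parameter names follow the OAuth 2.0 Token Exchange convention (RFC 8693); `form_encode` and `token_exchange_body` are hypothetical helpers rather than object_store APIs, and the subject token and scope values are placeholder assumptions. Producing the real AWS subject token (a serialized, signed GetCallerIdentity request) is not shown.

```rust
/// Percent-encode a value for an `application/x-www-form-urlencoded` body.
/// RFC 3986 unreserved characters pass through; every other byte becomes %XX.
fn form_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char);
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build the token-exchange request body (hypothetical helper).
/// `audience` and `subject_token_type` come from the credentials file;
/// `subject_token` and `scope` are supplied by the caller.
fn token_exchange_body(
    audience: &str,
    subject_token_type: &str,
    subject_token: &str,
    scope: &str,
) -> String {
    let params = [
        ("grant_type", "urn:ietf:params:oauth:grant-type:token-exchange"),
        ("audience", audience),
        ("scope", scope),
        ("requested_token_type", "urn:ietf:params:oauth:token-type:access_token"),
        ("subject_token_type", subject_token_type),
        ("subject_token", subject_token),
    ];
    params
        .iter()
        .map(|(key, value)| format!("{}={}", key, form_encode(value)))
        .collect::<Vec<_>>()
        .join("&")
}

fn main() {
    let body = token_exchange_body(
        "//iam.googleapis.com/projects/111111534588/locations/global/workloadIdentityPools/abc",
        "urn:ietf:params:aws:token-type:aws4_request",
        // Placeholder: the real value is a signed AWS request, not shown here.
        "<serialized-signed-GetCallerIdentity-request>",
        // Assumed scope for illustration.
        "https://www.googleapis.com/auth/cloud-platform",
    );
    println!("POST https://sts.googleapis.com/v1/token\n{}", body);
}
```

The token returned by this exchange would then presumably be sent to the `service_account_impersonation_url` to obtain the final access token, which is also left out of the sketch.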


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Support GCP Workload Identity Federation [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #3797:
URL: https://github.com/apache/arrow-rs/issues/3797#issuecomment-1808016404

   No, this issue covers a different kind of credential federation, for workloads running outside of GCP. That error is coming from the GCP metadata server and might indicate some sort of misconfiguration on your part.




[GitHub] [arrow-rs] Samrose-Ahmed commented on issue #3797: Support GCP Workload Identity Federation

Posted by "Samrose-Ahmed (via GitHub)" <gi...@apache.org>.
Samrose-Ahmed commented on issue #3797:
URL: https://github.com/apache/arrow-rs/issues/3797#issuecomment-1453259013

   I need this, so I'm happy to contribute it if there's no way around it.




[GitHub] [arrow-rs] tustvold commented on issue #3797: Support GCP Workload Identity Federation

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #3797:
URL: https://github.com/apache/arrow-rs/issues/3797#issuecomment-1453388538

   I don't believe there currently is support for this, but I would be happy to review a PR that added support for it. :+1:
   
   FWIW @winding-lines filed https://github.com/apache/arrow-rs/pull/3532 which used an external gcp_auth crate. Typically we have tried to keep the dependency tree down, and so went with https://github.com/apache/arrow-rs/pull/3541 instead, but looking into the gcp_auth crate it doesn't appear to support the `external_account` credential source either...
   
   https://google.aip.dev/auth/4110 appears to be the authoritative docs on ApplicationDefaultCredentials, with https://google.aip.dev/auth/4117 documenting the external_account flow. This appears to have special case logic for the different sources, e.g. AWS, Azure. Ideally this would reuse the existing auth logic we have for those systems... 
   
   Alternatively if you can find a well-supported upstream crate that supports this, I wouldn't object to an optional dependency on it.
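
To make the dispatch described above concrete, here is a minimal sketch of how a credentials file might be routed by its `type` field, and, for `external_account`, by its `subject_token_type`. The enum and function names are hypothetical and are not part of object_store. The `authorized_user` and `service_account` values are the other commonly documented ADC file types, and only the AWS subject token type shown in the issue is special-cased. A real implementation would deserialize the JSON file first; that step is omitted here.

```rust
/// Where the subject token of an `external_account` credential comes from.
#[derive(Debug, PartialEq)]
enum SubjectSource {
    Aws,
    /// Any other subject token type (for example an OIDC or SAML token).
    Other(String),
}

/// Hypothetical classification of an Application Default Credentials file.
#[derive(Debug, PartialEq)]
enum CredentialKind {
    AuthorizedUser,
    ServiceAccount,
    ExternalAccount(SubjectSource),
    Unsupported(String),
}

/// Route on the `type` field, then on `subject_token_type` if present.
fn classify(type_field: &str, subject_token_type: Option<&str>) -> CredentialKind {
    match type_field {
        "authorized_user" => CredentialKind::AuthorizedUser,
        "service_account" => CredentialKind::ServiceAccount,
        "external_account" => match subject_token_type {
            Some("urn:ietf:params:aws:token-type:aws4_request") => {
                CredentialKind::ExternalAccount(SubjectSource::Aws)
            }
            Some(other) => CredentialKind::ExternalAccount(SubjectSource::Other(other.to_string())),
            None => CredentialKind::Unsupported(
                "external_account without subject_token_type".to_string(),
            ),
        },
        other => CredentialKind::Unsupported(other.to_string()),
    }
}

fn main() {
    let kind = classify(
        "external_account",
        Some("urn:ietf:params:aws:token-type:aws4_request"),
    );
    println!("{:?}", kind);
}
```

Keeping the source as its own enum leaves room for the AWS and Azure branches to call into whatever auth logic already exists for those providers.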




Re: [I] Support GCP Workload Identity Federation [arrow-rs]

Posted by "gianarb (via GitHub)" <gi...@apache.org>.
gianarb commented on issue #3797:
URL: https://github.com/apache/arrow-rs/issues/3797#issuecomment-1808297397

   Yeah, unfortunately I have no idea which one! If I can't figure it out,
   I will open my own issue.
   
   Thanks
   




Re: [I] Support GCP Workload Identity Federation [arrow-rs]

Posted by "gianarb (via GitHub)" <gi...@apache.org>.
gianarb commented on issue #3797:
URL: https://github.com/apache/arrow-rs/issues/3797#issuecomment-1808010047

   Hello! I am writing here to double-check whether the issue I am working on is similar to this one, or whether I am just doing something wrong given my lack of knowledge about GCP.
   
   I enabled GCP support in my application, which uses datafusion (previously I was using AWS and local storage). Everything works fine locally when I use the `APPLICATION_CREDENTIALS` environment variable. In production, however, my workload runs on GCP autopilot, so my plan was to use the suggested workload identity to provide access to GCP Object Storage. My expectation is that token acquisition should work without any configuration (from a datafusion point of view).
   
   https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to
   
   But it fails:
   
   ```
   Error performing token request: response error \"Unable to generate access token; IAM returned 400 Bad Request: Invalid form of account ID serviceAccount:<>.iam.gserviceaccount.com. Should be [Gaia ID |Email |Unique ID |] of the account
   ```
   
   So I am wondering if I don't know how to properly configure the object store builder or if it is an unsupported authentication method.
   
   Thanks

