Posted to github@arrow.apache.org by "sugibuchi (via GitHub)" <gi...@apache.org> on 2023/04/17 19:25:40 UTC

[GitHub] [arrow-rs] sugibuchi opened a new issue, #4096: ImdsManagedIdentityOAuthProvider should send resource ID instead of OIDC scope

sugibuchi opened a new issue, #4096:
URL: https://github.com/apache/arrow-rs/issues/4096

   **Describe the bug**
   
   The current implementation of `ImdsManagedIdentityOAuthProvider` (for MSI-based authentication in Azure) requests tokens from the IMDS endpoint using the default **OIDC scope** (resource ID + permission) of the Azure storage service (`https://storage.azure.com/.default`) as the value of the `resource` query parameter.
   
   https://github.com/apache/arrow-rs/blob/master/object_store/src/azure/credential.rs#L53
   https://github.com/apache/arrow-rs/blob/master/object_store/src/azure/credential.rs#L418-L428
   ```
   const AZURE_STORAGE_SCOPE: &str = "https://storage.azure.com/.default";   // <-- This is a "scope"
   ...
   impl TokenCredential for ImdsManagedIdentityOAuthProvider {
       /// Fetch a token
       async fn fetch_token(
           &self,
           _client: &Client,
           retry: &RetryConfig,
       ) -> Result<TemporaryToken<String>> {
           let mut query_items = vec![
               ("api-version", MSI_API_VERSION),
               ("resource", AZURE_STORAGE_SCOPE),    // <-- Passes the "scope", including ".default"
           ];
   ```
   
   However, the value of `resource` must be a **resource ID** without `.default`. You can find a C# code example in the following official document.
   
   https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/tutorial-vm-windows-access-storage#access-data
   
   **To Reproduce**
   
   Sorry, I cannot reproduce this problem directly because I have no experience with Rust. We identified it when we tried to write a Delta Lake table using the [Python binding of delta-rs](https://delta-io.github.io/delta-rs/python/api_reference.html#writing-deltatables), which uses the Rust `object_store` crate.
   
   ```
   deltalake.PyDeltaTableError: Failed to load checkpoint: Failed to read checkpoint content: Generic MicrosoftAzure error: Error authorizing request: Error performing token request: response error "adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_resource","error_description":"AADSTS500011: The resource principal named https://storage.azure.com/.default was not found in the tenant named *****. This can happen if the application has not been installed by the administrator of the tenant or consented to by any user in the tenant. You might have sent your authentication request to the wrong tenant.\r\nTrace ID: *****\r\nCorrelation ID: *****\r\nTimestamp: 2023-04-17 16:15:01Z","error_codes":[500011],"timestamp":"2023-04-17 16:15:01Z","trace_id":"*****","correlation_id":"*****","error_uri":"https://westeurope.login.microsoft.com/error?code=500011"}
   Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=*****&resource=https%3A%2F%2Fstorage.azure.com%2F.default
   ", after 0 retries: HTTP status client error (403 Forbidden) for url (http://169.254.169.254/metadata/identity/oauth2/token?api-version=2019-08-01&resource=https%3A%2F%2Fstorage.azure.com%2F.default&client_id=*****)
   ```
   
   We can reproduce the same error by sending requests to the IMDS endpoint with curl.
   
   ```
   # with ".default"
   curl -H "Metadata: true" "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=***&resource=https%3A%2F%2Fstorage.azure.com%2F.default"
   
   adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_resource", ...
   
   # without ".default"
   curl -H "Metadata: true" "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=***&resource=https%3A%2F%2Fstorage.azure.com%2F"
   
   {"access_token":"...
   ```
   
   **Expected behavior**
   `ImdsManagedIdentityOAuthProvider` sends its request to `http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=*****&resource=https%3A%2F%2Fstorage.azure.com%2F`, without `.default` in the `resource` query parameter.
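
The expected request can be sketched in Rust using only the standard library. This is an illustration under stated assumptions, not the `object_store` implementation: `AZURE_STORAGE_RESOURCE`, `percent_encode` and `imds_token_url` are hypothetical names, the API version is the one from the curl examples above, and the order of the query parameters differs from the curl examples (which should not matter to the endpoint).

```rust
/// IMDS token endpoint, as it appears in the requests quoted above.
const IMDS_TOKEN_ENDPOINT: &str = "http://169.254.169.254/metadata/identity/oauth2/token";
/// API version taken from the curl examples above.
const MSI_API_VERSION: &str = "2018-02-01";
/// Hypothetical constant: the storage resource ID, i.e. the scope without `.default`.
const AZURE_STORAGE_RESOURCE: &str = "https://storage.azure.com/";

/// Percent-encode every byte except the RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Hypothetical helper: build the IMDS token URL for a resource ID
/// and an optional user-assigned identity client ID.
fn imds_token_url(resource: &str, client_id: Option<&str>) -> String {
    let mut query_items = vec![("api-version", MSI_API_VERSION), ("resource", resource)];
    if let Some(id) = client_id {
        query_items.push(("client_id", id));
    }
    let query: Vec<String> = query_items
        .iter()
        .map(|(key, value)| format!("{}={}", key, percent_encode(value)))
        .collect();
    format!("{}?{}", IMDS_TOKEN_ENDPOINT, query.join("&"))
}
```

Calling `imds_token_url(AZURE_STORAGE_RESOURCE, None)` yields a URL whose `resource` parameter is `https%3A%2F%2Fstorage.azure.com%2F`, matching the curl request that succeeded.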


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] saryeHaddadi commented on issue #4096: ImdsManagedIdentityOAuthProvider should send resource ID instead of OIDC scope

Posted by "saryeHaddadi (via GitHub)" <gi...@apache.org>.
saryeHaddadi commented on issue #4096:
URL: https://github.com/apache/arrow-rs/issues/4096#issuecomment-1518339681

   I'd like to share my research in case it helps.
   My conclusion is that the HTTP call below needs to pass a "resource" instead of a "scope".
   This can be done by extracting the resource from the scope (see the `_scopes_to_resource()` function in the Azure Python SDK, linked in an earlier comment on this issue).
   - https://github.com/apache/arrow-rs/blob/master/object_store/src/azure/credential.rs#L427
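
A minimal Rust sketch of that extraction, assuming the only transformation needed is dropping a trailing `.default` that follows a slash. `scope_to_resource` is a hypothetical helper, not an existing `object_store` function, and keeping the trailing slash is a choice made here to match the resource value used in the successful curl request.

```rust
/// Hypothetical helper: derive the `resource` value from a `.default` scope.
///
/// `https://storage.azure.com/.default` becomes `https://storage.azure.com/`.
/// Anything that does not end in `/.default` is returned unchanged.
fn scope_to_resource(scope: &str) -> &str {
    match scope.strip_suffix(".default") {
        // Only treat `.default` as a scope suffix when it is its own path segment.
        Some(prefix) if prefix.ends_with('/') => prefix,
        _ => scope,
    }
}
```

With this helper, the value sent as `resource` would no longer contain `.default`, while a plain resource ID or a named permission scope passes through untouched.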
   
   How I reached that conclusion.
   
   ### First, what is the definition of a scope
   > In OAuth 2.0, scopes and permissions are used interchangeably to define the level of access that a client has to a protected resource. Scopes are used to specify the level of access that a client has to a protected resource, but they do not provide the granularity necessary to define what the client can do with that resource. Permissions are represented as string values and are used by an app to request the permissions it needs by specifying them in the scope query parameter
   >
   > In other words, scopes are per client app while permissions are per user. For example, one client app can have a scope(s) to access certain API(s), but the users of this client app will have different permissions in this API based on their roles.
   
   Scope examples:
   - `User.Read.All`, `Directory.ReadWrite.All`
   - There are some "well-known" scopes like `email`, `profile`
   - And (in Azure at least) `.default`
   
   Sources:
   [1. learn.microsoft.com](https://learn.microsoft.com/EN-US/azure/active-directory/develop/scopes-oidc)
   [2. permit.io](https://www.permit.io/blog/oauth2-scopes-for-authz)
   [3. stackoverflow.com](https://stackoverflow.com/questions/48351332/oauth-scopes-and-application-roles-permissions)
   [4. stackoverflow.com](https://stackoverflow.com/questions/60942114/oauth-2-0-jwt-guidance-about-when-to-use-scope-vs-roles)
   
   ### What is the `.default` scope
   
   → See Microsoft [documentation](https://learn.microsoft.com/en-us/azure/active-directory/develop/scopes-oidc#the-default-scope).
   
   In particular, it states how to construct the `.default` scope from a `resource`.
   > The scope parameter value is constructed by using the identifier URI for the resource and `.default`, separated by a forward slash (`/`). For example, if the resource's identifier URI is `https://contoso.com`, the scope to request is `https://contoso.com/.default`.
   
   So I understand that scopes and resources are two different kinds of objects. Now I'd like to confirm how to construct a `scope` from a `resource identifier`. [link to doc](https://learn.microsoft.com/EN-US/azure/active-directory/develop/consent-types-developer#requesting-individual-user-consent)
   
   > The scope parameter is a space-separated list of delegated permissions that the application is requesting. Each permission is indicated by appending the permission value to the resource's identifier (the application ID URI).
   
   [Example 1](https://learn.microsoft.com/en-us/azure/active-directory/develop/msal-v1-app-scopes#scopes-to-request-access-to-specific-oauth2-permissions-of-a-v10-application)
   > var scopes = new [] {  ResourceId+"/user_impersonation"};
   
   [Example 2](https://learn.microsoft.com/EN-US/azure/active-directory/develop/consent-types-developer#requesting-individual-user-consent)
   > GET https://login.microsoftonline.com/common/oauth2/v2.0/authorize?
   client_id=6731de76-14a6-49ae-97bc-6eba6914391e
   &response_type=code
   &redirect_uri=http%3A%2F%2Flocalhost%2Fmyapp%2F
   &response_mode=query
   &**scope=
   https%3A%2F%2Fgraph.microsoft.com%2Fcalendars.read%20         <---- '%20' : space separated-list
   https%3A%2F%2Fgraph.microsoft.com%2Fmail.send**
   &state=12345
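
The construction described in the quotes above can be sketched in Rust as follows. `scope_for` and `scope_parameter` are hypothetical helpers. Trimming a trailing slash from the resource identifier before appending the permission is an assumption made so that `https://storage.azure.com/` maps to the `https://storage.azure.com/.default` scope seen in this issue. Percent-encoding of the result is left to whatever HTTP client sends the request.

```rust
/// Hypothetical helper: one scope is the resource identifier and a
/// permission value, separated by a forward slash.
fn scope_for(resource_id: &str, permission: &str) -> String {
    // Assumption: avoid a double slash when the identifier already ends in '/'.
    format!("{}/{}", resource_id.trim_end_matches('/'), permission)
}

/// Hypothetical helper: the `scope` parameter is a space-separated list of scopes.
fn scope_parameter(resource_id: &str, permissions: &[&str]) -> String {
    permissions
        .iter()
        .map(|permission| scope_for(resource_id, permission))
        .collect::<Vec<_>>()
        .join(" ")
}
```

For instance, `scope_for("https://contoso.com", ".default")` gives `https://contoso.com/.default`, and `scope_parameter("https://graph.microsoft.com", &["calendars.read", "mail.send"])` gives the two scopes of Example 2 separated by a space, before URL encoding.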
   
   ### When using a Managed-Identity, how to get an Access Token
   
   An app is deployed on a VM, and an identity (= a service principal) is assigned to the VM. The app doesn't need its own service principal; it can ask to be authorized with the same rights as the VM. To do that, the app needs to query the Azure Instance Metadata Service (IMDS).
   > IMDS is a REST API that's available at a well-known, non-routable IP address (169.254.169.254). You can only access it from within the VM.
   
   IMDS exposes a number of APIs, among them the `/identity/oauth2/token` API. As per the documentation, this API accepts a `resource` parameter but does not accept a `scope` parameter.
   See [Swagger spec](https://github.com/Azure/azure-rest-api-specs/tree/main/specification/imds/data-plane/Microsoft.InstanceMetadataService/stable/2019-08-01) for IMDS API, version 2019-08-01.
   - imds.json L109-114: "This is the urlencoded identifier URI of the sink resource for the requested Azure AD token." => Not a scope.
   - examples/GetIdentityToken.json: The `resource` parameter is given a resource, not a scope.
   
   
   Related Readings
   1. [learn.microsoft.com - Instance Metadata Service](https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=windows)
   2. [learn.microsoft.com - Get a Token using HTTP](https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-to-use-vm-token#get-a-token-using-http)
   
   Also, from the source code, [this call](https://github.com/apache/arrow-rs/blob/master/object_store/src/azure/credential.rs#L332-L346) passes a scope value to a `scope` argument.
   ![image](https://user-images.githubusercontent.com/51054901/233732274-8f6a58c6-8258-4818-b2a1-644ed18433b8.png)
   
   In contrast, [that call](https://github.com/apache/arrow-rs/blob/master/object_store/src/azure/credential.rs#L425-L428) passes a scope value to a `resource` argument, and this is the issue.
   ![image](https://user-images.githubusercontent.com/51054901/233732592-4d842540-e92f-43d2-ad7f-49d57680573e.png)
   
   
   




[GitHub] [arrow-rs] sugibuchi commented on issue #4096: ImdsManagedIdentityOAuthProvider should send resource ID instead of OIDC scope

Posted by "sugibuchi (via GitHub)" <gi...@apache.org>.
sugibuchi commented on issue #4096:
URL: https://github.com/apache/arrow-rs/issues/4096#issuecomment-1513623719

   @tustvold 
   It might work with `.default` in some environments. We are using [AAD Pod Identity](https://learn.microsoft.com/en-us/azure/aks/use-azure-ad-pod-identity) in AKS, which emulates IMDS in a Kubernetes cluster. This is probably why we are seeing different results.
   
   But the documentation clearly says that the value of `resource` should be the "App ID URI of the target **resource**", not a scope.
   
   https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-to-use-vm-token#get-a-token-using-http
   
   The Managed Identity credential class in the Azure Java SDK accepts a resource ID as a configuration parameter.
   
   https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/identity/azure-identity/src/main/java/com/azure/identity/ManagedIdentityCredentialBuilder.java#L83
   
   And the equivalent class in the Azure Python SDK explicitly drops `.default` from the query parameter value.
   
   https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/identity/azure-identity/azure/identity/_internal/managed_identity_client.py#L112
   https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/identity/azure-identity/azure/identity/_internal/__init__.py#L19-L29




[GitHub] [arrow-rs] tustvold closed issue #4096: ImdsManagedIdentityOAuthProvider should send resource ID instead of OIDC scope

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold closed issue #4096: ImdsManagedIdentityOAuthProvider should send resource ID instead of OIDC scope
URL: https://github.com/apache/arrow-rs/issues/4096




[GitHub] [arrow-rs] tustvold commented on issue #4096: ImdsManagedIdentityOAuthProvider should send resource ID instead of OIDC scope

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #4096:
URL: https://github.com/apache/arrow-rs/issues/4096#issuecomment-1552715788

   `label_issue.py` automatically added labels {'object-store'} from #4193




[GitHub] [arrow-rs] tustvold commented on issue #4096: ImdsManagedIdentityOAuthProvider should send resource ID instead of OIDC scope

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #4096:
URL: https://github.com/apache/arrow-rs/issues/4096#issuecomment-1518459873

   Thank you for your investigation, this makes sense to me and I would be happy to review a fix.




[GitHub] [arrow-rs] tustvold commented on issue #4096: ImdsManagedIdentityOAuthProvider should send resource ID instead of OIDC scope

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #4096:
URL: https://github.com/apache/arrow-rs/issues/4096#issuecomment-1513502703

   Is it possible that the error is the result of a permissions restriction in your environment? The use of `.default` works in my test environment.

