Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2022/06/10 14:18:00 UTC
[jira] [Updated] (HADOOP-18285) S3a should retry when being throttled by STS (assumed roles)
[ https://issues.apache.org/jira/browse/HADOOP-18285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-18285:
------------------------------------
Component/s: fs/s3
> S3a should retry when being throttled by STS (assumed roles)
> ------------------------------------------------------------
>
> Key: HADOOP-18285
> URL: https://issues.apache.org/jira/browse/HADOOP-18285
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Affects Versions: 3.3.3
> Reporter: André Kelpe
> Priority: Major
>
> We ran into an issue where AWS throttled us while we were reading from a bucket using the STS assume-role mechanism.
>
> The stack trace looks like this:
>
> {code:java}
> Caused by: com.amazonaws.services.securitytoken.model.AWSSecurityTokenServiceException: Rate exceeded (Service: AWSSecurityTokenService; Status Code: 400; Error Code: Throttling; Request ID: 02f32511-418c-4b2a-96ef-2d7ba8dafab1; Proxy: null) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1862) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1415) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713) 1654700598727
> at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695) 1654700598727
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559) 1654700598727
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539) 1654700598727
> at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.doInvoke(AWSSecurityTokenServiceClient.java:1682) 1654700598727
> at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.invoke(AWSSecurityTokenServiceClient.java:1649) 1654700598727
> at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.invoke(AWSSecurityTokenServiceClient.java:1638) 1654700598727
> at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.executeAssumeRole(AWSSecurityTokenServiceClient.java:498) 1654700598727
> at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.assumeRole(AWSSecurityTokenServiceClient.java:467) 1654700598727
> at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.newSession(STSAssumeRoleSessionCredentialsProvider.java:348) 1654700598727
> at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.access$000(STSAssumeRoleSessionCredentialsProvider.java:44) 1654700598727
> at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider$1.call(STSAssumeRoleSessionCredentialsProvider.java:93) 1654700598727
> at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider$1.call(STSAssumeRoleSessionCredentialsProvider.java:90) 1654700598727
> at com.amazonaws.auth.RefreshableTask.refreshValue(RefreshableTask.java:295) 1654700598727
> at com.amazonaws.auth.RefreshableTask.blockingRefresh(RefreshableTask.java:251) 1654700598727
> at com.amazonaws.auth.RefreshableTask.getValue(RefreshableTask.java:192) 1654700598727
> at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.getCredentials(STSAssumeRoleSessionCredentialsProvider.java:320) 1654700598727{code}
> From reading the code, the exception appears to be handled by S3AUtils here: [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AUtils.java#L240]
> It does not inspect the message further and assumes that the 400 is indeed a bad request. As a result, the error is translated into an AWSBadRequestException, which S3ARetryPolicy treats as a failure instead of retrying:
> [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ARetryPolicy.java#L215-L217]
>
> A better approach seems to be to inspect the subtype and message of the original exception and, when they indicate throttling, throw an exception other than AWSBadRequestException so that the retry policy backs off and retries.
>
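The proposal above could be sketched roughly as follows. This is a minimal illustration and not the actual S3AUtils or S3ARetryPolicy code: ServiceError is a hypothetical stand-in for the AWS SDK service exception so that the example compiles without the SDK, and StsThrottleSketch, Outcome, classify and backoffMillis are names invented for this sketch. The only values taken from the report are the status code (400) and the error code ("Throttling") shown in the stack trace.

```java
// Hypothetical stand-in for the AWS SDK service exception, carrying only the
// two fields this sketch needs. Not a real SDK or Hadoop class.
final class ServiceError extends RuntimeException {
    final int statusCode;
    final String errorCode;

    ServiceError(String message, int statusCode, String errorCode) {
        super(message);
        this.statusCode = statusCode;
        this.errorCode = errorCode;
    }
}

// Illustrative classifier: checks the error code before falling back to the
// status code, so a throttled STS call is not mistaken for a bad request.
final class StsThrottleSketch {
    enum Outcome { RETRY_WITH_BACKOFF, FAIL_FAST, OTHER }

    static Outcome classify(ServiceError e) {
        // "Throttling" is the error code reported in the stack trace above.
        if ("Throttling".equals(e.errorCode)) {
            return Outcome.RETRY_WITH_BACKOFF;
        }
        // Any other 400 is still treated as a genuine bad request.
        if (e.statusCode == 400) {
            return Outcome.FAIL_FAST;
        }
        return Outcome.OTHER;
    }

    // Capped exponential backoff: base, 2*base, 4*base, ... up to capMillis.
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, capMillis);
    }

    public static void main(String[] args) {
        ServiceError throttled = new ServiceError("Rate exceeded", 400, "Throttling");
        System.out.println(classify(throttled));
        System.out.println(backoffMillis(3, 100L, 5000L));
    }
}
```

In a real change, the equivalent check would presumably sit where the 400 status is currently mapped to AWSBadRequestException, and the throttled case would be mapped to an exception type that the retry policy already retries with backoff. Which exception type that should be is left open here.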