Posted to common-dev@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2020/01/21 18:42:00 UTC

[jira] [Created] (HADOOP-16823) Manage S3 Throttling exclusively in S3A client

Steve Loughran created HADOOP-16823:
---------------------------------------

             Summary: Manage S3 Throttling exclusively in S3A client
                 Key: HADOOP-16823
                 URL: https://issues.apache.org/jira/browse/HADOOP-16823
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.2.1
            Reporter: Steve Loughran
            Assignee: Steve Loughran


Currently, AWS S3 throttling is handled first in the AWS SDK, and only reaches the S3A client code after the SDK has given up retrying.

This means we don't always directly observe when throttling is taking place.

Proposed:

* disable throttling retries in the AWS client library
* add a quantile for the S3 throttle events, as DDB (DynamoDB) already has
* keep separate counters for S3 and DDB throttle events, so issues can be classified better

Because we are taking over the retries from the AWS SDK, we will need to increase both the initial delay between retries and the number of retries we support before giving up.
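
To make the idea concrete, here is a minimal sketch of what client-side throttle handling could look like: an exponential backoff loop that counts every throttle event it sees. It is not the S3A implementation. ThrottleRetrySketch, ThrottledException, S3_THROTTLE_EVENTS and all parameter names are hypothetical, and the doubling-with-cap policy is an assumption made for illustration only.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of client-side throttle retries; not S3A code. */
final class ThrottleRetrySketch {

  /** Hypothetical stand-in for a throttling response from the store. */
  static final class ThrottledException extends Exception {
    ThrottledException(String message) {
      super(message);
    }
  }

  /** Counts S3 throttle events only; a DDB counter would be kept separately. */
  static final AtomicLong S3_THROTTLE_EVENTS = new AtomicLong();

  /**
   * Delay before retry number "attempt" (0-based): the initial delay
   * doubled on each attempt, capped at maxMillis. The shift is bounded
   * to keep the arithmetic away from overflow for sensible inputs.
   */
  static long backoffMillis(long initialMillis, long maxMillis, int attempt) {
    long delay = initialMillis << Math.min(attempt, 20);
    return Math.min(delay, maxMillis);
  }

  /**
   * Invoke the operation, retrying only when it is throttled.
   * Every throttle event is counted, including the final one that
   * exhausts the retry budget and is rethrown to the caller.
   */
  static <T> T retryOnThrottle(Callable<T> operation, int maxRetries,
      long initialMillis, long maxMillis) throws Exception {
    for (int attempt = 0; ; attempt++) {
      try {
        return operation.call();
      } catch (ThrottledException e) {
        S3_THROTTLE_EVENTS.incrementAndGet();
        if (attempt >= maxRetries) {
          throw e;
        }
        Thread.sleep(backoffMillis(initialMillis, maxMillis, attempt));
      }
    }
  }
}
```

Because the counter is incremented inside the loop, every throttle event becomes visible to the client, including those that are retried successfully. That visibility is what is lost while the SDK retries internally.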

Also: should we log throttling events? It could be useful, but there is a risk of overloading the logs, especially if many threads in the same process are triggering the problem.

Proposed: log at debug.


