Posted to common-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/02/07 15:10:00 UTC

[jira] [Work logged] (HADOOP-14661) S3A to support Requester Pays Buckets

     [ https://issues.apache.org/jira/browse/HADOOP-14661?focusedWorklogId=721971&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-721971 ]

ASF GitHub Bot logged work on HADOOP-14661:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Feb/22 15:09
            Start Date: 07/Feb/22 15:09
    Worklog Time Spent: 10m 
      Work Description: dannycjones opened a new pull request #3962:
URL: https://github.com/apache/hadoop/pull/3962


   ### Description of PR
   
   This change introduces support for "requester pays" S3 buckets (HADOOP-14661).
   
   This feature has already been discussed and patches were made available, but the code base has changed since then.
   
   One concern is that unused options have been added to both the [`S3ClientCreationParameters`](https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ClientFactory.java#L103-L106) (in March 2021) and the SDK [`RequestFactory`](https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/RequestFactoryImpl.java#L109-L114) (in May 2021, as part of the auditor changes). I've adopted the former approach, which lets me add the relevant header to all S3 requests in a fairly obvious place. Please let me know whether one implementation method is preferred or whether both are needed.
   
   In summary:
   
   - Add some information to the main S3A documentation
   - Add an example of the error, and how to fix it through configuration, to the troubleshooting doc
   - Add an option to configure requester pays in S3A (`fs.s3a.requester-pays.enabled`). It is read when the S3A filesystem is created and passed on to the S3ClientFactory that creates the new S3 client
   - Add the required header inside S3ClientFactory if the option is set
   - Add a new integration test that talks to the publicly available `usgs-landsat` bucket (the new version, not the old one), which is configured to require requester pays (['usgs-landsat' on Registry of Open Data on AWS](https://registry.opendata.aws/usgs-landsat/index.html)).
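   
   As a rough illustration of the option-to-header flow summarized above (not the code in this PR), the sketch below uses a plain `Map` in place of the Hadoop configuration and of the S3 client's default headers. The class name `RequesterPaysSketch`, the method `defaultHeaders`, and the default of `false` are hypothetical. The header name `x-amz-request-payer` and value `requester` are not stated in this description and are taken from the AWS requester pays documentation.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical sketch, not the S3A implementation: a plain Map stands in
   // for the Hadoop configuration and for the S3 client's default headers.
   final class RequesterPaysSketch {
       // Option name taken from this PR.
       static final String REQUESTER_PAYS_ENABLED = "fs.s3a.requester-pays.enabled";
       // Header name and value as documented by AWS for requester pays buckets.
       static final String REQUEST_PAYER_HEADER = "x-amz-request-payer";
       static final String REQUEST_PAYER_VALUE = "requester";

       // Returns the headers to attach to every S3 request for the given settings.
       static Map<String, String> defaultHeaders(Map<String, String> conf) {
           Map<String, String> headers = new HashMap<>();
           // Assumption: the option is treated as off unless it is set to "true".
           if (Boolean.parseBoolean(conf.getOrDefault(REQUESTER_PAYS_ENABLED, "false"))) {
               headers.put(REQUEST_PAYER_HEADER, REQUEST_PAYER_VALUE);
           }
           return headers;
       }

       public static void main(String[] args) {
           Map<String, String> conf = new HashMap<>();
           conf.put(REQUESTER_PAYS_ENABLED, "true");
           // Prints the header map that would accompany each request.
           System.out.println(defaultHeaders(conf));
       }
   }
   ```
   
   Deciding the header once, where the client is built, is what keeps it on every request without touching each request type individually.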
   
   ### How was this patch tested?
   
   I created an EC2 instance and an S3 Bucket in eu-west-1 and ran the tests against the global S3 endpoint.
   
   ```
   mvn -Dparallel-tests -DtestsThreadCount=16 clean verify
   ```
   
   Note that the requester-pays-specific tests introduced here do not use the developer-supplied bucket.
   
   ### For code changes:
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [x] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [x] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 721971)
    Remaining Estimate: 1h 50m  (was: 2h)
            Time Spent: 10m

> S3A to support Requester Pays Buckets
> -------------------------------------
>
>                 Key: HADOOP-14661
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14661
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: common, util
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Mandus Momberg
>            Assignee: Mandus Momberg
>            Priority: Minor
>         Attachments: HADOOP-14661.patch
>
>   Original Estimate: 2h
>          Time Spent: 10m
>  Remaining Estimate: 1h 50m
>
> Amazon S3 has the ability to charge the requester for the cost of accessing S3. This is called Requester Pays Buckets. 
> In order to access these buckets, each request needs to be signed with a specific header. 
> http://docs.aws.amazon.com/AmazonS3/latest/dev/RequesterPaysBuckets.html



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
