Posted to common-issues@hadoop.apache.org by "Daniel Carl Jones (Jira)" <ji...@apache.org> on 2022/02/18 10:05:00 UTC

[jira] [Commented] (HADOOP-18095) s3a connector to fully support AWS partitions,

    [ https://issues.apache.org/jira/browse/HADOOP-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494504#comment-17494504 ] 

Daniel Carl Jones commented on HADOOP-18095:
--------------------------------------------

I've been able to try the integration tests in China. I took commit 19d90e62fb2 and ran the integration tests against a cn-north-1 bucket, using aws-cn credentials, from a cn-north-1a EC2 instance.

I used the following auth-keys.xml (credentials redacted):
{code:xml}
<configuration>

<property>
<name>test.fs.s3a.name</name>
<value>s3a://<my-aws-cn-bucket>/</value>
</property>

<property>
<name>fs.contract.test.fs.s3a</name>
<value>${test.fs.s3a.name}</value>
</property>

<property>
<name>fs.s3a.endpoint</name>
<value>s3.cn-north-1.amazonaws.com.cn</value>
</property>

<property>
<name>fs.s3a.scale.test.enabled</name>
<value>false</value>
</property>

<property>
<name>fs.s3a.access.key</name>
<value>my-access-key-id</value>
</property>

<property>
<name>fs.s3a.secret.key</name>
<value>my-secret-key</value>
</property>

<property>
<name>fs.s3a.scale.test.csvfile</name>
<value> </value>
<description>Skip scale test if blank (based on landsat bucket)</description>
</property>

<property>
<name>fs.s3a.assumed.role.sts.endpoint</name>
<value>sts.cn-north-1.amazonaws.com.cn</value>
</property>

<property>
<name>fs.s3a.assumed.role.sts.endpoint.region</name>
<value>cn-north-1</value>
</property>

<property>
<name>fs.s3a.server-side-encryption-algorithm</name>
<value></value>
</property>

<property>
<name>fs.s3a.bucket.<my-aws-cn-bucket>.server-side-encryption.key</name>
<value>arn:aws-cn:kms:cn-north-1:<my-aws-cn-acct>:key/0e93438c-f680-407c-8fc9-d716206e428e</value>
</property>

</configuration>{code}
Here are the results.
{noformat}
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestS3ABucketExistence.testAccessPointProbingV2:171->expectUnknownStore:103 Expected a org.apache.hadoop.fs.s3a.UnknownStoreException to be thrown, but got the result: : S3AFileSystem{uri=s3a://random-bucket-5256b729-af4e-4c80-83f3-c7eb4ff623ea, workingDir=s3a://random-bucket-5256b729-af4e-4c80-83f3-c7eb4ff623ea/user/ubuntu, inputPolicy=normal, partSize=67108864, enableMultiObjectsDelete=true, maxKeys=5000, readAhead=65536, blockSize=33554432, multiPartThreshold=134217728, s3EncryptionAlgorithm='NONE', blockFactory=org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory@54ef761e, auditManager=Service ActiveAuditManagerS3A in state ActiveAuditManagerS3A: STARTED, auditor=LoggingAuditor{ID='f8cb95e7-108b-4a5c-aea3-08255651ee73', headerEnabled=true, rejectOutOfSpan=true}}, authoritativePath=[], useListV1=false, magicCommitter=true, boundedExecutor=BlockingThreadPoolExecutorService{SemaphoredDelegatingExecutor{permitCount=160, available=160, waiting=0}, activeCount=0}, unboundedExecutor=java.util.concurrent.ThreadPoolExecutor@5077ec21[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0], credentials=AWSCredentialProviderList[refcount= 1: [TemporaryAWSCredentialsProvider, SimpleAWSCredentialsProvider, EnvironmentVariableCredentialsProvider, org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider@7c1ff3d4] last provider: SimpleAWSCredentialsProvider, delegation tokens=disabled, DirectoryMarkerRetention{policy='delete'}, instrumentation {S3AInstrumentation{}}, ClientSideEncryption=false, arnForBucket=arn:aws:s3:eu-west-1:123456789012:accesspoint/random-bucket-5256b729-af4e-4c80-83f3-c7eb4ff623ea}
[ERROR] ITestS3ABucketExistence.testAccessPointRequired:188->expectUnknownStore:103 Expected a org.apache.hadoop.fs.s3a.UnknownStoreException to be thrown, but got the result: : S3AFileSystem{uri=s3a://random-bucket-a74d1eee-ea3e-4722-9214-b1a900710a68, workingDir=s3a://random-bucket-a74d1eee-ea3e-4722-9214-b1a900710a68/user/ubuntu, inputPolicy=normal, partSize=67108864, enableMultiObjectsDelete=true, maxKeys=5000, readAhead=65536, blockSize=33554432, multiPartThreshold=134217728, s3EncryptionAlgorithm='NONE', blockFactory=org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory@43c49197, auditManager=Service ActiveAuditManagerS3A in state ActiveAuditManagerS3A: STARTED, auditor=LoggingAuditor{ID='470b4ed0-07aa-4192-847a-c9bc61d09466', headerEnabled=true, rejectOutOfSpan=true}}, authoritativePath=[], useListV1=false, magicCommitter=true, boundedExecutor=BlockingThreadPoolExecutorService{SemaphoredDelegatingExecutor{permitCount=160, available=160, waiting=0}, activeCount=0}, unboundedExecutor=java.util.concurrent.ThreadPoolExecutor@2e3c7876[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0], credentials=AWSCredentialProviderList[refcount= 1: [TemporaryAWSCredentialsProvider, SimpleAWSCredentialsProvider, EnvironmentVariableCredentialsProvider, org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider@2ff803d8] last provider: SimpleAWSCredentialsProvider, delegation tokens=disabled, DirectoryMarkerRetention{policy='delete'}, instrumentation {S3AInstrumentation{}}, ClientSideEncryption=false, arnForBucket=arn:aws:s3:eu-west-1:123456789012:accesspoint/random-bucket-a74d1eee-ea3e-4722-9214-b1a900710a68}
[ERROR] ITestS3AEncryptionSSEKMSDefaultKey>AbstractTestS3AEncryption.testEncryption:120->AbstractTestS3AEncryption.validateEncryptionForFilesize:160->assertEncrypted:57->Assert.assertThat:930->Assert.assertThat:964
Expected: a string containing "arn:aws:kms:"
but: was "arn:aws-cn:kms:cn-north-1:<dannycjones-acct-id>:key/541a093c-9ea7-4de4-806a-1fbccf9b0d61"
[ERROR] ITestS3AEncryptionSSEKMSDefaultKey>AbstractTestS3AEncryption.testEncryptionOverRename:134->assertEncrypted:57->Assert.assertThat:930->Assert.assertThat:964
Expected: a string containing "arn:aws:kms:"
but: was "arn:aws-cn:kms:cn-north-1:<dannycjones-acct-id>:key/541a093c-9ea7-4de4-806a-1fbccf9b0d61"
[ERROR] Errors:
[ERROR] ITestDelegatedMRJob.testCommonCrawlLookup:234 » AccessDenied s3a://osm-pds/pla...
[ERROR] ITestDelegatedMRJob.testCommonCrawlLookup:234 » AccessDenied s3a://osm-pds/pla...
[ERROR] ITestDelegatedMRJob.testJobSubmissionCollectsTokens:281 » AccessDenied s3a://o...
[ERROR] ITestDelegatedMRJob.testJobSubmissionCollectsTokens:281 » AccessDenied s3a://o...
[ERROR] ITestSessionDelegationInFileystem.testDelegatedFileSystem:333->readLandsatMetadata:598 » AccessDenied
[ERROR] ITestMarkerTool.testRunLimitedLandsatAudit:320->AbstractMarkerToolTest.runToFailure:271 » AccessDenied
[INFO]
[ERROR] Tests run: 1063, Failures: 6, Errors: 6, Skipped: 255{noformat}

There are a few obvious causes, such as tests that try to access particular buckets or that make assertions on ARNs assuming the standard AWS partition. Others are less obvious, such as the UnknownStoreException failures in ITestS3ABucketExistence. I'd need to dive deeper to understand these tests.
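
To illustrate the ARN assertion failures: the tests look for the literal string "arn:aws:kms:", which cannot match a key in the aws-cn partition. A minimal sketch of a partition-tolerant check follows. The class and method names are hypothetical and not from the S3A test suite, and the set of accepted partitions (aws, aws-cn, aws-us-gov) is an assumption hard-coded for the example.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of hadoop-aws. */
final class KmsArnCheck {

  // Matches "arn:<partition>:kms:" where the partition is one of
  // aws, aws-cn or aws-us-gov (assumed list, extend as needed).
  private static final Pattern KMS_ARN_PREFIX =
      Pattern.compile("^arn:aws(-cn|-us-gov)?:kms:");

  private KmsArnCheck() {
  }

  /** Returns true if the value starts like a KMS key ARN in a known partition. */
  static boolean looksLikeKmsArn(String value) {
    return value != null && KMS_ARN_PREFIX.matcher(value).find();
  }
}
```

A test could then assert on looksLikeKmsArn(keyArn) rather than on a fixed "arn:aws:kms:" substring, so the same assertion would hold in both eu-west-1 and cn-north-1.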

I also removed the ITestAuditManagerDisabled tests, which are currently failing for me in eu-west-1 as well as cn-north-1.

> s3a connector to fully support AWS partitions,
> ----------------------------------------------
>
>                 Key: HADOOP-18095
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18095
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.2
>            Reporter: Steve Loughran
>            Priority: Minor
>
> There are some minor issues in using the S3A connector's more advanced features in China;
> see https://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html
> Specifically, the "arn:aws:" prefix we use for all ARNs needs to be configurable so that aws-cn can be used instead.
> This means finding where we create and use these in production code (dynamically creating IAM role policies) and in tests, and making it configurable.
> Proposed:
> * add an option {{fs.s3a.aws.partition}}, default aws.
> * new StoreContext methods to query this, and create the ARN for the current bucket (string concat or from the bucket's ARN if created with an AP ARN)
> * docs
> I remember ABFS had a problem with OAuth endpoints; that was a lot more serious.
> Can't think of real tests for this, other than verifying that if you create an invalid partition "aws-mars" some things break.
> Someone needs to run all our existing tests in China, including those with IAM roles and SSE-KMS.
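
The "string concat" route in the proposal above could look roughly like the sketch below. Only the option name fs.s3a.aws.partition and its default of aws come from the issue description. The class, the method names, and the use of a plain Map in place of a Hadoop Configuration are assumptions made to keep the example self-contained; none of this exists in StoreContext today.

```java
import java.util.Map;

/** Hypothetical sketch of partition-aware ARN construction; not S3A code. */
final class PartitionArns {

  /** Option name proposed in the issue; not an existing S3A option. */
  static final String PARTITION_KEY = "fs.s3a.aws.partition";

  /** Default partition proposed in the issue. */
  static final String DEFAULT_PARTITION = "aws";

  private PartitionArns() {
  }

  /** Reads the partition from the config, falling back to the default if unset or blank. */
  static String partition(Map<String, String> conf) {
    String value = conf.get(PARTITION_KEY);
    return (value == null || value.trim().isEmpty())
        ? DEFAULT_PARTITION
        : value.trim();
  }

  /** Builds a bucket ARN of the form arn:<partition>:s3:::<bucket>. */
  static String bucketArn(Map<String, String> conf, String bucket) {
    return "arn:" + partition(conf) + ":s3:::" + bucket;
  }
}
```

With the option set to aws-cn, bucketArn would return an ARN beginning "arn:aws-cn:s3:::", and with the option unset it would fall back to "arn:aws:s3:::". This sketch does not cover the access point case, where the ARN would be taken from the bucket's own ARN.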



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
