Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2018/11/01 19:46:00 UTC

[jira] [Commented] (HADOOP-14556) S3A to support Delegation Tokens

    [ https://issues.apache.org/jira/browse/HADOOP-14556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672084#comment-16672084 ] 

Steve Loughran commented on HADOOP-14556:
-----------------------------------------

patch 017

* Move the {{S3AFileSystem.dtIntegration}} field to an {{Optional}} type; rename it and make a best effort at Java 8 idioms.
* Declaring a binding is enough to turn DTs on. This simplifies configs and docs and reduces misconfiguration risk: you can never have DTs enabled without a binding class, nor have a binding class declared but tokens not picked up just because a separate "enabled" bit was false.
* Marshalled credentials have validity checks, so it should be impossible to have null values there; meaningful messages are raised
when the credentials don't meet the requirements of the caller (full tokens, session, either...).
* Validation methods let callers declare exactly which requirements to validate the credentials against.
* Test improvements.
* Move from {{Preconditions.checkNotNull}} to {{Objects.requireNonNull}}, using lambda-supplied error messages where it seems worthwhile; see the sketch below.
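
A minimal sketch of the validation and {{Objects.requireNonNull}} points above, with hypothetical class and method names (this is not the patch's code): {{requireNonNull}} has a {{Supplier<String>}} overload, so the failure message is only built when a check actually fails.
{code}
import java.util.Objects;

/**
 * Hypothetical helper, not patch code: callers declare which
 * requirements the marshalled credentials must meet, and failures
 * raise meaningful messages.
 */
public final class MarshalledCredentialsCheck {

  public static void validate(String accessKey, String secretKey,
      String sessionToken, boolean sessionRequired) {
    // The Supplier<String> overload only builds the message on failure.
    Objects.requireNonNull(accessKey,
        () -> "Missing access key in marshalled credentials");
    Objects.requireNonNull(secretKey,
        () -> "Missing secret key in marshalled credentials");
    if (sessionRequired) {
      Objects.requireNonNull(sessionToken,
          () -> "Caller requires session credentials,"
              + " but no session token was marshalled");
    }
  }
}
{code}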

The property for the DT bindings' auth providers has switched to {{fs.s3a.aws.credentials.provider}}, that is, the normal list of
credential providers used when DT bindings are not enabled (i.e. not the assumed-role one).
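
As a sketch of the simplified setup (the binding property and class names here are assumptions based on the feature's eventual documentation, not on patch 017 itself):
{code}
import org.apache.hadoop.conf.Configuration;

public class DtBindingSetup {
  public static Configuration dtConf() {
    // Sketch, not patch code: declaring a binding class is all it takes
    // to turn delegation tokens on; there is no separate "enabled" switch,
    // and the binding draws its credentials from the normal
    // fs.s3a.aws.credentials.provider list.
    Configuration conf = new Configuration();
    conf.set("fs.s3a.delegation.token.binding",
        "org.apache.hadoop.fs.s3a.auth.delegation.SessionTokenBinding");
    return conf;
  }
}
{code}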

The list of default credential providers now includes temporary/session credentials as the first entry.
This is a change from before, where people had to explicitly turn it on. In contrast, the env var plugin already looked for session creds first,
and of course IAM role credentials are always temporary.

This gives the following sequence for finding credentials (sketched in code after the list):

# {{fs.s3a.access.key}} + {{fs.s3a.secret.key}} + {{fs.s3a.session.token}} => session credentials
# {{fs.s3a.access.key}} + {{fs.s3a.secret.key}} => full credentials (because of the ordering, this is only reached if #1 is unsatisfied)
# Env vars, first the session env vars, then the full ones
# IAM role.
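
That ordering corresponds to a provider chain along these lines (a sketch only: the first two classes are the standard hadoop-aws providers; the env-var and IAM entries use AWS SDK class names, and the patch's actual defaults may differ, particularly for the IAM step):
{code}
import org.apache.hadoop.conf.Configuration;

public class DefaultProviderChain {
  public static Configuration defaults() {
    // Sketch of the lookup order above as an explicit provider chain.
    Configuration conf = new Configuration();
    conf.set("fs.s3a.aws.credentials.provider",
        "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider,"      // #1: session creds from fs.s3a.* keys
        + "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,"       // #2: full creds from fs.s3a.* keys
        + "com.amazonaws.auth.EnvironmentVariableCredentialsProvider,"   // #3: env vars, session first
        + "com.amazonaws.auth.InstanceProfileCredentialsProvider");      // #4: IAM role
    return conf;
  }
}
{code}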

I've also set things up for a future move to asynchronous IAM credential refresh, via a new (not yet documented, still private) credential provider which
internally shares a reference to the single IAM instance credentials provider.

Testing: S3A Ireland; all well (and this was a scale run), apart from the following.

MPU tests still failing; WIP by Ewan there:
{code}
[ERROR] testMultipartUploadEmptyPart(org.apache.hadoop.fs.contract.s3a.ITestS3AContractMultipartUploader)  Time elapsed: 0.749 s  <<< ERROR!
java.lang.IllegalArgumentException: partNumber must be between 1 and 10000 inclusive, but is 0
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
	at org.apache.hadoop.fs.s3a.WriteOperationHelper.newUploadPartRequest(WriteOperationHelper.java:377)
	at org.apache.hadoop.fs.s3a.S3AMultipartUploader.putPart(S3AMultipartUploader.java:97)
{code}

Bad request on the SSE-C huge file upload:
{code}
[ERROR] test_040_PositionedReadHugeFile(org.apache.hadoop.fs.s3a.scale.ITestS3AHugeFilesSSECDiskBlocks)  Time elapsed: 0.322 s  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSBadRequestException: getFileStatus on test/: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 670ACD7E81475D64; S3 Extended Request ID: 22WuQmeTqICfvd+9bgfSnwiptwCNA80ZlQqoF1hDBJJ0wlfPYTkmlO+r4g0tHBILG5l2NYIHVb8=), S3 Extended Request ID: 22WuQmeTqICfvd+9bgfSnwiptwCNA80ZlQqoF1hDBJJ0wlfPYTkmlO+r4g0tHBILG5l2NYIHVb8=:400 Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 670ACD7E81475D64; S3 Extended Request ID: 22WuQmeTqICfvd+9bgfSnwiptwCNA80ZlQqoF1hDBJJ0wlfPYTkmlO+r4g0tHBILG5l2NYIHVb8=)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:227)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2382)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2320)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2258)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:2207)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:2177)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2274)
	at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338)
	at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:193)
	at org.apache.hadoop.fs.s3a.scale.S3AScaleTestBase.setup(S3AScaleTestBase.java:90)
	at org.apache.hadoop.fs.s3a.scale.AbstractSTestS3AHugeFiles.setup(AbstractSTestS3AHugeFiles.java:78)
	at org.apache.hadoop.fs.s3a.scale.ITestS3AHugeFilesSSECDiskBlocks.setup(ITestS3AHugeFilesSSECDiskBlocks.java:41)
	at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}

I worry that my changes to how encryption secrets are marshalled have broken something here: more investigation needed.

> S3A to support Delegation Tokens
> --------------------------------
>
>                 Key: HADOOP-14556
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14556
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.2.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-14556-001.patch, HADOOP-14556-002.patch, HADOOP-14556-003.patch, HADOOP-14556-004.patch, HADOOP-14556-005.patch, HADOOP-14556-007.patch, HADOOP-14556-008.patch, HADOOP-14556-009.patch, HADOOP-14556-010.patch, HADOOP-14556-010.patch, HADOOP-14556-011.patch, HADOOP-14556-012.patch, HADOOP-14556-013.patch, HADOOP-14556-014.patch, HADOOP-14556-015.patch, HADOOP-14556-016.patch, HADOOP-14556-017.patch, HADOOP-14556.oath-002.patch, HADOOP-14556.oath.patch
>
>
> S3A to support delegation tokens where
> * an authenticated client can request a token via {{FileSystem.getDelegationToken()}}
> * Amazon's token service is used to request a short-lived session secret & ID; these will be saved in the token and marshalled with jobs
> * A new authentication provider will look for a token for the current user and authenticate the user if found
> This will not support renewals; the lifespan of a token will be limited to the initial duration. Also, as you can't request an STS token from a temporary session, IAM instances won't be able to issue tokens.
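
For reference, the client-side request in that first bullet is just the standard filesystem call (a sketch; the bucket URI and renewer name are illustrative only):
{code}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.token.Token;

public class FetchToken {
  public static Token<?> fetch() throws IOException {
    // Sketch: an authenticated client asks the S3A filesystem for a
    // delegation token; "s3a://example-bucket/" is a placeholder.
    FileSystem fs = FileSystem.get(
        URI.create("s3a://example-bucket/"), new Configuration());
    return fs.getDelegationToken("yarn");
  }
}
{code}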


