Posted to common-commits@hadoop.apache.org by st...@apache.org on 2016/09/09 17:55:44 UTC

hadoop git commit: HADOOP-13540 improve section on troubleshooting s3a auth problems. Contributed by Steve Loughran

Repository: hadoop
Updated Branches:
  refs/heads/trunk a0b038367 -> 4b6d795f2


HADOOP-13540 improve section on troubleshooting s3a auth problems. Contributed by Steve Loughran


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/4b6d795f
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/4b6d795f
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/4b6d795f

Branch: refs/heads/trunk
Commit: 4b6d795f28591df7761186bc5c8d31dcd1a5a99d
Parents: a0b0383
Author: Steve Loughran <st...@apache.org>
Authored: Fri Sep 9 18:54:08 2016 +0100
Committer: Steve Loughran <st...@apache.org>
Committed: Fri Sep 9 18:55:32 2016 +0100

----------------------------------------------------------------------
 .../src/site/markdown/tools/hadoop-aws/index.md | 101 +++++++++++++++----
 1 file changed, 83 insertions(+), 18 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/4b6d795f/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
index 5100030..7fcadb9 100644
--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
@@ -1023,7 +1023,7 @@ the classpath.
 
 This means that one or more of the `aws-*-sdk` JARs are missing. Add them.
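+
+To check which AWS SDK JARs are actually on the classpath (a diagnostic
+sketch; note that the filenames include the SDK version, which also matters
+for the next problem below):
+
+    hadoop classpath | tr ':' '\n' | grep -i aws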
 
-### Missing method in AWS class
+### Missing method in `com.amazonaws` class
 
 This can be triggered by incompatibilities between the AWS SDK on the classpath
 and the version which Hadoop was compiled with.
@@ -1047,23 +1047,84 @@ classpath. All Jackson JARs on the classpath *must* be of the same version.
 
 ### Authentication failure
 
-The general cause is: you have the wrong credentials —or somehow
+If Hadoop cannot authenticate with the S3 service endpoint,
+the client retries a number of times before eventually failing.
+When it finally gives up, it will report a message about signature mismatch:
+
+```
+com.amazonaws.services.s3.model.AmazonS3Exception:
+ The request signature we calculated does not match the signature you provided.
+ Check your key and signing method.
+  (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch,
+```
+
+The likely cause is that you have the wrong credentials, or that somehow
 the credentials were not readable on the host attempting to read or write
 the S3 Bucket.
 
-There's not much that Hadoop can do for diagnostics here.
 Enabling debug logging for the package `org.apache.hadoop.fs.s3a`
-can help somewhat.
+can help provide more information.
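+
+A sketch of the matching `log4j.properties` entry:
+
+```
+log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
+```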
+
+Beyond the credentials themselves, there are a couple of system
+configuration problems (JVM version, system clock) which also need
+to be checked.
+
+Most common: there's an error in the configuration properties.
+
+
+1. Make sure that the name of the bucket is the correct one.
+That is: check the URL.
+
+1. Make sure the property names are correct. For S3A, they are
+`fs.s3a.access.key` and `fs.s3a.secret.key` —you cannot just copy the S3N
+properties and replace `s3n` with `s3a`.
+
+1. Make sure the properties are visible to the process attempting to
+talk to the object store. Placing them in `core-site.xml` is the standard
+mechanism; see the sketch after this list.
+
+1. If using session authentication, the session may have expired.
+Generate a new session token and secret.
 
-Most common: there's an error in the key or secret.
+1. If using environment variable-based authentication, make sure that the
+relevant variables are set in the environment in which the process is running.
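+
+As a sketch, the standard properties in `core-site.xml` look like this
+(the values are placeholders):
+
+```
+<configuration>
+  <!-- the access key for the bucket; value is a placeholder -->
+  <property>
+    <name>fs.s3a.access.key</name>
+    <value>YOUR-ACCESS-KEY</value>
+  </property>
+  <!-- the matching secret key; value is a placeholder -->
+  <property>
+    <name>fs.s3a.secret.key</name>
+    <value>YOUR-SECRET-KEY</value>
+  </property>
+</configuration>
+```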
 
-Otherwise, try to use the AWS command line tools with the same credentials.
-If you set the environment variables, you can take advantage of S3A's support
-of environment-variable authentication by attempting to use the `hdfs fs` command
-to read or write data on S3. That is: comment out the `fs.s3a` secrets and rely on
-the environment variables.
+The standard first step is to try the AWS command line tools with the same
+credentials, to verify that they work outside Hadoop. Then attempt the same
+operation through Hadoop itself, with a command such as:
 
-### Authentication failure when using URLs with embedded secrets
+    hadoop fs -ls s3a://my-bucket/
+
+Note the trailing "/" here; without that the shell thinks you are trying to list
+your home directory under the bucket, which will only exist if explicitly created.
+
+
+Attempting to list a bucket using inline credentials is a
+means of verifying that the key and secret can access a bucket:
+
+    hadoop fs -ls s3a://key:secret@my-bucket/
+
+Do escape any `+` or `/` symbols in the secret, as discussed below. Never
+share the URL or logs generated using it, and never use this inline
+authentication mechanism in production.
+
+Finally, if you set the environment variables, you can take advantage of S3A's
+support for environment-variable authentication by attempting the same `ls`
+operation. That is: unset the `fs.s3a` secrets and rely on the environment
+variables.
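+
+For example (a sketch; these are the standard AWS variable names, and the
+values and bucket are placeholders):
+
+    export AWS_ACCESS_KEY_ID=my-access-key
+    export AWS_SECRET_ACCESS_KEY=my-secret-key
+    hadoop fs -ls s3a://my-bucket/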
+
+#### Authentication failure due to clock skew
+
+The timestamp is used in signing requests to S3, so as to
+defend against replay attacks. If the system clock is too far behind *or ahead*
+of Amazon's, requests will be rejected.
+
+This can surface as a situation where
+read requests are allowed, but operations which write to the bucket are denied.
+
+Check the system clock.
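+
+One way to compare the local clock with AWS's is to print the local time in
+UTC beside the `Date` header returned by the S3 endpoint (a diagnostic
+sketch; it assumes `curl` is installed and uses the central endpoint):
+
+    date -u
+    curl -s -I https://s3.amazonaws.com/ | grep -i '^date'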
+
+#### Authentication failure when using URLs with embedded secrets
 
 If using the (strongly discouraged) mechanism of including the
 AWS Key and secret in a URL, then both "+" and "/" symbols need
@@ -1076,23 +1137,25 @@ encoding problems are not uncommon.
 | `/` | `%2F` |
 
 
-That is, a URL for `bucket` with AWS ID `user1` and secret `a+b/c` would
+As an example, a URL for `bucket` with AWS ID `user1` and secret `a+b/c` would
 be represented as
 
 ```
-s3a://user1:a%2Bb%2Fc@bucket
+s3a://user1:a%2Bb%2Fc@bucket/
 ```
 
 This technique is only needed when placing secrets in the URL. Again,
 this is something users are strongly advised against using.
 
-### Authentication failures running on Java 8u60+
+#### Authentication failures when running on Java 8u60+
 
 A change in the Java 8 JVM broke some of the `toString()` string generation
 of Joda Time 2.8.0, which stopped the Amazon S3 client from being able to
 generate authentication headers suitable for validation by S3.
 
-Fix: make sure that the version of Joda Time is 2.8.1 or later.
+**Fix**: Make sure that the version of Joda Time is 2.8.1 or later, or
+use a newer version of Java 8.
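+
+To see which Joda Time JAR is on the classpath (a diagnostic sketch; the
+version is part of the filename):
+
+    hadoop classpath | tr ':' '\n' | grep -i joda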
+
 
 ### "Bad Request" exception when working with AWS S3 Frankfurt, Seoul, or other "V4" endpoint
 
@@ -1291,10 +1354,12 @@ expense of sequential read performance and bandwidth.
 The slow performance of `rename()` surfaces during the commit phase of work,
 including
 
-* The MapReduce FileOutputCommitter.
-* DistCp's rename after copy operation.
+* The MapReduce `FileOutputCommitter`.
+* DistCp's rename-after-copy operation.
+* The `hadoop fs -rm` command renaming the file under `.Trash` rather than
+deleting it. Use `-skipTrash` to eliminate that step, as sketched below.
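+
+A sketch of the `-skipTrash` form, with a placeholder path:
+
+    hadoop fs -rm -skipTrash s3a://bucket/dataset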
 
-Both these operations can be significantly slower when S3 is the destination
+These operations can be significantly slower when S3 is the destination
 compared to HDFS or other "real" filesystems.
 
 *Improving S3 load-balancing behavior*


---------------------------------------------------------------------
To unsubscribe, e-mail: common-commits-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-commits-help@hadoop.apache.org