Posted to common-commits@hadoop.apache.org by st...@apache.org on 2017/08/12 21:00:02 UTC

[2/3] hadoop git commit: HADOOP-14749. review s3guard docs & code prior to merge. Contributed by Steve Loughran

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
index c28e354..fe67d69 100644
--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
@@ -20,7 +20,7 @@
 
 ## Overview
 
-*S3Guard* is an experimental feature for the S3A client of the S3 Filesystem,
+*S3Guard* is an experimental feature for the S3A client of the S3 object store,
 which can use a (consistent) database as the store of metadata about objects
 in an S3 bucket.
 
@@ -34,8 +34,8 @@ processes.
 1. Permits a consistent view of the object store. Without this, changes in
 objects may not be immediately visible, especially in listing operations.
 
-1. Create a platform for future performance improvements for running Hadoop
-   workloads on top of object stores
+1. Offers a platform for future performance improvements for running Hadoop
+workloads on top of object stores
 
 The basic idea is that, for each operation in the Hadoop S3 client (s3a) that
 reads or modifies metadata, a shadow copy of that metadata is stored in a
@@ -60,19 +60,22 @@ S3 Repository to use the feature. Clients reading the data may work directly
 with the S3A data, in which case the normal S3 consistency guarantees apply.
 
 
-## Configuring S3Guard
+## Setting up S3Guard
 
 The latest configuration parameters are defined in `core-default.xml`.  You
 should consult that file for full information, but a summary is provided here.
 
 
-### 1. Choose your MetadataStore implementation.
+### 1. Choose the Database
 
-By default, S3Guard is not enabled.  S3A uses "`NullMetadataStore`", which is a
-MetadataStore that does nothing.
+A core concept of S3Guard is that the directory listing data of the object
+store, *the metadata*, is replicated in a higher-performance, consistent
+database. In S3Guard, this database is called *the Metadata Store*.
 
-The funtional MetadataStore back-end uses Amazon's DynamoDB database service.  The
- following setting will enable this MetadataStore:
+By default, S3Guard is not enabled.
+
+The Metadata Store to use in production is backed by Amazon's DynamoDB
+database service.  The following setting will enable this Metadata Store:
 
 ```xml
 <property>
@@ -81,8 +84,8 @@ The funtional MetadataStore back-end uses Amazon's DynamoDB database service.  T
 </property>
 ```
 
-
-Note that the Null metadata store can be explicitly requested if desired.
+Note that the `NullMetadataStore` can be explicitly requested if desired.
+This offers no metadata storage, and effectively disables S3Guard.
 
 ```xml
 <property>
@@ -91,10 +94,10 @@ Note that the Null metadata store can be explicitly requested if desired.
 </property>
 ```
 
-### 2. Configure S3Guard settings
+### 2. Configure S3Guard Settings
 
-More settings will be added here in the future as we add to S3Guard.
-Currently the only MetadataStore-independent setting, besides the
+More settings may be added in the future.
+Currently the only Metadata Store-independent setting, besides the
 implementation class above, is the *allow authoritative* flag.
 
 It is recommended that you leave the default setting here:
@@ -107,25 +110,32 @@ It is recommended that you leave the default setting here:
 
 ```
 
-Setting this to true is currently an experimental feature.  When true, the
+Setting this to `true` is currently an experimental feature.  When true, the
 S3A client will avoid round-trips to S3 when getting directory listings, if
-there is a fully-cached version of the directory stored in the MetadataStore.
+there is a fully-cached version of the directory stored in the Metadata Store.
 
 Note that if this is set to true, it may exacerbate or persist existing race
 conditions around multiple concurrent modifications and listings of a given
 directory tree.
 
+In particular: **If the Metadata Store is declared as authoritative,
+all interactions with the S3 bucket(s) must be through S3A clients sharing
+the same Metadata Store.**
+
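+As a sketch, enabling the experimental authoritative mode would look like
+this, using the `fs.s3a.metadatastore.authoritative` option (the name as
+defined in `core-default.xml`):
+
+```xml
+<property>
+  <name>fs.s3a.metadatastore.authoritative</name>
+  <value>true</value>
+</property>
+```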
 
-### 3. Configure the MetadataStore.
+### 3. Configure the Metadata Store.
 
-Here are the `DynamoDBMetadataStore` settings.  Other MetadataStore
- implementations will have their own configuration parameters.
+Here are the `DynamoDBMetadataStore` settings.  Other Metadata Store
+implementations will have their own configuration parameters.
+
+
+### 4. Name Your Table
 
 First, choose the name of the table you wish to use for the S3Guard metadata
-storage in your DynamoDB instance.  If you leave the default blank value, a
+storage in your DynamoDB instance.  If you leave it unset/empty, a
 separate table will be created for each S3 bucket you access, and that
-bucket's name will be used for the name of the DynamoDB table.  Here we
-choose our own table name:
+bucket's name will be used for the name of the DynamoDB table.  For example,
+this sets the table name to `my-ddb-table-name`:
 
 ```xml
 <property>
@@ -133,16 +143,45 @@ choose our own table name:
   <value>my-ddb-table-name</value>
   <description>
     The DynamoDB table name to operate. Without this property, the respective
-    S3 bucket name will be used.
+    S3 bucket names will be used.
   </description>
 </property>
 ```
 
+Sharing a single table across multiple buckets is a good idea, for several reasons:
+
+1. You are billed for the I/O capacity allocated to the table,
+*even when the table is not used*. Sharing capacity can reduce costs.
+
+1. You can share the "provision burden" across the buckets. That is, rather
+than allocating for the peak load on a single bucket, you can allocate for
+the peak load *across all the buckets*, which is likely to be significantly
+lower.
+
+1. It's easier to measure and tune the load requirements and cost of
+S3Guard, because there is only one table to review and configure in the
+AWS management console.
+
+When wouldn't you want to share a table?
+
+1. When you do explicitly want to provision I/O capacity to a specific bucket
+and table, isolated from others.
+
+1. When you are using separate billing for specific buckets allocated
+to specific projects.
+
+1. When different users/roles have different access rights to different buckets.
+As S3Guard requires all users to have R/W access to the table, all users will
+be able to list the metadata in all buckets, even those to which they lack
+read access.
+
+### 5. Locate your Table
+
 You may also wish to specify the region to use for DynamoDB.  If a region
 is not configured, S3A will assume that it is in the same region as the S3
 bucket. A list of regions for the DynamoDB service can be found in
 [Amazon's documentation](http://docs.aws.amazon.com/general/latest/gr/rande.html#ddb_region).
-In this example, we set the US West 2 region:
+In this example, to use the US West 2 region:
 
 ```xml
 <property>
@@ -151,9 +190,17 @@ In this example, we set the US West 2 region:
 </property>
 ```
 
+When working with S3Guard-managed buckets from EC2 VMs running in AWS
+infrastructure, using a local DynamoDB region ensures the lowest latency
+and highest reliability, as well as avoiding all long-haul network charges.
+The S3Guard tables, and indeed, the S3 buckets, should all be in the same
+region as the VMs.
+
+### 6. Optional: Create your Table
+
 Next, you can choose whether or not the table will be automatically created
-(if it doesn't already exist).  If we want this feature, we can set the
-following parameter to true.
+(if it doesn't already exist).  If you want this feature, set the
+`fs.s3a.s3guard.ddb.table.create` option to `true`.
 
 ```xml
 <property>
@@ -165,6 +212,8 @@ following parameter to true.
 </property>
 ```
 
+### 7. If creating a table: Set your DynamoDB IO Capacity
+
 Next, you need to set the DynamoDB read and write throughput requirements you
 expect to need for your cluster.  Setting higher values will cost you more
 money.  *Note* that these settings only affect table creation when
@@ -174,10 +223,10 @@ an existing table, use the AWS console or CLI tool.
 For more details on DynamoDB capacity units, see the AWS page on [Capacity
 Unit Calculations](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithTables.html#CapacityUnitCalculations).
 
-The charges are incurred per hour for the life of the table, even when the
-table and the underlying S3 bucket are not being used.
+The charges are incurred per hour for the life of the table, *even when the
+table and the underlying S3 buckets are not being used*.
 
-There are also charges incurred for data storage and  for data IO outside of the
+There are also charges incurred for data storage and for data IO outside of the
 region of the DynamoDB instance. S3Guard only stores metadata in DynamoDB: path names
 and summary details of objects — the actual data is stored in S3, so billed at S3
 rates.
@@ -210,29 +259,107 @@ Attempting to perform more IO than the capacity requested simply throttles the
 IO; small capacity numbers are recommended when initially experimenting
 with S3Guard.
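+
+As a sketch, such a low-capacity configuration, using the table capacity
+options shown in the scale-test example later in these documents, might be:
+
+```xml
+<property>
+  <name>fs.s3a.s3guard.ddb.table.capacity.read</name>
+  <value>10</value>
+</property>
+<property>
+  <name>fs.s3a.s3guard.ddb.table.capacity.write</name>
+  <value>10</value>
+</property>
+```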
 
-## Credentials
+## Authenticating with S3Guard
 
 The DynamoDB metadata store takes advantage of the fact that the DynamoDB
-service uses uses the same authentication mechanisms as S3. With S3Guard,
-DynamoDB doesn't have any dedicated authentication configuration; it gets its
-credentials from the S3A client that is using it.
+service uses the same authentication mechanisms as S3. S3Guard
+gets all its credentials from the S3A client that is using it.
+
+All existing S3 authentication mechanisms can be used, with one
+exception: credentials placed in URIs are not supported for S3Guard, for
+security reasons.
+
+## Per-bucket S3Guard configuration
+
+In production, it is likely only some buckets will have S3Guard enabled;
+those which are read-only may have it disabled, for example. Equally importantly,
+buckets in different regions should have different tables, each
+in the relevant region.
+
+These options can be managed through S3A's [per-bucket configuration
+mechanism](./index.html#Configuring_different_S3_buckets).
+All options of the form `fs.s3a.bucket.BUCKETNAME.KEY` are propagated
+to the option `fs.s3a.KEY` *for that bucket only*.
+
+As an example, here is a configuration to use different metadata stores
+and tables for different buckets.
+
+First, we define shortcuts for the metadata store classnames:
+
+
+```xml
+<property>
+  <name>s3guard.null</name>
+  <value>org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore</value>
+</property>
+
+<property>
+  <name>s3guard.dynamo</name>
+  <value>org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore</value>
+</property>
+```
+
+Next, Amazon's public landsat database is configured with no
+metadata store:
+
+```xml
+<property>
+  <name>fs.s3a.bucket.landsat-pds.metadatastore.impl</name>
+  <value>${s3guard.null}</value>
+  <description>The read-only landsat-pds repository isn't
+  managed by S3Guard</description>
+</property>
+```
+
+Next, the `ireland-2` and `ireland-offline` buckets are configured with
+DynamoDB as the store, and a shared table `production-table`:
+
+
+```xml
+<property>
+  <name>fs.s3a.bucket.ireland-2.metadatastore.impl</name>
+  <value>${s3guard.dynamo}</value>
+</property>
+
+<property>
+  <name>fs.s3a.bucket.ireland-offline.metadatastore.impl</name>
+  <value>${s3guard.dynamo}</value>
+</property>
+
+<property>
+  <name>fs.s3a.bucket.ireland-2.s3guard.ddb.table</name>
+  <value>production-table</value>
+</property>
+```
+
+The region of this table is automatically set to be that of the buckets,
+here `eu-west-1`; the same table name may actually be used in different
+regions.
+
+Together, this configuration enables the DynamoDB Metadata Store
+for two buckets with a shared table, while disabling it for the public
+bucket.
 
-The existing S3 authentication mechanisms can be used, except for one
-exception. Credentials placed in URIs are not supported for S3Guard.  The
-reason is that providing login details in filesystem URIs is considered
-unsafe and thus deprecated.
 
 ## S3Guard Command Line Interface (CLI)
 
-Note that in some cases an AWS region or s3a:// URI can be provided.
+Note that in some cases an AWS region or `s3a://` URI can be provided.
 
 Metadata store URIs include a scheme that designates the backing store. For
-example (e.g. dynamodb://&lt;table_name&gt;). As documented above, AWS region
-can be inferred if the URI to an existing bucket is provided.
+example, `dynamodb://table_name`. As documented above, the
+AWS region can be inferred if the URI to an existing bucket is provided.
 
-### Init
 
-```
+The S3A URI must also be provided for per-bucket configuration options
+to be picked up. That is: when an `s3a://` URL is provided on the command
+line, all its "resolved" per-bucket settings are used to connect to,
+authenticate with and configure the S3Guard table. If no such URL is
+provided, then the base settings are picked up.
+
+
+### Create a table: `s3guard init`
+
+```bash
 hadoop s3guard init -meta URI ( -region REGION | s3a://BUCKET )
 ```
 
@@ -241,44 +368,124 @@ Creates and initializes an empty metadata store.
 A DynamoDB metadata store can be initialized with additional parameters
 pertaining to [Provisioned Throughput](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ProvisionedThroughput.html):
 
-```
+```bash
 [-write PROVISIONED_WRITES] [-read PROVISIONED_READS]
 ```
 
-### Import
+Example 1
+
+```bash
+hadoop s3guard init -meta dynamodb://ireland-team -write 5 -read 10 s3a://ireland-1
+```
+
+Creates a table "ireland-team" with a capacity of 5 for writes, 10 for reads,
+in the same location as the bucket "ireland-1".
+
+
+Example 2
 
+```bash
+hadoop s3guard init -meta dynamodb://ireland-team -region eu-west-1
 ```
+
+Creates a table "ireland-team" in the same region "s3-eu-west-1.amazonaws.com"
+
+
+### Import a bucket: `s3guard import`
+
+```bash
 hadoop s3guard import [-meta URI] s3a://BUCKET
 ```
 
 Pre-populates a metadata store according to the current contents of an S3
-bucket.
+bucket. If the `-meta` option is omitted, the binding information is taken
+from the `core-site.xml` configuration.
 
-### Diff
+Example
 
+```bash
+hadoop s3guard import s3a://ireland-1
 ```
+
+### Audit a table: `s3guard diff`
+
+```bash
 hadoop s3guard diff [-meta URI] s3a://BUCKET
 ```
 
 Lists discrepancies between a metadata store and bucket. Note that depending on
 how S3Guard is used, certain discrepancies are to be expected.
 
-### Destroy
+Example
 
+```bash
+hadoop s3guard diff s3a://ireland-1
 ```
+
+### Delete a table: `s3guard destroy`
+
+
+Deletes a metadata store. With DynamoDB as the store, this means
+the specific DynamoDB table used to store the metadata.
+
+```bash
 hadoop s3guard destroy [-meta URI] ( -region REGION | s3a://BUCKET )
 ```
 
-Deletes a metadata store.
+This *does not* delete the bucket, only the S3Guard table which it is bound
+to.
+
 
-### Prune
+Examples
 
+```bash
+hadoop s3guard destroy s3a://ireland-1
+```
+
+Deletes the table which the bucket `ireland-1` is configured to use
+as its Metadata Store.
+
+```bash
+hadoop s3guard destroy -meta dynamodb://ireland-team -region eu-west-1
 ```
+
+
+
+### Clean up a table: `s3guard prune`
+
+Deletes all file entries in the Metadata Store table whose object
+"modification time" is older than the specified age.
+
+```bash
 hadoop s3guard prune [-days DAYS] [-hours HOURS] [-minutes MINUTES]
     [-seconds SECONDS] [-m URI] ( -region REGION | s3a://BUCKET )
 ```
 
-Trims metadata for files that are older than the time given. Must supply at least length of time.
+A time value must be supplied.
+
+1. This does not delete the entries in the bucket itself.
+1. The modification time is effectively the creation time of the objects
+in the S3 Bucket.
+1. Even when an S3A URI is supplied, all entries in the table older than
+a specific age are deleted &mdash; even those from other buckets.
+
+Example
+
+```bash
+hadoop s3guard prune -days 7 s3a://ireland-1
+```
+
+Deletes all file entries older than seven days from the S3Guard table
+associated with `s3a://ireland-1`.
+
+```bash
+hadoop s3guard prune -hours 1 -minutes 30 -meta dynamodb://ireland-team -region eu-west-1
+```
+
+Deletes all entries more than 90 minutes old from the table "ireland-team" in
+the region "eu-west-1".
+
+
 
 ## Debugging and Error Handling
 
@@ -287,7 +494,6 @@ middle of an operation, you may end up with your metadata store having state
 that differs from S3.  The S3Guard CLI commands, covered in the CLI section
 above, can be used to diagnose and repair these issues.
 
-
 There are some logs whose log level can be increased to provide more
 information.
 
@@ -298,7 +504,7 @@ log4j.logger.org.apache.hadoop.fs.s3a.s3guard=DEBUG
 # Log all S3A classes
 log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
 
-# Enable debug logging of AWS Dynamo client
+# Enable debug logging of AWS DynamoDB client
 log4j.logger.com.amazonaws.services.dynamodbv2.AmazonDynamoDB=DEBUG
 
 # Log all HTTP requests made; includes S3 interaction. This may
@@ -308,7 +514,7 @@ log4j.logger.com.amazonaws.request=DEBUG
 ```
 
 If all else fails, S3Guard is designed to allow for easy recovery by deleting
-your metadata store data.  In DynamoDB, this can be accomplished by simply
+the metadata store data. In DynamoDB, this can be accomplished by simply
 deleting the table, and allowing S3Guard to recreate it from scratch.  Note
 that S3Guard tracks recent changes to file metadata to implement consistency.
 Deleting the metadata store table will simply result in a period of eventual
@@ -319,10 +525,10 @@ was deleted.
 
 Operations which modify metadata will make changes to S3 first. If, and only
 if, those operations succeed, the equivalent changes will be made to the
-MetadataStore.
+Metadata Store.
 
-These changes to S3 and MetadataStore are not fully-transactional:  If the S3
-operations succeed, and the subsequent MetadataStore updates fail, the S3
+These changes to S3 and Metadata Store are not fully-transactional:  If the S3
+operations succeed, and the subsequent Metadata Store updates fail, the S3
 changes will *not* be rolled back.  In this case, an error message will be
 logged.
 
@@ -351,6 +557,20 @@ in an incompatible manner. The version marker in tables exists to support
 such an option if it ever becomes necessary, by ensuring that all S3Guard
 clients can recognise any version mismatch.
 
+### Security
+
+All users of the DynamoDB table must have write access to it. This
+effectively means they must have write access to the entire object store.
+
+There has not been much testing of using an S3Guard Metadata Store
+with a read-only S3 bucket. It *should* work, provided all users
+have write access to the DynamoDB table. And, as updates to the Metadata Store
+are only made after successful file creation, deletion and rename, the
+store is *unlikely* to get out of sync; even so, this merits more
+testing before it can be considered reliable.
+
+### Troubleshooting
+
 #### Error: `S3Guard table lacks version marker.`
 
 The table which was intended to be used as an S3Guard metadata store
@@ -376,148 +596,15 @@ bucket. Upgrade the application/library.
 If the expected version is higher than the actual version, then the table
 itself will need upgrading.
 
-## Testing S3Guard
-
-The basic strategy for testing S3Guard correctness consists of:
-
-1. MetadataStore Contract tests.
-
-    The MetadataStore contract tests are inspired by the Hadoop FileSystem and
-    FileContext contract tests.  Each implementation of the MetadataStore interface
-    subclasses the `MetadataStoreTestBase` class and customizes it to initialize
-    their MetadataStore.  This test ensures that the different implementations
-    all satisfy the semantics of the MetadataStore API.
-
-2. Running existing S3A unit and integration tests with S3Guard enabled.
-
-    You can run the S3A integration tests on top of S3Guard by configuring your
-    MetadataStore (as documented above) in your
-    `hadoop-tools/hadoop-aws/src/test/resources/core-site.xml` or
-    `hadoop-tools/hadoop-aws/src/test/resources/auth-keys.xml` files.
-    Next run the S3A integration tests as outlined in the *Running the Tests* section
-    of the [S3A documentation](./index.html)
-
-3. Running fault-injection tests that test S3Guard's consistency features.
-
-    The `ITestS3GuardListConsistency` uses failure injection to ensure
-    that list consistency logic is correct even when the underlying storage is
-    eventually consistent.
-
-    The integration test adds a shim above the Amazon S3 Client layer that injects
-    delays in object visibility.
-
-    All of these tests will be run if you follow the steps listed in step 2 above.
-
-    No charges are incurred for using this store, and its consistency
-    guarantees are that of the underlying object store instance. <!-- :) -->
-
-## Testing S3 with S3Guard Enabled
-
-All the S3A tests which work with a private repository can be configured to
-run with S3Guard by using the `s3guard` profile. When set, this will run
-all the tests with local memory for the metadata set to "non-authoritative" mode.
-
-```bash
-mvn -T 1C verify -Dparallel-tests -DtestsThreadCount=6 -Ds3guard 
-```
-
-When the `s3guard` profile is enabled, following profiles can be specified:
-
-* `dynamo`: use an AWS-hosted DynamoDB table; creating the table if it does
-  not exist. You will have to pay the bills for DynamoDB web service.
-* `dynamodblocal`: use an in-memory DynamoDBLocal server instead of real AWS
-  DynamoDB web service; launch the server if it is not yet started; creating the
-  table if it does not exist. You won't be charged bills for using DynamoDB in
-  test. However, the DynamoDBLocal is a simulator of real AWS DynamoDB and is
-  maintained separately, so it may be stale.
-* `non-auth`: treat the s3guard metadata as authorative
-
-```bash
-mvn -T 1C verify -Dparallel-tests -DtestsThreadCount=6 -Ds3guard -Ddynamo -Dauth 
-```
-
-When experimenting with options, it is usually best to run a single test suite
-at a time until the operations appear to be working.
-
-```bash
-mvn -T 1C verify -Dtest=skip -Dit.test=ITestS3AMiscOperations -Ds3guard -Ddynamo
-```
-
-### Notes
-
-1. If the `s3guard` profile is not set, then the s3guard properties are those
-of the test configuration set in `contract-test-options.xml` or `auth-keys.xml`
-
-If the `s3guard` profile *is* set, 
-1. The s3guard options from maven (the dynamo and authoritative flags)
-  overwrite any previously set. in the configuration files.
-1. Dynamo will be configured to create any missing tables.
-
-### Warning About Concurrent Tests
-
-You should not run S3A and S3N tests in parallel on the same bucket.  This is
-especially true when S3Guard is enabled.  S3Guard requires that all clients
-that are modifying the bucket have S3Guard enabled, so having S3N
-integration tests running in parallel with S3A tests will cause strange
-failures.
-
-### Scale Testing MetadataStore Directly
-
-We also have some scale tests that exercise MetadataStore implementations
-directly.  These allow us to ensure were are robust to things like DynamoDB
-throttling, and compare performance for different implementations. See the
-main [S3A documentation](./index.html) for more details on how to enable the
-S3A scale tests.
-
-The two scale tests here are `ITestDynamoDBMetadataStoreScale` and
-`ITestLocalMetadataStoreScale`.  To run the DynamoDB test, you will need to
-define your table name and region in your test configuration.  For example,
-the following settings allow us to run `ITestDynamoDBMetadataStoreScale` with
-artificially low read and write capacity provisioned, so we can judge the
-effects of being throttled by the DynamoDB service:
-
-```
-<property>
-    <name>scale.test.operation.count</name>
-    <value>10</value>
-</property>
-<property>
-    <name>scale.test.directory.count</name>
-    <value>3</value>
-</property>
-<property>
-    <name>fs.s3a.scale.test.enabled</name>
-    <value>true</value>
-</property>
-<property>
-    <name>fs.s3a.s3guard.ddb.table</name>
-    <value>my-scale-test</value>
-</property>
-<property>
-    <name>fs.s3a.s3guard.ddb.region</name>
-    <value>us-west-2</value>
-</property>
-<property>
-    <name>fs.s3a.s3guard.ddb.table.create</name>
-    <value>true</value>
-</property>
-<property>
-    <name>fs.s3a.s3guard.ddb.table.capacity.read</name>
-    <value>10</value>
-</property>
-<property>
-    <name>fs.s3a.s3guard.ddb.table.capacity.write</name>
-    <value>10</value>
-</property>
-```
-
-### Testing only: Local Metadata Store
+#### Error `"DynamoDB table TABLE does not exist in region REGION; auto-creation is turned off"`
 
-There is an in-memory metadata store for testing.
+S3Guard could not find the DynamoDB table for the Metadata Store,
+and it was not configured to create it. Either the table is missing,
+or the configuration is preventing S3Guard from finding the table.
 
-```xml
-<property>
-  <name>fs.s3a.metadatastore.impl</name>
-  <value>org.apache.hadoop.fs.s3a.s3guard.LocalMetadataStore</value>
-</property>
-```
+1. Verify that the value of `fs.s3a.s3guard.ddb.table` is correct.
+1. If the region for an existing table has been set in
+`fs.s3a.s3guard.ddb.region`, verify that the value is correct.
+1. If the region is not set, verify that the table exists in the same
+region as the bucket being used.
+1. Create the table if necessary.
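+
+As a sketch, a configuration which names the table, pins its region and
+enables automatic creation, using only options already covered above,
+would be:
+
+```xml
+<property>
+  <name>fs.s3a.s3guard.ddb.table</name>
+  <value>my-ddb-table-name</value>
+</property>
+<property>
+  <name>fs.s3a.s3guard.ddb.region</name>
+  <value>us-west-2</value>
+</property>
+<property>
+  <name>fs.s3a.s3guard.ddb.table.create</name>
+  <value>true</value>
+</property>
+```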

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md
index dcc6d46..3b9b5c4 100644
--- a/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md
+++ b/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md
@@ -821,28 +821,29 @@ using an absolute XInclude reference to it.
 # Failure Injection
 
 **Warning: do not enable any type of failure injection in production.  The
-following settings are for test development only.**
-
-## Inconsistency Injection
+following settings are for testing only.**
 
 One of the challenges with S3A integration tests is the fact that S3 is an
 eventually-consistent storage system.  In practice, we rarely see delays in
-visibility of recently created objects both in listings (listStatus()) and
-when getting a single file's metadata (getFileStatus()).  Since this behavior
+visibility of recently created objects both in listings (`listStatus()`) and
+when getting a single file's metadata (`getFileStatus()`). Since this behavior
 is rare and non-deterministic, thorough integration testing is challenging.
 
-To address this, we developed a shim layer on top of the `AmazonS3Client`
+To address this, S3A supports a shim layer on top of the `AmazonS3Client`
 class which artificially delays certain paths from appearing in listings.
 This is implemented in the class `InconsistentAmazonS3Client`.
 
+## Simulating List Inconsistencies
+
 ### Enabling the InconsistentAmazonS3Client
 
 There are two ways of enabling the `InconsistentAmazonS3Client`: at
-config-time, or programmatically.  For an example of programmatic test usage,
+config-time, or programmatically. For an example of programmatic test usage,
 see `ITestS3GuardListConsistency`.
 
-To enable the inconsistency injecting client via configuration, set the
-following class name for the client factory configuration:
+To enable the fault-injecting client via configuration, switch the
+S3A client to use the "Inconsistent S3 Client Factory" when connecting to
+S3:
 
 ```xml
 <property>
@@ -855,9 +856,9 @@ The inconsistent client works by:
 
 1. Choosing which objects will be "inconsistent" at the time the object is
 created or deleted.
-2. When listObjects is called, any keys that we have marked as
+2. When `listObjects()` is called, any keys that we have marked as
 inconsistent above will not be returned in the results (until the
-configured delay has elapsed).  Similarly, deleted items may be *added* to
+configured delay has elapsed). Similarly, deleted items may be *added* to
 missing results to delay the visibility of the delete.
 
 There are two ways of choosing which keys (filenames) will be affected: By
@@ -876,15 +877,15 @@ substring, and by random probability.
 ```
 
 By default, any object which has the substring "DELAY_LISTING_ME" in its key
-will subject to delayed visibility.  For example, the path
+will be subject to delayed visibility. For example, the path
 `s3a://my-bucket/test/DELAY_LISTING_ME/file.txt` would match this condition.
 To match all keys use the value "\*" (a single asterisk). This is a special
 value: *We don't support arbitrary wildcards.*
 
-The default probability of delaying an object is 1.0.  This means that *all*
+The default probability of delaying an object is 1.0. This means that *all*
 keys that match the substring will get delayed visibility. Note that we take
 the logical *and* of the two conditions (substring matches *and* probability
-random chance occurs).  Here are some example configurations:
+random chance occurs). Here are some example configurations:
 
 ```
 | substring | probability |  behavior                                  |
@@ -910,33 +911,189 @@ The default is 5000 milliseconds (five seconds).
 </property>
 ```
 
-#### Limitations of Inconsistency Injection
+Future versions of this client will introduce new failure modes,
+with simulation of S3 throttling exceptions the next feature under
+development.
+
+### Limitations of Inconsistency Injection
 
-Although we can delay visibility of an object or parent directory va the
-`InconsistentAmazonS3Client` we do not keep the key of that object from
-appearing in all prefix searches.  For example, if we create the following
+Although `InconsistentAmazonS3Client` can delay the visibility of an object
+or parent directory, it does not prevent the key of that object from
+appearing in all prefix searches. For example, if we create the following
 object with the default configuration above, in an otherwise empty bucket:
 
 ```
- s3a://bucket/a/b/c/DELAY_LISTING_ME
+s3a://bucket/a/b/c/DELAY_LISTING_ME
 ```
 
-Then the following paths will still be visible as directories:
+Then the following paths will still be visible as directories (ignoring
+possible real-world inconsistencies):
+
+```
+s3a://bucket/a
+s3a://bucket/a/b
+```
+
+Whereas `getFileStatus()` on the following *will* be subject to delayed
+visibility (`FileNotFoundException` until delay has elapsed):
+
+```
+s3a://bucket/a/b/c
+s3a://bucket/a/b/c/DELAY_LISTING_ME
+```
+
+In real-life S3 inconsistency, however, we expect that all the above paths
+(including `a` and `b`) will be subject to delayed visibility.
+
+### Using the `InconsistentAmazonS3Client` in downstream integration tests
+
+The inconsistent client is shipped in the `hadoop-aws` JAR, so it can
+be used in applications which work with S3 to see how they handle
+inconsistent directory listings.
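+
+As a sketch, a downstream project would switch its test configuration to
+the inconsistent client in the same way as shown above; the factory class
+name here is an assumption based on the "Inconsistent S3 Client Factory"
+described earlier:
+
+```xml
+<property>
+  <name>fs.s3a.s3.client.factory.impl</name>
+  <value>org.apache.hadoop.fs.s3a.InconsistentS3ClientFactory</value>
+</property>
+```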
+
+## Testing S3Guard
+
+The basic strategy for testing S3Guard correctness consists of:
+
+1. MetadataStore Contract tests.
+
+    The MetadataStore contract tests are inspired by the Hadoop FileSystem and
+    `FileContext` contract tests.  Each implementation of the `MetadataStore` interface
+    subclasses the `MetadataStoreTestBase` class and customizes it to initialize
+    their MetadataStore.  This test ensures that the different implementations
+    all satisfy the semantics of the MetadataStore API.
 
+2. Running existing S3A unit and integration tests with S3Guard enabled.
+
+    You can run the S3A integration tests on top of S3Guard by configuring your
+    `MetadataStore` in your
+    `hadoop-tools/hadoop-aws/src/test/resources/core-site.xml` or
+    `hadoop-tools/hadoop-aws/src/test/resources/auth-keys.xml` files.
+    Next run the S3A integration tests as outlined in the *Running the Tests* section
+    of the [S3A documentation](./index.html).
+
+3. Running fault-injection tests that test S3Guard's consistency features.
+
+    The `ITestS3GuardListConsistency` uses failure injection to ensure
+    that list consistency logic is correct even when the underlying storage is
+    eventually consistent.
+
+    The integration test adds a shim above the Amazon S3 Client layer that injects
+    delays in object visibility.
+
+    All of these tests will be run if you follow the steps listed in step 2 above.
+
+    No charges are incurred for using this store, and its consistency
+    guarantees are that of the underlying object store instance. <!-- :) -->
+
+## Testing S3A with S3Guard Enabled
+
+All the S3A tests which work with a private repository can be configured to
+run with S3Guard by using the `s3guard` profile. When set, this will run
+all the tests with the local, in-memory metadata store in
+"non-authoritative" mode.
+
+```bash
+mvn -T 1C verify -Dparallel-tests -DtestsThreadCount=6 -Ds3guard
 ```
- s3a://bucket/a
- s3a://bucket/a/b
+
+When the `s3guard` profile is enabled, the following profiles can be specified:
+
+* `dynamo`: use an AWS-hosted DynamoDB table; creating the table if it does
+  not exist. You will have to pay the bills for DynamoDB web service.
+* `dynamodblocal`: use an in-memory DynamoDBLocal server instead of the real
+  AWS DynamoDB web service, launching the server and creating the table as
+  needed. You won't be charged bills for using DynamoDB in test. As it runs
+  in-JVM, the table isn't shared across other tests running in parallel.
+* `non-auth`: treat the S3Guard metadata as non-authoritative.
+
+```bash
+mvn -T 1C verify -Dparallel-tests -DtestsThreadCount=6 -Ds3guard -Ddynamo -Dauth
 ```
 
-Whereas getFileStatus() on the following *will* be subject to delayed
-visibility (FileNotFoundException until delay has elapsed):
+When experimenting with options, it is usually best to run a single test suite
+at a time until the operations appear to be working.
 
+```bash
+mvn -T 1C verify -Dtest=skip -Dit.test=ITestS3AMiscOperations -Ds3guard -Ddynamo
 ```
- s3a://bucket/a/b/c
- s3a://bucket/a/b/c/DELAY_LISTING_ME
+
+### Notes
+
+1. If the `s3guard` profile is not set, then the S3Guard properties are those
+of the test configuration set in `contract-test-options.xml` or `auth-keys.xml`.
+
+If the `s3guard` profile *is* set:
+1. The S3Guard options from maven (the dynamo and authoritative flags)
+  overwrite any previously set in the configuration files.
+1. DynamoDB will be configured to create any missing tables.
+
+### Warning About Concurrent Tests
+
+You must not run S3A and S3N tests in parallel on the same bucket.  This is
+especially true when S3Guard is enabled.  S3Guard requires that all clients
+that are modifying the bucket have S3Guard enabled, so having S3N
+integration tests running in parallel with S3A tests will cause strange
+failures.
+
+### Scale Testing MetadataStore Directly
+
+There are some scale tests that exercise Metadata Store implementations
+directly. These ensure that S3Guard is robust to things like DynamoDB
+throttling, and compare performance for different implementations. These
+are included in the scale tests executed when `-Dscale` is passed to
+the maven command line.
+
+The two S3Guard scale tests are `ITestDynamoDBMetadataStoreScale` and
+`ITestLocalMetadataStoreScale`.  To run the DynamoDB test, you will need to
+define your table name and region in your test configuration.  For example,
+the following settings allow us to run `ITestDynamoDBMetadataStoreScale` with
+artificially low read and write capacity provisioned, so we can judge the
+effects of being throttled by the DynamoDB service:
+
+```xml
+<property>
+  <name>scale.test.operation.count</name>
+  <value>10</value>
+</property>
+<property>
+  <name>scale.test.directory.count</name>
+  <value>3</value>
+</property>
+<property>
+  <name>fs.s3a.scale.test.enabled</name>
+  <value>true</value>
+</property>
+<property>
+  <name>fs.s3a.s3guard.ddb.table</name>
+  <value>my-scale-test</value>
+</property>
+<property>
+  <name>fs.s3a.s3guard.ddb.region</name>
+  <value>us-west-2</value>
+</property>
+<property>
+  <name>fs.s3a.s3guard.ddb.table.create</name>
+  <value>true</value>
+</property>
+<property>
+  <name>fs.s3a.s3guard.ddb.table.capacity.read</name>
+  <value>10</value>
+</property>
+<property>
+  <name>fs.s3a.s3guard.ddb.table.capacity.write</name>
+  <value>10</value>
+</property>
 ```
 
- In real-life S3 inconsistency, however, we expect that all the above paths
- (including `a` and `b`) will be subject to delayed visiblity.
+### Testing only: Local Metadata Store
+
+There is an in-memory Metadata Store for testing.
 
+```xml
+<property>
+  <name>fs.s3a.metadatastore.impl</name>
+  <value>org.apache.hadoop.fs.s3a.s3guard.LocalMetadataStore</value>
+</property>
+```
 
+This is not for use in production.

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestConstants.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestConstants.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestConstants.java
index ccc28de..2c4f009 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestConstants.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestConstants.java
@@ -135,7 +135,7 @@ public interface S3ATestConstants {
   String TEST_STS_ENDPOINT = "test.fs.s3a.sts.endpoint";
 
   /**
-   * Various s3guard tests.
+   * Various S3Guard tests.
    */
   String TEST_S3GUARD_PREFIX = "fs.s3a.s3guard.test";
   String TEST_S3GUARD_ENABLED = TEST_S3GUARD_PREFIX + ".enabled";

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java
index cb73323..8dbf90a 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java
@@ -331,13 +331,17 @@ public final class S3ATestUtils {
 
   /**
    * Test assumption that S3Guard is/is not enabled.
+   * @param shouldBeEnabled should S3Guard be enabled?
+   * @param originalConf configuration to check
+   * @throws URISyntaxException if the filesystem URI is invalid
    */
   public static void assumeS3GuardState(boolean shouldBeEnabled,
       Configuration originalConf) throws URISyntaxException {
     boolean isEnabled = getTestPropertyBool(originalConf, TEST_S3GUARD_ENABLED,
         originalConf.getBoolean(TEST_S3GUARD_ENABLED, false));
-    Assume.assumeThat("Unexpected S3Guard test state: shouldBeEnabled=" +
-        shouldBeEnabled + " and isEnabled =" + isEnabled,
+    Assume.assumeThat("Unexpected S3Guard test state:"
+            + " shouldBeEnabled=" + shouldBeEnabled
+            + " and isEnabled=" + isEnabled,
         shouldBeEnabled, Is.is(isEnabled));
 
     final String fsname = originalConf.getTrimmed(TEST_FS_S3A_NAME);
@@ -346,8 +350,9 @@ public final class S3ATestUtils {
     final Configuration conf = propagateBucketOptions(originalConf, bucket);
     boolean usingNullImpl = S3GUARD_METASTORE_NULL.equals(
         conf.getTrimmed(S3_METADATA_STORE_IMPL, S3GUARD_METASTORE_NULL));
-    Assume.assumeThat("Unexpected S3Guard test state: shouldBeEnabled=" +
-        shouldBeEnabled + " but usingNullImpl=" + usingNullImpl,
+    Assume.assumeThat("Unexpected S3Guard test state:"
+            + " shouldBeEnabled=" + shouldBeEnabled
+            + " but usingNullImpl=" + usingNullImpl,
         shouldBeEnabled, Is.is(!usingNullImpl));
   }
 
@@ -358,7 +363,7 @@ public final class S3ATestUtils {
   public static void maybeEnableS3Guard(Configuration conf) {
     if (getTestPropertyBool(conf, TEST_S3GUARD_ENABLED,
         conf.getBoolean(TEST_S3GUARD_ENABLED, false))) {
-      // s3guard is enabled.
+      // S3Guard is enabled.
       boolean authoritative = getTestPropertyBool(conf,
           TEST_S3GUARD_AUTHORITATIVE,
           conf.getBoolean(TEST_S3GUARD_AUTHORITATIVE, true));
@@ -603,12 +608,32 @@ public final class S3ATestUtils {
   private S3ATestUtils() {
   }
 
+  /**
+   * Verify the core size, block size and timestamp values of a file.
+   * @param status status entry to check
+   * @param size file size
+   * @param blockSize block size
+   * @param modTime modified time
+   */
   public static void verifyFileStatus(FileStatus status, long size,
       long blockSize, long modTime) {
     verifyFileStatus(status, size, 0, modTime, 0, blockSize, null, null, null);
   }
 
-  public static void verifyFileStatus(FileStatus status, long size,
+  /**
+   * Verify the status entry of a file matches that expected.
+   * @param status status entry to check
+   * @param size file size
+   * @param replication replication factor (may be 0)
+   * @param modTime modified time
+   * @param accessTime access time (may be 0)
+   * @param blockSize block size
+   * @param owner owner (may be null)
+   * @param group user group (may be null)
+   * @param permission permission (may be null)
+   */
+  public static void verifyFileStatus(FileStatus status,
+      long size,
       int replication,
       long modTime,
       long accessTime,
@@ -641,6 +666,16 @@ public final class S3ATestUtils {
     }
   }
 
+  /**
+   * Verify the status entry of a directory matches that expected.
+   * @param status status entry to check
+   * @param replication replication factor
+   * @param modTime modified time
+   * @param accessTime access time
+   * @param owner owner
+   * @param group user group
+   * @param permission permission.
+   */
   public static void verifyDirStatus(FileStatus status,
       int replication,
       long modTime,

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractMSContract.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractMSContract.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractMSContract.java
index 9a1b590..921d4a6 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractMSContract.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractMSContract.java
@@ -28,6 +28,6 @@ import java.io.IOException;
  */
 public abstract class AbstractMSContract {
 
-  public abstract FileSystem getFileSystem();
+  public abstract FileSystem getFileSystem() throws IOException;
   public abstract MetadataStore getMetadataStore() throws IOException;
-}
\ No newline at end of file
+}

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractS3GuardToolTestBase.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractS3GuardToolTestBase.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractS3GuardToolTestBase.java
new file mode 100644
index 0000000..5f34795
--- /dev/null
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/AbstractS3GuardToolTestBase.java
@@ -0,0 +1,161 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.s3guard;
+
+import java.io.IOException;
+import java.util.concurrent.TimeUnit;
+
+import org.junit.Test;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+import org.apache.hadoop.fs.s3a.AbstractS3ATestBase;
+import org.apache.hadoop.fs.s3a.Constants;
+import org.apache.hadoop.fs.s3a.S3AFileStatus;
+import org.apache.hadoop.fs.s3a.S3ATestUtils;
+import org.apache.hadoop.io.IOUtils;
+
+import static org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.SUCCESS;
+
+/**
+ * Common functionality for S3GuardTool test cases.
+ */
+public abstract class AbstractS3GuardToolTestBase extends AbstractS3ATestBase {
+
+  protected static final String OWNER = "hdfs";
+
+  private MetadataStore ms;
+
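+  /**
+   * Run a S3GuardTool command and assert that its exit code matches
+   * that expected.
+   * @param expected expected exit code
+   * @param message assertion message on mismatch
+   * @param tool tool to run
+   * @param args arguments to pass to the tool
+   * @throws Exception on any failure
+   */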
+  protected static void expectResult(int expected,
+      String message,
+      S3GuardTool tool,
+      String... args) throws Exception {
+    assertEquals(message, expected, tool.run(args));
+  }
+
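+  /**
+   * Run a S3GuardTool command and assert that it exited successfully.
+   * @param message assertion message on failure
+   * @param tool tool to run
+   * @param args arguments to pass to the tool
+   * @throws Exception on any failure
+   */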
+  protected static void expectSuccess(
+      String message,
+      S3GuardTool tool,
+      String... args) throws Exception {
+    assertEquals(message, SUCCESS, tool.run(args));
+  }
+
+  protected MetadataStore getMetadataStore() {
+    return ms;
+  }
+
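+  /**
+   * Create a new, uninitialized metadata store for the test run;
+   * it is initialized in {@link #setup()}.
+   * @return the metadata store to test against
+   */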
+  protected abstract MetadataStore newMetadataStore();
+
+  @Override
+  public void setup() throws Exception {
+    super.setup();
+    S3ATestUtils.assumeS3GuardState(true, getConfiguration());
+    ms = newMetadataStore();
+    ms.initialize(getFileSystem());
+  }
+
+  @Override
+  public void teardown() throws Exception {
+    super.teardown();
+    IOUtils.cleanupWithLogger(LOG, ms);
+  }
+
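+  /**
+   * Create a directory on S3, in the metadata store, or both.
+   * @param path directory path
+   * @param onS3 set to true to create the directory on S3
+   * @param onMetadataStore set to true to add the entry to the
+   *                        metadata store
+   * @throws IOException IO problem
+   */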
+  protected void mkdirs(Path path, boolean onS3, boolean onMetadataStore)
+      throws IOException {
+    if (onS3) {
+      getFileSystem().mkdirs(path);
+    }
+    if (onMetadataStore) {
+      S3AFileStatus status = new S3AFileStatus(true, path, OWNER);
+      ms.put(new PathMetadata(status));
+    }
+  }
+
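+  /**
+   * Add a file entry to a metadata store, creating directory entries
+   * for all of its parents up to the root.
+   * @param ms metadata store to update
+   * @param f status entry of the file to add
+   * @throws IOException IO problem
+   */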
+  protected static void putFile(MetadataStore ms, S3AFileStatus f)
+      throws IOException {
+    assertNotNull(f);
+    ms.put(new PathMetadata(f));
+    Path parent = f.getPath().getParent();
+    while (parent != null) {
+      S3AFileStatus dir = new S3AFileStatus(false, parent, f.getOwner());
+      ms.put(new PathMetadata(dir));
+      parent = parent.getParent();
+    }
+  }
+
+  /**
+   * Create file either on S3 or in metadata store.
+   * @param path the file path.
+   * @param onS3 set to true to create the file on S3.
+   * @param onMetadataStore set to true to create the file on the
+   *                        metadata store.
+   * @throws IOException IO problem
+   */
+  protected void createFile(Path path, boolean onS3, boolean onMetadataStore)
+      throws IOException {
+    if (onS3) {
+      ContractTestUtils.touch(getFileSystem(), path);
+    }
+
+    if (onMetadataStore) {
+      S3AFileStatus status = new S3AFileStatus(100L, System.currentTimeMillis(),
+          getFileSystem().qualify(path), 512L, "hdfs");
+      putFile(ms, status);
+    }
+  }
+
+  private void testPruneCommand(Configuration cmdConf, String...args)
+      throws Exception {
+    Path parent = path("prune-cli");
+    try {
+      getFileSystem().mkdirs(parent);
+
+      S3GuardTool.Prune cmd = new S3GuardTool.Prune(cmdConf);
+      cmd.setMetadataStore(ms);
+
+      createFile(new Path(parent, "stale"), true, true);
+      Thread.sleep(TimeUnit.SECONDS.toMillis(2));
+      createFile(new Path(parent, "fresh"), true, true);
+
+      assertEquals(2, ms.listChildren(parent).getListing().size());
+      expectSuccess("Prune command did not exit successfully - see output", cmd,
+          args);
+      assertEquals(1, ms.listChildren(parent).getListing().size());
+    } finally {
+      getFileSystem().delete(parent, true);
+      ms.prune(Long.MAX_VALUE);
+    }
+  }
+
+  @Test
+  public void testPruneCommandCLI() throws Exception {
+    String testPath = path("testPruneCommandCLI").toString();
+    testPruneCommand(getFileSystem().getConf(),
+        "prune", "-seconds", "1", testPath);
+  }
+
+  @Test
+  public void testPruneCommandConf() throws Exception {
+    getConfiguration().setLong(Constants.S3GUARD_CLI_PRUNE_AGE,
+        TimeUnit.SECONDS.toMillis(1));
+    String testPath = path("testPruneCommandConf").toString();
+    testPruneCommand(getConfiguration(), "prune", testPath);
+  }
+}

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/DynamoDBLocalClientFactory.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/DynamoDBLocalClientFactory.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/DynamoDBLocalClientFactory.java
index 750cfb3..d584850 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/DynamoDBLocalClientFactory.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/DynamoDBLocalClientFactory.java
@@ -48,12 +48,20 @@ import static org.apache.hadoop.fs.s3a.s3guard.DynamoDBClientFactory.DefaultDyna
 * in DynamoDBLocal. This is for testing purposes only.
  *
  * To use this for creating DynamoDB client in tests:
- * 1. As all DynamoDBClientFactory implementations, this should be configured.
- * 2. The singleton DynamoDBLocal server instance is started automatically when
+ * <ol>
+ * <li>
+ *    As all DynamoDBClientFactory implementations, this should be configured.
+ * </li>
+ * <li>
+ *    The singleton DynamoDBLocal server instance is started automatically when
 *    creating the AmazonDynamoDB client for the first time. It is still
 *    worth launching the server before all the tests, to fail fast if an
 *    error happens.
- * 3. The sever can be stopped explicitly, which is not actually needed in tests
- *    as JVM termination will do that.
+ * </li>
+ * <li>
+ *    The server can be stopped explicitly, which is not actually needed in
+ *    tests as JVM termination will do that.
+ * </li>
+ * </ol>
  *
  * @see DefaultDynamoDBClientFactory
  */

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardConcurrentOps.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardConcurrentOps.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardConcurrentOps.java
index 24eb6fb..21f2cc8 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardConcurrentOps.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardConcurrentOps.java
@@ -18,10 +18,6 @@
 
 package org.apache.hadoop.fs.s3a.s3guard;
 
-import com.amazonaws.services.dynamodbv2.document.DynamoDB;
-import com.amazonaws.services.dynamodbv2.document.Table;
-import com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException;
-
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Random;
@@ -33,17 +29,20 @@ import java.util.concurrent.ThreadFactory;
 import java.util.concurrent.ThreadPoolExecutor;
 import java.util.concurrent.atomic.AtomicInteger;
 
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.contract.ContractTestUtils;
-import org.apache.hadoop.fs.s3a.AbstractS3ATestBase;
-import org.apache.hadoop.fs.s3a.Constants;
-
-import org.apache.commons.lang3.StringUtils;
+import com.amazonaws.services.dynamodbv2.document.DynamoDB;
+import com.amazonaws.services.dynamodbv2.document.Table;
+import com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException;
 import org.junit.Assume;
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.Timeout;
 
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.contract.ContractTestUtils;
+import org.apache.hadoop.fs.s3a.AbstractS3ATestBase;
+import org.apache.hadoop.fs.s3a.Constants;
+
 import static org.apache.hadoop.fs.s3a.Constants.S3GUARD_DDB_REGION_KEY;
 
 /**

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolDynamoDB.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolDynamoDB.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolDynamoDB.java
index c2e4f5c..d2aa56f 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolDynamoDB.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolDynamoDB.java
@@ -18,24 +18,24 @@
 
 package org.apache.hadoop.fs.s3a.s3guard;
 
+import java.io.IOException;
+import java.util.Random;
+import java.util.concurrent.Callable;
+
 import com.amazonaws.services.dynamodbv2.document.DynamoDB;
 import com.amazonaws.services.dynamodbv2.document.Table;
 import com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException;
+import org.junit.Test;
+
 import org.apache.hadoop.fs.s3a.S3AFileSystem;
 import org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.Destroy;
 import org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.Init;
-import org.junit.Test;
-
-import java.io.IOException;
-import java.util.Random;
-
-import static org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.INVALID_ARGUMENT;
-import static org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.SUCCESS;
+import org.apache.hadoop.test.LambdaTestUtils;
 
 /**
  * Test S3Guard related CLI commands against DynamoDB.
  */
-public class ITestS3GuardToolDynamoDB extends S3GuardToolTestBase {
+public class ITestS3GuardToolDynamoDB extends AbstractS3GuardToolTestBase {
 
   @Override
   protected MetadataStore newMetadataStore() {
@@ -58,39 +58,38 @@ public class ITestS3GuardToolDynamoDB extends S3GuardToolTestBase {
 
   @Test
   public void testInvalidRegion() throws Exception {
-    String testTableName = "testInvalidRegion" + new Random().nextInt();
+    final String testTableName = "testInvalidRegion" + new Random().nextInt();
     String testRegion = "invalidRegion";
     // Initialize MetadataStore
-    Init initCmd = new Init(getFs().getConf());
-    try {
-      initCmd.run(new String[]{
-          "init",
-          "-region", testRegion,
-          "-meta", "dynamodb://" + testTableName
-      });
-    } catch (IOException e) {
-      // Expected
-      return;
-    }
-    fail("Use of invalid region did not fail - table may have been " +
-        "created and not cleaned up: " + testTableName);
+    Init initCmd = new Init(getFileSystem().getConf());
+    LambdaTestUtils.intercept(IOException.class,
+        new Callable<String>() {
+          @Override
+          public String call() throws Exception {
+            int res = initCmd.run(new String[]{
+                "init",
+                "-region", testRegion,
+                "-meta", "dynamodb://" + testTableName
+            });
+            return "Use of invalid region did not fail, returning " + res
+                + "- table may have been " +
+                "created and not cleaned up: " + testTableName;
+          }
+        });
   }
 
   @Test
-  public void testDynamoDBInitDestroyCycle() throws IOException,
-      InterruptedException {
+  public void testDynamoDBInitDestroyCycle() throws Exception {
     String testTableName = "testDynamoDBInitDestroy" + new Random().nextInt();
     String testS3Url = path(testTableName).toString();
-    S3AFileSystem fs = getFs();
+    S3AFileSystem fs = getFileSystem();
     DynamoDB db = null;
     try {
       // Initialize MetadataStore
       Init initCmd = new Init(fs.getConf());
-      assertEquals("Init command did not exit successfully - see output",
-          SUCCESS, initCmd.run(new String[]{
-              "init", "-meta", "dynamodb://" + testTableName,
-              testS3Url
-          }));
+      expectSuccess("Init command did not exit successfully - see output",
+          initCmd,
+          "init", "-meta", "dynamodb://" + testTableName, testS3Url);
       // Verify it exists
       MetadataStore ms = getMetadataStore();
       assertTrue("metadata store should be DynamoDBMetadataStore",
@@ -102,18 +101,24 @@ public class ITestS3GuardToolDynamoDB extends S3GuardToolTestBase {
 
       // Destroy MetadataStore
       Destroy destroyCmd = new Destroy(fs.getConf());
-      assertEquals("Destroy command did not exit successfully - see output",
-          SUCCESS, destroyCmd.run(new String[]{
-              "destroy", "-meta", "dynamodb://" + testTableName,
-              testS3Url
-          }));
+
+      expectSuccess("Destroy command did not exit successfully - see output",
+          destroyCmd,
+          "destroy", "-meta", "dynamodb://" + testTableName, testS3Url);
       // Verify it does not exist
       assertFalse(String.format("%s still exists", testTableName),
           exist(db, testTableName));
+
+      // delete again and expect success again
+      expectSuccess("Destroy command did not exit successfully - see output",
+          destroyCmd,
+          "destroy", "-meta", "dynamodb://" + testTableName, testS3Url);
     } catch (ResourceNotFoundException e) {
-      fail(String.format("DynamoDB table %s does not exist", testTableName));
+      throw new AssertionError(
+          String.format("DynamoDB table %s does not exist", testTableName),
+          e);
     } finally {
-      System.out.println("Warning! Table may have not been cleaned up: " +
+      LOG.warn("Table may not have been cleaned up: " +
           testTableName);
       if (db != null) {
         Table table = db.getTable(testTableName);

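For readers skimming the patch: `LambdaTestUtils.intercept` evaluates a callable and raises an assertion unless the expected exception class is thrown; if the callable returns normally, its return value becomes the failure message, which is why the test above builds the "did not fail" text inside the callable. A minimal, self-contained sketch of the same idiom in Java 8 lambda form; the `FileNotFoundException` and message here are illustrative only:

```java
import java.io.FileNotFoundException;

import org.apache.hadoop.test.LambdaTestUtils;

public class InterceptExample {
  public static void main(String[] args) throws Exception {
    // intercept() runs the callable and fails unless an exception of the
    // requested class escapes; the caught exception is returned so further
    // assertions can be made on it.
    FileNotFoundException caught = LambdaTestUtils.intercept(
        FileNotFoundException.class,
        () -> {
          // a callable that returned normally would fail the test, with its
          // return value used as the failure message
          throw new FileNotFoundException("no such table");
        });
    System.out.println("caught as expected: " + caught);
  }
}
```
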
http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolLocal.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolLocal.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolLocal.java
index d9d6a42..992b8f6 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolLocal.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolLocal.java
@@ -18,12 +18,6 @@
 
 package org.apache.hadoop.fs.s3a.s3guard;
 
-import org.apache.hadoop.fs.FSDataOutputStream;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.fs.s3a.S3AFileSystem;
-import org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.Diff;
-import org.junit.Test;
-
 import java.io.BufferedReader;
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
@@ -33,14 +27,20 @@ import java.io.PrintStream;
 import java.util.HashSet;
 import java.util.Set;
 
+import org.junit.Test;
+
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.Diff;
+
 import static org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.SUCCESS;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.fail;
+
 
 /**
  * Test S3Guard related CLI commands against a LocalMetadataStore.
  */
-public class ITestS3GuardToolLocal extends S3GuardToolTestBase {
+public class ITestS3GuardToolLocal extends AbstractS3GuardToolTestBase {
 
   @Override
   protected MetadataStore newMetadataStore() {
@@ -48,8 +48,8 @@ public class ITestS3GuardToolLocal extends S3GuardToolTestBase {
   }
 
   @Test
-  public void testImportCommand() throws IOException {
-    S3AFileSystem fs = getFs();
+  public void testImportCommand() throws Exception {
+    S3AFileSystem fs = getFileSystem();
     MetadataStore ms = getMetadataStore();
     Path parent = path("test-import");
     fs.mkdirs(parent);
@@ -67,8 +67,9 @@ public class ITestS3GuardToolLocal extends S3GuardToolTestBase {
     S3GuardTool.Import cmd = new S3GuardTool.Import(fs.getConf());
     cmd.setMetadataStore(ms);
 
-    assertEquals("Import command did not exit successfully - see output",
-        SUCCESS, cmd.run(new String[]{"import", parent.toString()}));
+    expectSuccess("Import command did not exit successfully - see output",
+        cmd,
+        "import", parent.toString());
 
     DirListingMetadata children =
         ms.listChildren(dir);
@@ -81,7 +82,7 @@ public class ITestS3GuardToolLocal extends S3GuardToolTestBase {
 
   @Test
   public void testDiffCommand() throws IOException {
-    S3AFileSystem fs = getFs();
+    S3AFileSystem fs = getFileSystem();
     MetadataStore ms = getMetadataStore();
     Set<Path> filesOnS3 = new HashSet<>(); // files on S3.
     Set<Path> filesOnMS = new HashSet<>(); // files on metadata store.
@@ -114,30 +115,29 @@ public class ITestS3GuardToolLocal extends S3GuardToolTestBase {
     assertEquals("Diff command did not exit successfully - see output", SUCCESS,
         cmd.run(new String[]{"diff", "-meta", "local://metadata",
             testPath.toString()}, out));
+    out.close();
 
     Set<Path> actualOnS3 = new HashSet<>();
     Set<Path> actualOnMS = new HashSet<>();
     boolean duplicates = false;
-    try (ByteArrayInputStream in =
-             new ByteArrayInputStream(buf.toByteArray())) {
-      try (BufferedReader reader =
-               new BufferedReader(new InputStreamReader(in))) {
-        String line;
-        while ((line = reader.readLine()) != null) {
-          String[] fields = line.split("\\s");
-          assertEquals("[" + line + "] does not have enough fields",
-              4, fields.length);
-          String where = fields[0];
-          Path path = new Path(fields[3]);
-          if (Diff.S3_PREFIX.equals(where)) {
-            duplicates = duplicates || actualOnS3.contains(path);
-            actualOnS3.add(path);
-          } else if (Diff.MS_PREFIX.equals(where)) {
-            duplicates = duplicates || actualOnMS.contains(path);
-            actualOnMS.add(path);
-          } else {
-            fail("Unknown prefix: " + where);
-          }
+    try (BufferedReader reader =
+             new BufferedReader(new InputStreamReader(
+                 new ByteArrayInputStream(buf.toByteArray())))) {
+      String line;
+      while ((line = reader.readLine()) != null) {
+        String[] fields = line.split("\\s");
+        assertEquals("[" + line + "] does not have enough fields",
+            4, fields.length);
+        String where = fields[0];
+        Path path = new Path(fields[3]);
+        if (Diff.S3_PREFIX.equals(where)) {
+          duplicates = duplicates || actualOnS3.contains(path);
+          actualOnS3.add(path);
+        } else if (Diff.MS_PREFIX.equals(where)) {
+          duplicates = duplicates || actualOnMS.contains(path);
+          actualOnMS.add(path);
+        } else {
+          fail("Unknown prefix: " + where);
         }
       }
     }

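A note on the parsing loop above: each line of `diff` output is split on whitespace and expected to carry four fields, the first naming the side that holds the entry (`Diff.S3_PREFIX` or `Diff.MS_PREFIX`) and the fourth the fully qualified path. A standalone sketch of the same parse follows; the sample lines and the literal "S3"/"MS" prefix strings are hypothetical stand-ins for the real constants in `S3GuardTool.Diff`:

```java
import java.util.HashSet;
import java.util.Set;

public class DiffOutputParser {
  // Hypothetical prefix values; the real ones are S3GuardTool.Diff.S3_PREFIX
  // and S3GuardTool.Diff.MS_PREFIX.
  private static final String S3_PREFIX = "S3";
  private static final String MS_PREFIX = "MS";

  public static void main(String[] args) {
    // Hypothetical sample output: side, two metadata columns, path.
    String[] lines = {
        "S3 D 0 s3a://bucket/test-diff/s3_only",
        "MS F 100 s3a://bucket/test-diff/ms_only"
    };
    Set<String> onlyOnS3 = new HashSet<>();
    Set<String> onlyInStore = new HashSet<>();
    for (String line : lines) {
      String[] fields = line.split("\\s");
      if (fields.length != 4) {
        throw new IllegalStateException(
            "[" + line + "] does not have exactly four fields");
      }
      String where = fields[0];   // which side holds the entry
      String path = fields[3];    // fully qualified path
      if (S3_PREFIX.equals(where)) {
        onlyOnS3.add(path);
      } else if (MS_PREFIX.equals(where)) {
        onlyInStore.add(path);
      } else {
        throw new IllegalStateException("Unknown prefix: " + where);
      }
    }
    System.out.println("only on S3: " + onlyOnS3
        + "; only in metadata store: " + onlyInStore);
  }
}
```
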
http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStoreTestBase.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStoreTestBase.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStoreTestBase.java
index 72cc512..c19ae91 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStoreTestBase.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/MetadataStoreTestBase.java
@@ -18,14 +18,12 @@
 
 package org.apache.hadoop.fs.s3a.s3guard;
 
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.fs.RemoteIterator;
-import org.apache.hadoop.fs.permission.FsPermission;
-import org.apache.hadoop.fs.s3a.S3ATestUtils;
-import org.apache.hadoop.fs.s3a.Tristate;
-import org.apache.hadoop.io.IOUtils;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
 
 import com.google.common.collect.Sets;
 import org.junit.After;
@@ -36,12 +34,14 @@ import org.junit.Test;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.Collection;
-import java.util.HashSet;
-import java.util.Set;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.fs.s3a.S3ATestUtils;
+import org.apache.hadoop.fs.s3a.Tristate;
+import org.apache.hadoop.io.IOUtils;
 
 /**
  * Main test class for MetadataStore implementations.
@@ -139,7 +139,6 @@ public abstract class MetadataStoreTestBase extends Assert {
    * MetadataStoreListFilesIterator behavior.
    * @param createNodes List of paths to create
    * @param checkNodes List of paths that the iterator should return
-   * @throws IOException
    */
   private void doTestDescendantsIterator(
       Class implementation, String[] createNodes,
@@ -504,13 +503,7 @@ public abstract class MetadataStoreTestBase extends Assert {
       assertListingsEqual(dirMeta.getListing(), "/a1/b1", "/a1/b2");
     }
 
-    // TODO
-    // 1. Add properties query to MetadataStore interface
-    // supportsAuthoritativeDirectories() or something.
-    // 2. Add "isNew" flag to MetadataStore.put(DirListingMetadata)
-    // 3. If #1 is true, assert that directory is still fully cached here.
-    // assertTrue("Created dir is fully cached", dirMeta.isAuthoritative());
-
+    // TODO HADOOP-14756 instrument MetadataStore for asserting & testing
     dirMeta = ms.listChildren(strToPath("/a1/b1"));
     if (!allowMissing() || dirMeta != null) {
       assertListingsEqual(dirMeta.getListing(), "/a1/b1/file1", "/a1/b1/file2",
@@ -599,7 +592,6 @@ public abstract class MetadataStoreTestBase extends Assert {
   /**
    * Test that the MetadataStore differentiates between the same path in two
    * different buckets.
-   * @throws Exception
    */
   @Test
   public void testMultiBucketPaths() throws Exception {
@@ -621,7 +613,7 @@ public abstract class MetadataStoreTestBase extends Assert {
     if (!allowMissing()) {
       ms.delete(new Path(p2));
       meta = ms.get(new Path(p1));
-      assertNotNull("Path should not have been deleted");
+      assertNotNull("Path should not have been deleted", meta);
     }
     ms.delete(new Path(p1));
   }
@@ -722,7 +714,8 @@ public abstract class MetadataStoreTestBase extends Assert {
    */
 
   /** Modifies paths input array and returns it. */
-  private String[] buildPathStrings(String parent, String... paths) {
+  private String[] buildPathStrings(String parent, String... paths)
+      throws IOException {
     for (int i = 0; i < paths.length; i++) {
       Path p = new Path(strToPath(parent), paths[i]);
       paths[i] = p.toString();
@@ -752,7 +745,7 @@ public abstract class MetadataStoreTestBase extends Assert {
   }
 
   private void assertListingsEqual(Collection<PathMetadata> listing,
-      String ...pathStrs) {
+      String ...pathStrs) throws IOException {
     Set<Path> a = new HashSet<>();
     for (PathMetadata meta : listing) {
       a.add(meta.getFileStatus().getPath());
@@ -768,8 +761,8 @@ public abstract class MetadataStoreTestBase extends Assert {
   private void putListStatusFiles(String dirPath, boolean authoritative,
       String... filenames) throws IOException {
     ArrayList<PathMetadata> metas = new ArrayList<>(filenames.length);
-    for (int i = 0; i < filenames.length; i++) {
-      metas.add(new PathMetadata(makeFileStatus(filenames[i], 100)));
+    for (String filename : filenames) {
+      metas.add(new PathMetadata(makeFileStatus(filename, 100)));
     }
     DirListingMetadata dirMeta =
         new DirListingMetadata(strToPath(dirPath), metas, authoritative);
@@ -825,7 +818,7 @@ public abstract class MetadataStoreTestBase extends Assert {
   /**
    * Convenience to create a fully qualified Path from string.
    */
-  Path strToPath(String p) {
+  Path strToPath(String p) throws IOException {
     final Path path = new Path(p);
     assert path.isAbsolute();
     return path.makeQualified(contract.getFileSystem().getUri(), null);

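The `strToPath` helper above is worth a second look: test paths are written as bare strings such as `/a1/b1`, and `Path.makeQualified` stamps the contract filesystem's scheme and authority onto them. A minimal sketch of that qualification step, using a hypothetical bucket URI in place of `contract.getFileSystem().getUri()`:

```java
import java.net.URI;

import org.apache.hadoop.fs.Path;

public class QualifyExample {
  public static void main(String[] args) {
    // A store-relative path, as used throughout MetadataStoreTestBase.
    Path relative = new Path("/a1/b1/file1");
    // Hypothetical filesystem URI; the tests obtain this from
    // contract.getFileSystem().getUri().
    URI fsUri = URI.create("s3a://example-bucket");
    // makeQualified stamps the scheme and authority onto the path,
    // yielding s3a://example-bucket/a1/b1/file1.
    Path qualified = relative.makeQualified(fsUri, null);
    System.out.println(qualified);
  }
}
```
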
http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardToolTestBase.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardToolTestBase.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardToolTestBase.java
deleted file mode 100644
index 5254010..0000000
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardToolTestBase.java
+++ /dev/null
@@ -1,159 +0,0 @@
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- * <p>
- * http://www.apache.org/licenses/LICENSE-2.0
- * <p>
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.hadoop.fs.s3a.s3guard;
-
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.fs.contract.ContractTestUtils;
-import org.apache.hadoop.fs.s3a.AbstractS3ATestBase;
-import org.apache.hadoop.fs.s3a.Constants;
-import org.apache.hadoop.fs.s3a.S3AFileStatus;
-import org.apache.hadoop.fs.s3a.S3AFileSystem;
-import org.apache.hadoop.fs.s3a.S3ATestUtils;
-import org.junit.After;
-import org.junit.Before;
-import org.junit.Test;
-import static org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.SUCCESS;
-
-import java.io.IOException;
-import java.util.concurrent.TimeUnit;
-
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertEquals;
-
-/**
- * Common functionality for S3GuardTool test cases.
- */
-public abstract class S3GuardToolTestBase extends AbstractS3ATestBase {
-
-  protected static final String OWNER = "hdfs";
-
-  private Configuration conf;
-  private MetadataStore ms;
-  private S3AFileSystem fs;
-
-  protected Configuration getConf() {
-    return conf;
-  }
-
-  protected MetadataStore getMetadataStore() {
-    return ms;
-  }
-
-  protected S3AFileSystem getFs() {
-    return fs;
-  }
-
-  protected abstract MetadataStore newMetadataStore();
-
-  @Before
-  public void setUp() throws Exception {
-    conf = new Configuration();
-    fs = S3ATestUtils.createTestFileSystem(conf);
-    S3ATestUtils.assumeS3GuardState(true, getConf());
-    ms = newMetadataStore();
-    ms.initialize(fs);
-  }
-
-  @After
-  public void tearDown() {
-  }
-
-  protected void mkdirs(Path path, boolean onS3, boolean onMetadataStore)
-      throws IOException {
-    if (onS3) {
-      fs.mkdirs(path);
-    }
-    if (onMetadataStore) {
-      S3AFileStatus status = new S3AFileStatus(true, path, OWNER);
-      ms.put(new PathMetadata(status));
-    }
-  }
-
-  protected static void putFile(MetadataStore ms, S3AFileStatus f)
-      throws IOException {
-    assertNotNull(f);
-    ms.put(new PathMetadata(f));
-    Path parent = f.getPath().getParent();
-    while (parent != null) {
-      S3AFileStatus dir = new S3AFileStatus(false, parent, f.getOwner());
-      ms.put(new PathMetadata(dir));
-      parent = parent.getParent();
-    }
-  }
-
-  /**
-   * Create file either on S3 or in metadata store.
-   * @param path the file path.
-   * @param onS3 set to true to create the file on S3.
-   * @param onMetadataStore set to true to create the file on the
-   *                        metadata store.
-   * @throws IOException
-   */
-  protected void createFile(Path path, boolean onS3, boolean onMetadataStore)
-      throws IOException {
-    if (onS3) {
-      ContractTestUtils.touch(fs, path);
-    }
-
-    if (onMetadataStore) {
-      S3AFileStatus status = new S3AFileStatus(100L, System.currentTimeMillis(),
-          fs.qualify(path), 512L, "hdfs");
-      putFile(ms, status);
-    }
-  }
-
-  private void testPruneCommand(Configuration cmdConf, String[] args)
-      throws Exception {
-    Path parent = path("prune-cli");
-    try {
-      fs.mkdirs(parent);
-
-      S3GuardTool.Prune cmd = new S3GuardTool.Prune(cmdConf);
-      cmd.setMetadataStore(ms);
-
-      createFile(new Path(parent, "stale"), true, true);
-      Thread.sleep(TimeUnit.SECONDS.toMillis(2));
-      createFile(new Path(parent, "fresh"), true, true);
-
-      assertEquals(2, ms.listChildren(parent).getListing().size());
-      assertEquals("Prune command did not exit successfully - see output",
-          SUCCESS, cmd.run(args));
-      assertEquals(1, ms.listChildren(parent).getListing().size());
-    } finally {
-      fs.delete(parent, true);
-      ms.prune(Long.MAX_VALUE);
-    }
-  }
-
-  @Test
-  public void testPruneCommandCLI() throws Exception {
-    String testPath = path("testPruneCommandCLI").toString();
-    testPruneCommand(fs.getConf(), new String[]{"prune", "-seconds", "1",
-        testPath});
-  }
-
-  @Test
-  public void testPruneCommandConf() throws Exception {
-    conf.setLong(Constants.S3GUARD_CLI_PRUNE_AGE,
-        TimeUnit.SECONDS.toMillis(1));
-    String testPath = path("testPruneCommandConf").toString();
-    testPruneCommand(conf, new String[]{"prune", testPath});
-  }
-}

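The prune tests from the deleted base class live on in `AbstractS3GuardToolTestBase`. For context, `prune` deletes metadata entries older than a given age; a sketch of driving it programmatically, following the pattern of the removed `testPruneCommand` (the helper name and wiring here are illustrative, not part of the patch):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.s3a.s3guard.MetadataStore;
import org.apache.hadoop.fs.s3a.s3guard.S3GuardTool;

public class PruneSketch {
  /**
   * Prune metadata entries older than one second under the given path.
   * @param conf configuration for the tool
   * @param ms an already initialized metadata store
   * @param path fully qualified path to prune under
   * @return the tool's exit code; 0 (SUCCESS) is expected
   */
  static int pruneOlderThanOneSecond(Configuration conf, MetadataStore ms,
      String path) throws Exception {
    S3GuardTool.Prune cmd = new S3GuardTool.Prune(conf);
    // Inject the store under test instead of creating one from the config,
    // exactly as the removed testPruneCommand did.
    cmd.setMetadataStore(ms);
    return cmd.run(new String[]{"prune", "-seconds", "1", path});
  }
}
```
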
http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDirListingMetadata.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDirListingMetadata.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDirListingMetadata.java
index 3b0a059..957ebe0 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDirListingMetadata.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDirListingMetadata.java
@@ -18,20 +18,20 @@
 
 package org.apache.hadoop.fs.s3a.s3guard;
 
-import static org.hamcrest.CoreMatchers.*;
-import static org.junit.Assert.*;
-
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.List;
 
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.fs.s3a.S3AFileStatus;
-
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.ExpectedException;
 
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.S3AFileStatus;
+
+import static org.hamcrest.CoreMatchers.notNullValue;
+import static org.junit.Assert.*;
+
 /**
  * Unit tests of {@link DirListingMetadata}.
  */

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDynamoDBMetadataStore.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDynamoDBMetadataStore.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDynamoDBMetadataStore.java
index bde624d..038a399 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDynamoDBMetadataStore.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestDynamoDBMetadataStore.java
@@ -50,7 +50,6 @@ import org.apache.hadoop.fs.CommonConfigurationKeysPublic;
 import org.apache.hadoop.fs.FileStatus;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.fs.s3a.Constants;
 import org.apache.hadoop.fs.s3a.MockS3ClientFactory;
 import org.apache.hadoop.fs.s3a.S3AFileStatus;
 import org.apache.hadoop.fs.s3a.S3AFileSystem;
@@ -81,7 +80,7 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
       LoggerFactory.getLogger(TestDynamoDBMetadataStore.class);
   private static final String BUCKET = "TestDynamoDBMetadataStore";
   private static final String S3URI =
-      URI.create(Constants.FS_S3A + "://" + BUCKET + "/").toString();
+      URI.create(FS_S3A + "://" + BUCKET + "/").toString();
   public static final PrimaryKey
       VERSION_MARKER_PRIMARY_KEY = createVersionMarkerPrimaryKey(
       DynamoDBMetadataStore.VERSION_MARKER);
@@ -135,9 +134,9 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
           S3ClientFactory.class);
       conf.set(CommonConfigurationKeysPublic.FS_DEFAULT_NAME_KEY, S3URI);
       // setting config for creating a DynamoDBClient against local server
-      conf.set(Constants.ACCESS_KEY, "dummy-access-key");
-      conf.set(Constants.SECRET_KEY, "dummy-secret-key");
-      conf.setBoolean(Constants.S3GUARD_DDB_TABLE_CREATE_KEY, true);
+      conf.set(ACCESS_KEY, "dummy-access-key");
+      conf.set(SECRET_KEY, "dummy-secret-key");
+      conf.setBoolean(S3GUARD_DDB_TABLE_CREATE_KEY, true);
       conf.setClass(S3Guard.S3GUARD_DDB_CLIENT_FACTORY_IMPL,
           DynamoDBLocalClientFactory.class, DynamoDBClientFactory.class);
 
@@ -181,7 +180,7 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
     return (DynamoDBMetadataStore) getContract().getMetadataStore();
   }
 
-  private S3AFileSystem getFileSystem() {
+  private S3AFileSystem getFileSystem() throws IOException {
     return (S3AFileSystem) getContract().getFileSystem();
   }
 
@@ -194,13 +193,13 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
     final String tableName = "testInitializeWithFileSystem";
     final S3AFileSystem s3afs = getFileSystem();
     final Configuration conf = s3afs.getConf();
-    conf.set(Constants.S3GUARD_DDB_TABLE_NAME_KEY, tableName);
+    conf.set(S3GUARD_DDB_TABLE_NAME_KEY, tableName);
     try (DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore()) {
       ddbms.initialize(s3afs);
       verifyTableInitialized(tableName);
       assertNotNull(ddbms.getTable());
       assertEquals(tableName, ddbms.getTable().getTableName());
-      String expectedRegion = conf.get(Constants.S3GUARD_DDB_REGION_KEY,
+      String expectedRegion = conf.get(S3GUARD_DDB_REGION_KEY,
           s3afs.getBucketLocation(tableName));
       assertEquals("DynamoDB table should be in configured region or the same" +
               " region as S3 bucket",
@@ -217,24 +216,24 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
   public void testInitializeWithConfiguration() throws IOException {
     final String tableName = "testInitializeWithConfiguration";
     final Configuration conf = getFileSystem().getConf();
-    conf.unset(Constants.S3GUARD_DDB_TABLE_NAME_KEY);
-    String savedRegion = conf.get(Constants.S3GUARD_DDB_REGION_KEY,
+    conf.unset(S3GUARD_DDB_TABLE_NAME_KEY);
+    String savedRegion = conf.get(S3GUARD_DDB_REGION_KEY,
         getFileSystem().getBucketLocation());
-    conf.unset(Constants.S3GUARD_DDB_REGION_KEY);
+    conf.unset(S3GUARD_DDB_REGION_KEY);
     try (DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore()) {
       ddbms.initialize(conf);
       fail("Should have failed because the table name is not set!");
     } catch (IllegalArgumentException ignored) {
     }
     // config table name
-    conf.set(Constants.S3GUARD_DDB_TABLE_NAME_KEY, tableName);
-    try  (DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore()){
+    conf.set(S3GUARD_DDB_TABLE_NAME_KEY, tableName);
+    try (DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore()) {
       ddbms.initialize(conf);
       fail("Should have failed because as the region is not set!");
     } catch (IllegalArgumentException ignored) {
     }
     // config region
-    conf.set(Constants.S3GUARD_DDB_REGION_KEY, savedRegion);
+    conf.set(S3GUARD_DDB_REGION_KEY, savedRegion);
     try (DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore()) {
       ddbms.initialize(conf);
       verifyTableInitialized(tableName);
@@ -348,12 +347,12 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
   @Test
   public void testTableVersionRequired() throws Exception {
     Configuration conf = getFileSystem().getConf();
-    int maxRetries = conf.getInt(Constants.S3GUARD_DDB_MAX_RETRIES, Constants
-        .S3GUARD_DDB_MAX_RETRIES_DEFAULT);
-    conf.setInt(Constants.S3GUARD_DDB_MAX_RETRIES, 3);
+    int maxRetries = conf.getInt(S3GUARD_DDB_MAX_RETRIES,
+        S3GUARD_DDB_MAX_RETRIES_DEFAULT);
+    conf.setInt(S3GUARD_DDB_MAX_RETRIES, 3);
 
     final DynamoDBMetadataStore ddbms = createContract(conf).getMetadataStore();
-    String tableName = conf.get(Constants.S3GUARD_DDB_TABLE_NAME_KEY, BUCKET);
+    String tableName = conf.get(S3GUARD_DDB_TABLE_NAME_KEY, BUCKET);
     Table table = verifyTableInitialized(tableName);
     table.deleteItem(VERSION_MARKER_PRIMARY_KEY);
 
@@ -361,7 +360,7 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
     intercept(IOException.class, E_NO_VERSION_MARKER,
         () -> ddbms.initTable());
 
-    conf.setInt(Constants.S3GUARD_DDB_MAX_RETRIES, maxRetries);
+    conf.setInt(S3GUARD_DDB_MAX_RETRIES, maxRetries);
   }
 
   /**
@@ -371,16 +370,15 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
   @Test
   public void testTableVersionMismatch() throws Exception {
     final DynamoDBMetadataStore ddbms = createContract().getMetadataStore();
-    String tableName = getFileSystem().getConf().get(Constants
-        .S3GUARD_DDB_TABLE_NAME_KEY, BUCKET);
+    String tableName = getFileSystem().getConf()
+        .get(S3GUARD_DDB_TABLE_NAME_KEY, BUCKET);
     Table table = verifyTableInitialized(tableName);
     table.deleteItem(VERSION_MARKER_PRIMARY_KEY);
     Item v200 = createVersionMarker(VERSION_MARKER, 200, 0);
     table.putItem(v200);
 
     // create existing table
-    intercept(IOException.class, E_INCOMPATIBLE_VERSION,
-        () -> ddbms.initTable());
+    intercept(IOException.class, E_INCOMPATIBLE_VERSION, ddbms::initTable);
   }
 
   /**
@@ -392,13 +390,12 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
     final String tableName = "testFailNonexistentTable";
     final S3AFileSystem s3afs = getFileSystem();
     final Configuration conf = s3afs.getConf();
-    conf.set(Constants.S3GUARD_DDB_TABLE_NAME_KEY, tableName);
-    conf.unset(Constants.S3GUARD_DDB_TABLE_CREATE_KEY);
-    try {
-      final DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore();
+    conf.set(S3GUARD_DDB_TABLE_NAME_KEY, tableName);
+    conf.unset(S3GUARD_DDB_TABLE_CREATE_KEY);
+    try (DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore()) {
       ddbms.initialize(s3afs);
-      fail("Should have failed as table does not exist and table auto-creation "
-          + "is disabled");
+      fail("Should have failed as table does not exist and table auto-creation"
+          + " is disabled");
     } catch (IOException ignored) {
     }
   }
@@ -425,11 +422,13 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
     assertTrue(status.isDirectory());
     // UNKNOWN is always a valid option, but true / false should not contradict
     if (isEmpty) {
-      assertTrue("Should not be marked non-empty",
-          rootMeta.isEmptyDirectory() != Tristate.FALSE);
+      assertNotSame("Should not be marked non-empty",
+          Tristate.FALSE,
+          rootMeta.isEmptyDirectory());
     } else {
-      assertTrue("Should not be marked empty",
-          rootMeta.isEmptyDirectory() != Tristate.TRUE);
+      assertNotSame("Should not be marked empty",
+          Tristate.TRUE,
+          rootMeta.isEmptyDirectory());
     }
   }
 
@@ -437,7 +436,7 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
   * Test that when moving nested paths, all of the destination's ancestors
   * up to the destination root are also created.
    * Here is the directory tree before move:
-   *
+   * <pre>
    * testMovePopulateAncestors
    * ├── a
    * │   └── b
@@ -448,7 +447,7 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
    * └── c
    *     └── d
    *         └── dest
-   *
+   * </pre>
    * As part of rename(a/b/src, d/c/dest), S3A will enumerate the subtree at
    * a/b/src.  This test verifies that after the move, the new subtree at
   * 'dest' is reachable from the root (i.e. c/ and c/d exist in the table).
@@ -525,7 +524,7 @@ public class TestDynamoDBMetadataStore extends MetadataStoreTestBase {
     final String tableName = "testDeleteTable";
     final S3AFileSystem s3afs = getFileSystem();
     final Configuration conf = s3afs.getConf();
-    conf.set(Constants.S3GUARD_DDB_TABLE_NAME_KEY, tableName);
+    conf.set(S3GUARD_DDB_TABLE_NAME_KEY, tableName);
     try (DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore()) {
       ddbms.initialize(s3afs);
       // we can list the empty table

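A pattern worth noting across the DynamoDB tests above: `DynamoDBMetadataStore` is `Closeable`, so each case binds the table name into the configuration and opens the store in try-with-resources, leaving the table itself in place unless `destroy()` is invoked. A condensed sketch of that lifecycle, with a placeholder table name; the constants are the static imports from `org.apache.hadoop.fs.s3a.Constants` introduced by this patch:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.s3a.S3AFileSystem;
import org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore;

import static org.apache.hadoop.fs.s3a.Constants.S3GUARD_DDB_TABLE_CREATE_KEY;
import static org.apache.hadoop.fs.s3a.Constants.S3GUARD_DDB_TABLE_NAME_KEY;

public class DdbStoreLifecycle {
  /**
   * Initialize a DynamoDB metadata store bound to the given filesystem.
   * The store is closed automatically on exit; the table outlives it
   * unless destroy() is called.
   */
  static void initAgainst(S3AFileSystem s3afs, String tableName)
      throws IOException {
    Configuration conf = s3afs.getConf();
    conf.set(S3GUARD_DDB_TABLE_NAME_KEY, tableName);
    // allow the store to create the table on demand
    conf.setBoolean(S3GUARD_DDB_TABLE_CREATE_KEY, true);
    try (DynamoDBMetadataStore ddbms = new DynamoDBMetadataStore()) {
      ddbms.initialize(s3afs);
      // ... operate on the store ...
    }
  }
}
```
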
http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestLocalMetadataStore.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestLocalMetadataStore.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestLocalMetadataStore.java
index 89d0498..1b765af 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestLocalMetadataStore.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestLocalMetadataStore.java
@@ -18,18 +18,18 @@
 
 package org.apache.hadoop.fs.s3a.s3guard;
 
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.junit.Test;
+
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileStatus;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.fs.s3a.S3ATestUtils;
 
-import org.junit.Test;
-
-import java.io.IOException;
-import java.util.HashMap;
-import java.util.Map;
-
 /**
  * MetadataStore unit test for {@link LocalMetadataStore}.
  */

http://git-wip-us.apache.org/repos/asf/hadoop/blob/e531ae25/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java
----------------------------------------------------------------------
diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java
index 5b19efa..c0541ea 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/TestNullMetadataStore.java
@@ -29,14 +29,9 @@ import java.io.IOException;
 public class TestNullMetadataStore extends MetadataStoreTestBase {
   private static class NullMSContract extends AbstractMSContract {
     @Override
-    public FileSystem getFileSystem() {
+    public FileSystem getFileSystem() throws IOException {
       Configuration config = new Configuration();
-      try {
-        return FileSystem.getLocal(config);
-      } catch (IOException e) {
-        fail("Error creating LocalFileSystem");
-        return null;
-      }
+      return FileSystem.getLocal(config);
     }
 
     @Override

