You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@kudu.apache.org by ad...@apache.org on 2018/01/22 05:38:58 UTC

kudu git commit: KUDU-2202: support for removing data directories (take two)

Repository: kudu
Updated Branches:
  refs/heads/master b81d5569a -> 415170808


KUDU-2202: support for removing data directories (take two)

This commit extends "kudu fs update_dirs" to allow the removal of data
directories. A data dir may be removed provided there are no tablets
configured to use it. The --force flag may be used to override this safety
check; tablets that depend on the removed data dir will fail to bootstrap
the next time the server is started.

I'll be the first to admit that this functionality isn't terribly useful at
the moment because Kudu deployments are configured to stripe data across all
data directories, so removing a data dir will fail all tablets, which is
equivalent to reformatting the node. Nevertheless, it will become more
useful as users customize the new --fs_target_data_dirs_per_tablet gflag.

This is the second attempt at implementing this functionality. The first
used a different interface and was vulnerable to a data loss issue
documented in KUDU-2202. Andrew fixed that in commit 47b81c4, which reopens
the door for safely removing data directories.

I considered a stronger check that only prohibits data dir removal if there
were data blocks on the to-be-removed data dir, but decided against it:
- It's rare to find a tablet configured to use a data dir but without any
  actual blocks on it. In fact, that's only likely for empty tables, or
  tables with empty range partitions.
- We'd still need to rewrite any unaffected tablet superblocks and remove
  the data dir from their data dir groups.

There are several interesting modifications here:
1. The DataDirManager and FsManager's update_on_disk boolean option has been
   converted into a tri-state. The new IGNORE_INCONSISTENCY state is used by
   the CLI tool to create a "speculative" FsManager using the new data dir
   configuration. This FsManager is used to check whether existing tablets
   depend on the data dir to be removed without making destructive changes
   to on-disk data structures.
3. The DataDirManager now checks for integrity after making on-disk changes.
   While not strictly necessary, this seems like a good sanity check to
   avoid introducing bugs.
4. The "kudu fs update_dirs" action was updated for data dir removal. It now
   accepts the --force option, without which removal will only be allowed if
   no tablets are configured to use the removed data dir.

Change-Id: Iacb650115bcdf055542ef2777641a3201fc8e30b
Reviewed-on: http://gerrit.cloudera.org:8080/8978
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/41517080
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/41517080
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/41517080

Branch: refs/heads/master
Commit: 4151708087cbd385765cd160afb693c78a688ca5
Parents: b81d556
Author: Adar Dembo <ad...@cloudera.com>
Authored: Mon Jan 8 18:28:15 2018 -0800
Committer: Adar Dembo <ad...@cloudera.com>
Committed: Mon Jan 22 05:38:37 2018 +0000

----------------------------------------------------------------------
 src/kudu/fs/data_dirs.cc         |  56 +++++++---
 src/kudu/fs/data_dirs.h          |  41 ++++---
 src/kudu/fs/fs_manager-test.cc   | 203 +++++++++++++++++++++++++++++-----
 src/kudu/fs/fs_manager.cc        |  20 ++--
 src/kudu/fs/fs_manager.h         |  25 +++--
 src/kudu/tools/kudu-tool-test.cc | 130 ++++++++++++++++++++--
 src/kudu/tools/tool_action_fs.cc |  54 ++++++++-
 7 files changed, 439 insertions(+), 90 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/41517080/src/kudu/fs/data_dirs.cc
----------------------------------------------------------------------
diff --git a/src/kudu/fs/data_dirs.cc b/src/kudu/fs/data_dirs.cc
index 46cae9c..f883797 100644
--- a/src/kudu/fs/data_dirs.cc
+++ b/src/kudu/fs/data_dirs.cc
@@ -338,7 +338,7 @@ Status DataDirGroup::CopyToPB(const UuidByUuidIndexMap& uuid_by_uuid_idx,
 DataDirManagerOptions::DataDirManagerOptions()
     : block_manager_type(FLAGS_block_manager),
       read_only(false),
-      update_on_disk(false) {
+      consistency_check(ConsistencyCheckBehavior::ENFORCE_CONSISTENCY) {
 }
 
 ////////////////////////////////////////////////////////////
@@ -360,7 +360,8 @@ DataDirManager::DataDirManager(Env* env,
       canonicalized_data_fs_roots_(std::move(canonicalized_data_roots)),
       rng_(GetRandomSeed32()) {
   DCHECK_GT(canonicalized_data_fs_roots_.size(), 0);
-  DCHECK(!opts_.update_on_disk || !opts_.read_only);
+  DCHECK(opts_.consistency_check != ConsistencyCheckBehavior::UPDATE_ON_DISK ||
+         !opts_.read_only);
 
   if (opts_.metric_entity) {
     metrics_.reset(new DataDirMetrics(opts_.metric_entity));
@@ -591,9 +592,10 @@ Status DataDirManager::LoadInstances(
     } else {
       // This may return OK and mark 'instance' as failed.
       Status s = instance->LoadFromDisk();
-      if (s.IsNotFound() && opts_.update_on_disk) {
-        // A missing instance is only tolerated if we've been asked to add new
-        // data directories.
+      if (s.IsNotFound() &&
+          opts_.consistency_check != ConsistencyCheckBehavior::ENFORCE_CONSISTENCY) {
+        // A missing instance is not tolerated if we've been asked to enforce
+        // consistency checks.
         missing_roots_tmp.emplace_back(root.path);
         continue;
       }
@@ -645,17 +647,11 @@ Status DataDirManager::Open() {
   vector<unique_ptr<PathInstanceMetadataFile>> loaded_instances;
   RETURN_NOT_OK(LoadInstances(&missing_roots, &loaded_instances));
 
-  // Check the integrity of all loaded instances.
-  RETURN_NOT_OK_PREPEND(
-      PathInstanceMetadataFile::CheckIntegrity(loaded_instances),
-      Substitute("could not verify integrity of files: $0",
-                 JoinStrings(GetDataDirs(), ",")));
-
-  // Create any missing data directories, if desired.
-  if (!missing_roots.empty()) {
+  // Add new or remove existing data directories, if desired.
+  if (opts_.consistency_check == ConsistencyCheckBehavior::UPDATE_ON_DISK) {
     if (opts_.block_manager_type == "file") {
       return Status::InvalidArgument(
-          "file block manager may not add new data directories");
+          "file block manager may not add or remove data directories");
     }
 
     // Prepare to create new directories and update existing instances. We
@@ -686,6 +682,7 @@ Status DataDirManager::Open() {
 
     // Now that we've created the missing directories, try loading the
     // directories again.
+    //
     // Note: 'loaded_instances' must be cleared to unlock the instance files.
     loaded_instances.clear();
     missing_roots.clear();
@@ -693,6 +690,14 @@ Status DataDirManager::Open() {
     DCHECK(missing_roots.empty());
   }
 
+  // Check the integrity of all loaded instances.
+  if (opts_.consistency_check != ConsistencyCheckBehavior::IGNORE_INCONSISTENCY) {
+    RETURN_NOT_OK_PREPEND(
+        PathInstanceMetadataFile::CheckIntegrity(loaded_instances),
+        Substitute("could not verify integrity of files: $0",
+                   JoinStrings(GetDataDirs(), ",")));
+  }
+
   // All instances are present and accounted for. Time to create the in-memory
   // data directory structures.
   int i = 0;
@@ -778,7 +783,11 @@ Status DataDirManager::Open() {
         break;
       }
     }
-    DCHECK_NE(idx, -1); // Guaranteed by CheckIntegrity().
+    if (idx == -1) {
+      return Status::IOError(Substitute(
+          "corrupt path set for data directory $0: uuid $1 not found in path set",
+          dd->dir(), path_set.uuid()));
+    }
     if (idx > kMaxDataDirs) {
       return Status::NotSupported(
           Substitute("block manager supports a maximum of $0 paths", kMaxDataDirs));
@@ -794,7 +803,22 @@ Status DataDirManager::Open() {
   for (int uuid_idx = 0; uuid_idx < path_set.all_uuids_size(); uuid_idx++) {
     if (!ContainsKey(uuid_by_idx, uuid_idx)) {
       const string& unassigned_uuid = path_set.all_uuids(uuid_idx);
-      DCHECK_LT(failed_dir_idx, unassigned_dirs.size());
+      if (failed_dir_idx == unassigned_dirs.size()) {
+        // This uuid's data directory is missing outright, which can only
+        // happen when opts_.consistency_check is IGNORE_INCONSISTENCY (i.e. if
+        // this DataDirManager is trialing a new data dir configuration that
+        // removes an existing data dir). In such a case, we don't bother
+        // tracking the missing directory at all.
+        //
+        // If opts_.consistency_check is UPDATE_ON_DISK, the on-disk instances
+        // were updated and reloaded above, ensuring that the list of data
+        // directories now excludes the missing directory.
+        //
+        // If opts_.consistency_check is ENFORCE_CONSISTENCY, the
+        // CheckIntegrity call above requires that on-disk instances and the
+        // list of data directories match, so there cannot be a missing data directory.
+        continue;
+      }
       insert_to_maps(uuid_idx, unassigned_uuid, unassigned_dirs[failed_dir_idx]);
 
       // Record the directory as failed.

http://git-wip-us.apache.org/repos/asf/kudu/blob/41517080/src/kudu/fs/data_dirs.h
----------------------------------------------------------------------
diff --git a/src/kudu/fs/data_dirs.h b/src/kudu/fs/data_dirs.h
index 8dcf371..0398e15 100644
--- a/src/kudu/fs/data_dirs.h
+++ b/src/kudu/fs/data_dirs.h
@@ -115,6 +115,21 @@ enum class DataDirFsType {
   OTHER
 };
 
+// Defines the behavior of the consistency checks performed when the directory
+// manager is opened.
+enum class ConsistencyCheckBehavior {
+  // If the data directories don't match the on-disk path sets, fail.
+  ENFORCE_CONSISTENCY,
+
+  // If the data directories don't match the on-disk path sets, update the
+  // on-disk data to match. The directory manager must not be read-only.
+  UPDATE_ON_DISK,
+
+  // If the data directories don't match the on-disk path sets, continue
+  // without updating the on-disk data.
+  IGNORE_INCONSISTENCY
+};
+
 struct DataDirMetrics {
   explicit DataDirMetrics(const scoped_refptr<MetricEntity>& entity);
 
@@ -213,14 +228,11 @@ struct DataDirManagerOptions {
   // Defaults to false.
   bool read_only;
 
-  // Whether or not the provided directories should be considered the new
-  // canonical set. If true, on-disk data structures will be updated
-  // accordingly when the DataDirManager is opened.
-  //
-  // Defaults to false. If true, 'read_only' must be false.
+  // The behavior to use when comparing the provided data directories to the
+  // on-disk path sets.
   //
-  // Currently only supports adding new data directories.
-  bool update_on_disk;
+  // Defaults to ENFORCE_CONSISTENCY.
+  ConsistencyCheckBehavior consistency_check;
 };
 
 // Encapsulates knowledge of data directory management on behalf of block
@@ -363,8 +375,7 @@ class DataDirManager {
   // Representation Conversion
   // ==========================================================================
 
-  // Finds a data directory by uuid index, returning nullptr if it can't be
-  // found.
+  // Finds a data directory by uuid index, returning null if it can't be found.
   //
   // More information on uuid indexes and their relation to data directories
   // can be found next to PathSetPB in fs.proto.
@@ -407,11 +418,15 @@ class DataDirManager {
   // metadata directory (i.e. the first one) fails to load.
   Status Open();
 
-  // Loads the instance files for each directory, populating 'missing_roots'
-  // with the data roots that are missing instance files, and
-  // 'loaded_instances' with successfully loaded instances.
+  // Loads the instance files for each data directory.
+  //
+  // If consistency checking is disabled, missing instance files are tolerated
+  // and will be written to 'missing_roots' on success.
+  //
+  // On success, 'loaded_instances' contains all successfully loaded instance files.
   //
-  // Returns an error if the metadata directory fails to load.
+  // Returns an error if an instance file fails to load or if all instance
+  // files are unhealthy.
   Status LoadInstances(std::vector<std::string>* missing_roots,
       std::vector<std::unique_ptr<PathInstanceMetadataFile>>* loaded_instances);
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/41517080/src/kudu/fs/fs_manager-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/fs/fs_manager-test.cc b/src/kudu/fs/fs_manager-test.cc
index 8157aae..e21fe2c 100644
--- a/src/kudu/fs/fs_manager-test.cc
+++ b/src/kudu/fs/fs_manager-test.cc
@@ -21,6 +21,7 @@
 #include <cstddef>
 #include <cstdint>
 #include <iostream>
+#include <iterator>
 #include <memory>
 #include <set>
 #include <string>
@@ -47,11 +48,13 @@
 #include "kudu/util/flags.h"
 #include "kudu/util/oid_generator.h"
 #include "kudu/util/path_util.h"
+#include "kudu/util/random.h"
 #include "kudu/util/slice.h"
 #include "kudu/util/status.h"
 #include "kudu/util/test_macros.h"
 #include "kudu/util/test_util.h"
 
+using kudu::fs::ConsistencyCheckBehavior;
 using std::shared_ptr;
 using std::string;
 using std::unique_ptr;
@@ -299,7 +302,7 @@ TEST_F(FsManagerTestBase, TestMetadataDirInDataRoot) {
   // Adding a data dir to the front of the FS root list (i.e. such that the
   // metadata root is no longer at the front) will prevent Kudu from starting.
   opts.data_roots = { GetTestPath("data2"), GetTestPath("data1") };
-  opts.update_on_disk = true;
+  opts.consistency_check = ConsistencyCheckBehavior::UPDATE_ON_DISK;
   ReinitFsManagerWithOpts(opts);
   Status s = fs_manager()->Open();
   ASSERT_STR_CONTAINS(s.ToString(), "could not verify required directory");
@@ -533,7 +536,7 @@ TEST_F(FsManagerTestBase, TestOpenFailsWhenMissingImportantDir) {
 }
 
 
-TEST_F(FsManagerTestBase, TestAddDataDirs) {
+TEST_F(FsManagerTestBase, TestAddRemoveDataDirs) {
   if (FLAGS_block_manager == "file") {
     LOG(INFO) << "Skipping test, file block manager not supported";
     return;
@@ -550,24 +553,31 @@ TEST_F(FsManagerTestBase, TestAddDataDirs) {
   ASSERT_STR_CONTAINS(s.ToString(), fs_manager()->GetInstanceMetadataPath(new_path1));
 
   // This time allow new data directories to be created.
-  opts.update_on_disk = true;
+  opts.consistency_check = ConsistencyCheckBehavior::UPDATE_ON_DISK;
   ReinitFsManagerWithOpts(opts);
   ASSERT_OK(fs_manager()->Open());
   ASSERT_EQ(2, fs_manager()->dd_manager()->GetDataDirs().size());
 
   // Try to open with a data dir removed; this should fail.
   opts.data_roots = { fs_root_ };
+  opts.consistency_check = ConsistencyCheckBehavior::ENFORCE_CONSISTENCY;
   ReinitFsManagerWithOpts(opts);
   s = fs_manager()->Open();
   ASSERT_TRUE(s.IsIOError());
   ASSERT_STR_CONTAINS(s.ToString(), "could not verify integrity of files");
 
+  // This time allow data directories to be removed.
+  opts.consistency_check = ConsistencyCheckBehavior::UPDATE_ON_DISK;
+  ReinitFsManagerWithOpts(opts);
+  ASSERT_OK(fs_manager()->Open());
+  ASSERT_EQ(1, fs_manager()->dd_manager()->GetDataDirs().size());
+
   // We should be able to add new directories anywhere in the list.
   const string new_path2 = GetTestPath("new_path2");
-  opts.data_roots = { new_path2, fs_root_, new_path1 };
+  opts.data_roots = { new_path2, fs_root_ };
   ReinitFsManagerWithOpts(opts);
   ASSERT_OK(fs_manager()->Open());
-  ASSERT_EQ(3, fs_manager()->dd_manager()->GetDataDirs().size());
+  ASSERT_EQ(2, fs_manager()->dd_manager()->GetDataDirs().size());
 
   // Try to open with a new data dir although an existing data dir has failed;
   // this should fail.
@@ -578,7 +588,7 @@ TEST_F(FsManagerTestBase, TestAddDataDirs) {
     FLAGS_env_inject_eio_globs = JoinPathSegments(new_path2, "**");
 
     const string new_path3 = GetTestPath("new_path3");
-    opts.data_roots = { fs_root_, new_path2, new_path1, new_path3 };
+    opts.data_roots = { fs_root_, new_path2, new_path3 };
     ReinitFsManagerWithOpts(opts);
     Status s = fs_manager()->Open();
     ASSERT_TRUE(s.IsIOError());
@@ -586,7 +596,121 @@ TEST_F(FsManagerTestBase, TestAddDataDirs) {
   }
 }
 
-TEST_F(FsManagerTestBase, TestAddDataDirsFuzz) {
+TEST_F(FsManagerTestBase, TestReAddRemovedDataDir) {
+  if (FLAGS_block_manager == "file") {
+    LOG(INFO) << "Skipping test, file block manager not supported";
+    return;
+  }
+
+  // Add a new data directory, remove it, and add it back.
+  const string new_path1 = GetTestPath("new_path1");
+  for (const auto& data_roots : vector<vector<string>>({{ fs_root_, new_path1 },
+                                                        { fs_root_ },
+                                                        { fs_root_, new_path1 }})) {
+    FsManagerOpts opts;
+    opts.wal_root = fs_root_;
+    opts.data_roots = data_roots;
+    opts.consistency_check = ConsistencyCheckBehavior::UPDATE_ON_DISK;
+    ReinitFsManagerWithOpts(std::move(opts));
+    ASSERT_OK(fs_manager()->Open());
+    ASSERT_EQ(data_roots.size(), fs_manager()->dd_manager()->GetDataDirs().size());
+  }
+}
+
+TEST_F(FsManagerTestBase, TestCannotRemoveDataDirServingAsMetadataDir) {
+  if (FLAGS_block_manager == "file") {
+    LOG(INFO) << "Skipping test, file block manager not supported";
+    return;
+  }
+
+  // Create a new fs layout with a metadata root explicitly set to the first
+  // data root.
+  ASSERT_OK(env_->DeleteRecursively(fs_root_));
+  ASSERT_OK(env_->CreateDir(fs_root_));
+
+  FsManagerOpts opts;
+  opts.data_roots = { JoinPathSegments(fs_root_, "data1"),
+                      JoinPathSegments(fs_root_, "data2") };
+  opts.metadata_root = opts.data_roots[0];
+  opts.wal_root = JoinPathSegments(fs_root_, "wal");
+  ReinitFsManagerWithOpts(opts);
+  ASSERT_OK(fs_manager()->CreateInitialFileSystemLayout());
+  ASSERT_OK(fs_manager()->Open());
+
+  // Stop specifying the metadata root. The FsManager will automatically look
+  // for and find it in the first data root.
+  opts.metadata_root.clear();
+  ReinitFsManagerWithOpts(opts);
+  ASSERT_OK(fs_manager()->Open());
+
+  // Now try to remove the first data root. This should fail because in the
+  // absence of a defined metadata root, the FsManager will try looking for it
+  // in the wal root (not found), and the first data dir (not found).
+  opts.data_roots = { opts.data_roots[1] };
+  opts.consistency_check = ConsistencyCheckBehavior::UPDATE_ON_DISK;
+  ReinitFsManagerWithOpts(opts);
+  Status s = fs_manager()->Open();
+  ASSERT_TRUE(s.IsNotFound());
+  ASSERT_STR_CONTAINS(s.ToString(), "could not verify required directory");
+}
+
+TEST_F(FsManagerTestBase, TestAddRemoveSpeculative) {
+  if (FLAGS_block_manager == "file") {
+    LOG(INFO) << "Skipping test, file block manager not supported";
+    return;
+  }
+
+  // Add a second data directory.
+  const string new_path1 = GetTestPath("new_path1");
+  FsManagerOpts opts;
+  opts.wal_root = fs_root_;
+  opts.data_roots = { fs_root_, new_path1 };
+  opts.consistency_check = ConsistencyCheckBehavior::UPDATE_ON_DISK;
+  ReinitFsManagerWithOpts(opts);
+  ASSERT_OK(fs_manager()->Open());
+  ASSERT_EQ(2, fs_manager()->dd_manager()->GetDataDirs().size());
+
+  // Create a 'speculative' FsManager with the second data directory removed.
+  opts.data_roots = { fs_root_ };
+  opts.consistency_check = ConsistencyCheckBehavior::IGNORE_INCONSISTENCY;
+  ReinitFsManagerWithOpts(opts);
+  ASSERT_OK(fs_manager()->Open());
+  ASSERT_EQ(1, fs_manager()->dd_manager()->GetDataDirs().size());
+
+  // Do the same thing, but with a new data directory added.
+  const string new_path2 = GetTestPath("new_path2");
+  opts.data_roots = { fs_root_, new_path1, new_path2 };
+  ReinitFsManagerWithOpts(opts);
+  ASSERT_OK(fs_manager()->Open());
+  ASSERT_EQ(3, fs_manager()->dd_manager()->GetDataDirs().size());
+
+  // Neither of those attempts should have changed the on-disk state. Verify
+  // this by retrying all three combinations with consistency checking
+  // re-enabled. Only the two-directory case should pass.
+  opts.consistency_check = ConsistencyCheckBehavior::ENFORCE_CONSISTENCY;
+  for (const auto& data_roots : vector<vector<string>>({{ fs_root_ },
+                                                        { fs_root_, new_path1 },
+                                                        { fs_root_, new_path1, new_path2 }})) {
+    opts.data_roots = data_roots;
+    ReinitFsManagerWithOpts(opts);
+    Status s = fs_manager()->Open();
+    if (data_roots.size() == 1) {
+      // The first data directory's path set refers to a data directory that
+      // wasn't in data_roots.
+      ASSERT_TRUE(s.IsIOError()) << s.ToString();
+      ASSERT_STR_CONTAINS(s.ToString(), "could not verify integrity of files");
+    } else if (data_roots.size() == 2) {
+      ASSERT_OK(s);
+      ASSERT_EQ(2, fs_manager()->dd_manager()->GetDataDirs().size());
+    } else {
+      // The third data directory has no instance file.
+      ASSERT_TRUE(s.IsNotFound()) << s.ToString();
+      ASSERT_STR_CONTAINS(s.ToString(), "No such file or directory");
+    }
+  }
+}
+
+TEST_F(FsManagerTestBase, TestAddRemoveDataDirsFuzz) {
   const int kNumAttempts = AllowSlowTests() ? 1000 : 100;
 
   if (FLAGS_block_manager == "file") {
@@ -594,16 +718,36 @@ TEST_F(FsManagerTestBase, TestAddDataDirsFuzz) {
     return;
   }
 
+  Random rng_(SeedRandom());
+
   FsManagerOpts fs_opts;
   fs_opts.wal_root = fs_root_;
   fs_opts.data_roots = { fs_root_ };
   for (int i = 0; i < kNumAttempts; i++) {
-    // Create a new fs root.
-    string new_data_root = GetTestPath(Substitute("new_data_$0", i));
-    fs_opts.data_roots.emplace_back(new_data_root);
+    // Randomly create a directory to add, or choose an existing one to remove.
+    //
+    // Note: we skip removing the last data directory because the FsManager
+    // treats that as a signal to use the wal root as the sole data root.
+    vector<string> old_data_roots = fs_opts.data_roots;
+    bool action_was_add;
+    string fs_root;
+    if (rng_.Uniform(2) == 0 || fs_opts.data_roots.size() == 1) {
+      action_was_add = true;
+      fs_root = GetTestPath(Substitute("new_data_$0", i));
+      fs_opts.data_roots.emplace_back(fs_root);
+    } else {
+      action_was_add = false;
+      DCHECK_GT(fs_opts.data_roots.size(), 1);
+      int removed_idx = rng_.Uniform(fs_opts.data_roots.size());
+      fs_root = fs_opts.data_roots[removed_idx];
+      auto iter = fs_opts.data_roots.begin();
+      std::advance(iter, removed_idx);
+      fs_opts.data_roots.erase(iter);
+    }
 
-    // Try to add it with failure injection enabled.
-    bool add_succeeded;
+    // Try to add or remove it with failure injection enabled.
+    LOG(INFO) << Substitute("$0ing $1", action_was_add ? "Add" : "Remov", fs_root);
+    bool update_succeeded;
     {
       google::FlagSaver saver;
       FLAGS_crash_on_eio = false;
@@ -611,35 +755,36 @@ TEST_F(FsManagerTestBase, TestAddDataDirsFuzz) {
       // This value isn't arbitrary: most attempts fail and only some succeed.
       FLAGS_env_inject_eio = 0.01;
 
-      fs_opts.update_on_disk = true;
+      fs_opts.consistency_check = ConsistencyCheckBehavior::UPDATE_ON_DISK;
       ReinitFsManagerWithOpts(fs_opts);
-      add_succeeded = fs_manager()->Open().ok();
+      update_succeeded = fs_manager()->Open().ok();
     }
 
     // Reopen regardless, to ensure that failures didn't corrupt anything.
-    fs_opts.update_on_disk = false;
+    fs_opts.consistency_check = ConsistencyCheckBehavior::ENFORCE_CONSISTENCY;
     ReinitFsManagerWithOpts(fs_opts);
     Status open_status = fs_manager()->Open();
-    if (add_succeeded) {
+    if (update_succeeded) {
       ASSERT_OK(open_status);
     }
 
-    // The rollback logic built into the add fs root operation isn't robust
-    // enough to handle every possible sequence of injected failures. Let's see
-    // if we need to apply a "workaround" in order to fix the filesystem.
+    // The rollback logic built into the update operation isn't robust enough
+    // to handle every possible sequence of injected failures. Let's see if we
+    // need to apply a "workaround" in order to fix the filesystem.
+
     if (!open_status.ok()) {
-      // Failed? Perhaps the new fs root and data directory were created, but
-      // there was an injected failure later on, which led to a rollback and
-      // the removal of the new fs instance file.
+      // Perhaps a new fs root and data directory were created, but there was
+      // an injected failure later on, which led to a rollback and the removal
+      // of the new fs instance file.
       //
       // Fix this as a user might (by copying the original fs instance file
       // into the new fs root) then retry.
       string source_instance = fs_manager()->GetInstanceMetadataPath(fs_root_);
       bool is_dir;
-      Status s = env_->IsDirectory(new_data_root, &is_dir);
+      Status s = env_->IsDirectory(fs_root, &is_dir);
       if (s.ok()) {
         ASSERT_TRUE(is_dir);
-        string new_instance = fs_manager()->GetInstanceMetadataPath(new_data_root);
+        string new_instance = fs_manager()->GetInstanceMetadataPath(fs_root);
         if (!env_->FileExists(new_instance)) {
           WritableFileOptions wr_opts;
           wr_opts.mode = Env::CREATE_NON_EXISTING;
@@ -651,11 +796,11 @@ TEST_F(FsManagerTestBase, TestAddDataDirsFuzz) {
     }
     if (!open_status.ok()) {
       // Still failing? Unfortunately, there's not enough information to know
-      // whether the injected failure occurred during creation of the new data
-      // directory or just afterwards, when the DataDirManager reopened itself.
-      // If the former, the add itself failed and the failure should resolve if
-      // we drop the new fs root and retry.
-      fs_opts.data_roots.pop_back();
+      // whether the injected failure occurred during the update or just
+      // afterwards, when the DataDirManager reloaded the data directory
+      // instance files. If the former, the failure should resolve itself if we
+      // restore the old data roots.
+      fs_opts.data_roots = old_data_roots;
       ReinitFsManagerWithOpts(fs_opts);
       open_status = fs_manager()->Open();
     }

http://git-wip-us.apache.org/repos/asf/kudu/blob/41517080/src/kudu/fs/fs_manager.cc
----------------------------------------------------------------------
diff --git a/src/kudu/fs/fs_manager.cc b/src/kudu/fs/fs_manager.cc
index df61bf6..380169d 100644
--- a/src/kudu/fs/fs_manager.cc
+++ b/src/kudu/fs/fs_manager.cc
@@ -98,6 +98,7 @@ DEFINE_string(fs_metadata_dir, "",
 TAG_FLAG(fs_metadata_dir, stable);
 
 using kudu::fs::BlockManagerOptions;
+using kudu::fs::ConsistencyCheckBehavior;
 using kudu::fs::CreateBlockOptions;
 using kudu::fs::DataDirManager;
 using kudu::fs::DataDirManagerOptions;
@@ -137,7 +138,7 @@ FsManagerOpts::FsManagerOpts()
     metadata_root(FLAGS_fs_metadata_dir),
     block_manager_type(FLAGS_block_manager),
     read_only(false),
-    update_on_disk(false) {
+    consistency_check(ConsistencyCheckBehavior::ENFORCE_CONSISTENCY) {
   data_roots = strings::Split(FLAGS_fs_data_dirs, ",", strings::SkipEmpty());
 }
 
@@ -146,21 +147,21 @@ FsManagerOpts::FsManagerOpts(const string& root)
     data_roots({ root }),
     block_manager_type(FLAGS_block_manager),
     read_only(false),
-    update_on_disk(false) {}
+    consistency_check(ConsistencyCheckBehavior::ENFORCE_CONSISTENCY) {}
 
 FsManager::FsManager(Env* env, const string& root_path)
   : env_(DCHECK_NOTNULL(env)),
     opts_(FsManagerOpts(root_path)),
     error_manager_(new FsErrorManager()),
-    initted_(false) {
-}
+    initted_(false) {}
 
 FsManager::FsManager(Env* env, FsManagerOpts opts)
   : env_(DCHECK_NOTNULL(env)),
     opts_(std::move(opts)),
     error_manager_(new FsErrorManager()),
     initted_(false) {
-  DCHECK(!opts_.update_on_disk || !opts_.read_only);
+  DCHECK(opts_.consistency_check != ConsistencyCheckBehavior::UPDATE_ON_DISK ||
+         !opts_.read_only);
 }
 
 FsManager::~FsManager() {}
@@ -325,7 +326,8 @@ Status FsManager::Open(FsReport* report) {
     Status s = pb_util::ReadPBContainerFromPath(env_, GetInstanceMetadataPath(root.path),
                                                 pb.get());
     if (PREDICT_FALSE(!s.ok())) {
-      if (s.IsNotFound() && opts_.update_on_disk) {
+      if (s.IsNotFound() &&
+          opts_.consistency_check != ConsistencyCheckBehavior::ENFORCE_CONSISTENCY) {
         missing_roots.emplace_back(root);
         continue;
       }
@@ -379,8 +381,8 @@ Status FsManager::Open(FsReport* report) {
     }
   });
 
-  // Create any missing roots.
-  if (!missing_roots.empty()) {
+  // Create any missing roots, if desired.
+  if (opts_.consistency_check == ConsistencyCheckBehavior::UPDATE_ON_DISK) {
     RETURN_NOT_OK_PREPEND(CreateFileSystemRoots(
         missing_roots, *metadata_, &created_dirs, &created_files),
                           "unable to create missing filesystem roots");
@@ -401,7 +403,7 @@ Status FsManager::Open(FsReport* report) {
     dm_opts.metric_entity = opts_.metric_entity;
     dm_opts.block_manager_type = opts_.block_manager_type;
     dm_opts.read_only = opts_.read_only;
-    dm_opts.update_on_disk = opts_.update_on_disk;
+    dm_opts.consistency_check = opts_.consistency_check;
     LOG_TIMING(INFO, "opening directory manager") {
       RETURN_NOT_OK(DataDirManager::OpenExisting(env_,
           canonicalized_data_fs_roots_, std::move(dm_opts), &dd_manager_));

http://git-wip-us.apache.org/repos/asf/kudu/blob/41517080/src/kudu/fs/fs_manager.h
----------------------------------------------------------------------
diff --git a/src/kudu/fs/fs_manager.h b/src/kudu/fs/fs_manager.h
index dc158bd..955d159 100644
--- a/src/kudu/fs/fs_manager.h
+++ b/src/kudu/fs/fs_manager.h
@@ -73,14 +73,16 @@ struct FsManagerOpts {
   // Should only be used in unit tests.
   explicit FsManagerOpts(const std::string& root);
 
-  // The entity under which all metrics should be grouped. If NULL, metrics
+  // The entity under which all metrics should be grouped. If null, metrics
   // will not be produced.
   //
-  // Defaults to NULL.
+  // Defaults to null.
   scoped_refptr<MetricEntity> metric_entity;
 
   // The memory tracker under which all new memory trackers will be parented.
-  // If NULL, new memory trackers will be parented to the root tracker.
+  // If null, new memory trackers will be parented to the root tracker.
+  //
+  // Defaults to null.
   std::shared_ptr<MemTracker> parent_mem_tracker;
 
   // The directory root where WALs will be stored. Cannot be empty.
@@ -97,20 +99,19 @@ struct FsManagerOpts {
   std::string metadata_root;
 
   // The block manager type. Must be either "file" or "log".
+  //
   // Defaults to the value of FLAGS_block_manager.
   std::string block_manager_type;
 
-  // Whether or not read-write operations should be allowed. Defaults to false.
+  // Whether or not read-write operations should be allowed.
+  //
+  // Defaults to false.
   bool read_only;
 
-  // Whether or not the contents of 'data_roots' should be considered the new
-  // canonical set. If true, on-disk data structures will be updated
-  // accordingly when the FsManager is opened.
-  //
-  // If true, 'read_only' must be false.
+  // The behavior to use when comparing 'data_roots' to the on-disk path sets.
   //
-  // TODO(KUDU-2202): only supports adding new data directories.
-  bool update_on_disk;
+  // Defaults to ENFORCE_CONSISTENCY.
+  fs::ConsistencyCheckBehavior consistency_check;
 };
 
 // FsManager provides helpers to read data and metadata files,
@@ -138,7 +139,7 @@ class FsManager {
   // inconsistencies. If found, and if the FsManager was not constructed in
   // read-only mode, an attempt will be made to repair them.
   //
-  // If 'report' is not nullptr, it will be populated with the results of the
+  // If 'report' is not null, it will be populated with the results of the
   // check (and repair, if applicable); otherwise, the results of the check
   // will be logged and the presence of fatal inconsistencies will manifest as
   // a returned error.

http://git-wip-us.apache.org/repos/asf/kudu/blob/41517080/src/kudu/tools/kudu-tool-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tools/kudu-tool-test.cc b/src/kudu/tools/kudu-tool-test.cc
index 493c60d..bec122a 100644
--- a/src/kudu/tools/kudu-tool-test.cc
+++ b/src/kudu/tools/kudu-tool-test.cc
@@ -181,10 +181,10 @@ class ToolTest : public KuduTest {
   }
 
   Status RunTool(const string& arg_str,
-                 string* stdout,
-                 string* stderr,
-                 vector<string>* stdout_lines,
-                 vector<string>* stderr_lines) const {
+                 string* stdout = nullptr,
+                 string* stderr = nullptr,
+                 vector<string>* stdout_lines = nullptr,
+                 vector<string>* stderr_lines = nullptr) const {
     string out;
     string err;
     Status s = RunKuduTool(strings::Split(arg_str, " ", strings::SkipEmpty()),
@@ -230,6 +230,10 @@ class ToolTest : public KuduTest {
     ASSERT_OK(s);
   }
 
+  Status RunActionStderrString(const string& arg_str, string* stderr) const {
+    return RunTool(arg_str, nullptr, stderr, nullptr, nullptr);
+  }
+
   void RunActionStdoutLines(const string& arg_str, vector<string>* stdout_lines) const {
     string stderr;
     Status s = RunTool(arg_str, nullptr, &stderr, stdout_lines, nullptr);
@@ -2129,7 +2133,7 @@ TEST_P(ControlShellToolTest, TestControlShell) {
 }
 
 static void CreateTableWithFlushedData(const string& table_name,
-                                        InternalMiniCluster* cluster) {
+                                       InternalMiniCluster* cluster) {
   // Use a schema with a high number of columns to encourage the creation of
   // many data blocks.
   KuduSchemaBuilder schema_builder;
@@ -2169,7 +2173,10 @@ static void CreateTableWithFlushedData(const string& table_name,
   }
 }
 
-TEST_F(ToolTest, TestFsAddDataDirEndToEnd) {
+TEST_F(ToolTest, TestFsAddRemoveDataDirEndToEnd) {
+  const string kTableFoo = "foo";
+  const string kTableBar = "bar";
+
   if (FLAGS_block_manager == "file") {
     LOG(INFO) << "Skipping test, only log block manager is supported";
     return;
@@ -2181,7 +2188,7 @@ TEST_F(ToolTest, TestFsAddDataDirEndToEnd) {
   NO_FATALS(StartMiniCluster(std::move(opts)));
 
   // Create a table and flush some test data to it.
-  NO_FATALS(CreateTableWithFlushedData("foo", mini_cluster_.get()));
+  NO_FATALS(CreateTableWithFlushedData(kTableFoo, mini_cluster_.get()));
 
   // Add a new data directory.
   MiniTabletServer* mts = mini_cluster_->mini_tablet_server(0);
@@ -2195,6 +2202,8 @@ TEST_F(ToolTest, TestFsAddDataDirEndToEnd) {
       wal_root, JoinStrings(data_roots, ","))));
 
   // Reconfigure the tserver to use the newly added data directory and restart it.
+  //
+  // Note: WaitStarted() will return a bad status if any tablets fail to bootstrap.
   mts->options()->fs_opts.data_roots = data_roots;
   ASSERT_OK(mts->Start());
   ASSERT_OK(mts->WaitStarted());
@@ -2203,10 +2212,115 @@ TEST_F(ToolTest, TestFsAddDataDirEndToEnd) {
   // data to the newly added data directory.
   uint64_t disk_space_used_before;
   ASSERT_OK(env_->GetFileSizeOnDiskRecursively(to_add, &disk_space_used_before));
-  NO_FATALS(CreateTableWithFlushedData("bar", mini_cluster_.get()));
+  NO_FATALS(CreateTableWithFlushedData(kTableBar, mini_cluster_.get()));
   uint64_t disk_space_used_after;
   ASSERT_OK(env_->GetFileSizeOnDiskRecursively(to_add, &disk_space_used_after));
   ASSERT_GT(disk_space_used_after, disk_space_used_before);
+
+  // Try to remove the newly added data directory. This will fail because
+  // tablets from the second table are configured to use it.
+  mts->Shutdown();
+  data_roots.pop_back();
+  string stderr;
+  ASSERT_TRUE(RunActionStderrString(Substitute(
+      "fs update_dirs --fs_wal_dir=$0 --fs_data_dirs=$1",
+      wal_root, JoinStrings(data_roots, ",")), &stderr).IsRuntimeError());
+  ASSERT_STR_CONTAINS(
+      stderr, "Not found: cannot update data directories: at least one "
+      "tablet is configured to use removed data directory. Retry with --force "
+      "to override this");
+
+  // Make sure the failure really had no effect.
+  ASSERT_OK(mts->Start());
+  ASSERT_OK(mts->WaitStarted());
+  mts->Shutdown();
+
+  // If we force the removal it'll succeed, but the tserver will fail to
+  // bootstrap some tablets when restarted.
+  NO_FATALS(RunActionStdoutNone(Substitute(
+      "fs update_dirs --fs_wal_dir=$0 --fs_data_dirs=$1 --force",
+      wal_root, JoinStrings(data_roots, ","))));
+  mts->options()->fs_opts.data_roots = data_roots;
+  ASSERT_OK(mts->Start());
+  Status s = mts->WaitStarted();
+  ASSERT_TRUE(s.IsNotFound());
+  ASSERT_STR_CONTAINS(s.ToString(), "one or more data dirs may have been removed");
+
+  // Tablets belonging to the first table should still be OK, but those in the
+  // second table should have all failed.
+  {
+    vector<scoped_refptr<TabletReplica>> replicas;
+    mts->server()->tablet_manager()->GetTabletReplicas(&replicas);
+    ASSERT_GT(replicas.size(), 0);
+    for (const auto& r : replicas) {
+      const string& table_name = r->tablet_metadata()->table_name();
+      if (table_name == kTableFoo) {
+        ASSERT_EQ(tablet::RUNNING, r->state());
+      } else {
+        ASSERT_EQ(kTableBar, table_name);
+        ASSERT_EQ(tablet::FAILED, r->state());
+      }
+    }
+  }
+
+  // Add the removed data directory back. All tablets should successfully
+  // bootstrap.
+  mts->Shutdown();
+  data_roots.emplace_back(to_add);
+  NO_FATALS(RunActionStdoutNone(Substitute(
+      "fs update_dirs --fs_wal_dir=$0 --fs_data_dirs=$1",
+      wal_root, JoinStrings(data_roots, ","))));
+  mts->options()->fs_opts.data_roots = data_roots;
+  ASSERT_OK(mts->Start());
+  ASSERT_OK(mts->WaitStarted());
+  mts->Shutdown();
+
+  // Remove it again so that the second table's tablets fail once again.
+  data_roots.pop_back();
+  NO_FATALS(RunActionStdoutNone(Substitute(
+      "fs update_dirs --fs_wal_dir=$0 --fs_data_dirs=$1 --force",
+      wal_root, JoinStrings(data_roots, ","))));
+  mts->options()->fs_opts.data_roots = data_roots;
+  ASSERT_OK(mts->Start());
+  s = mts->WaitStarted();
+  ASSERT_TRUE(s.IsNotFound());
+  ASSERT_STR_CONTAINS(s.ToString(), "one or more data dirs may have been removed");
+
+  // Delete the second table and wait for all of its tablets to be deleted.
+  shared_ptr<client::KuduClient> client;
+  ASSERT_OK(mini_cluster_->CreateClient(nullptr, &client));
+  ASSERT_OK(client->DeleteTable(kTableBar));
+  ASSERT_EVENTUALLY([&]{
+    vector<scoped_refptr<TabletReplica>> replicas;
+    mts->server()->tablet_manager()->GetTabletReplicas(&replicas);
+    for (const auto& r : replicas) {
+      ASSERT_NE(kTableBar, r->tablet_metadata()->table_name());
+    }
+  });
+
+  // Shut down the tserver, add a new data directory, and restart it. There
+  // should be no bootstrapping errors because the second table (whose tablets
+  // were configured to use the removed data directory) is gone.
+  mts->Shutdown();
+  data_roots.emplace_back(JoinPathSegments(DirName(data_roots.back()), "data-new2"));
+  NO_FATALS(RunActionStdoutNone(Substitute(
+      "fs update_dirs --fs_wal_dir=$0 --fs_data_dirs=$1",
+      wal_root, JoinStrings(data_roots, ","))));
+  mts->options()->fs_opts.data_roots = data_roots;
+  ASSERT_OK(mts->Start());
+  ASSERT_OK(mts->WaitStarted());
+
+  // Shut down again, delete the newly added data directory, and restart. No
+  // errors, because the first table's tablets were never configured to use the
+  // new data directory.
+  mts->Shutdown();
+  data_roots.pop_back();
+  NO_FATALS(RunActionStdoutNone(Substitute(
+      "fs update_dirs --fs_wal_dir=$0 --fs_data_dirs=$1",
+      wal_root, JoinStrings(data_roots, ","))));
+  mts->options()->fs_opts.data_roots = data_roots;
+  ASSERT_OK(mts->Start());
+  ASSERT_OK(mts->WaitStarted());
 }
 
 TEST_F(ToolTest, TestDumpFSWithNonDefaultMetadataDir) {

http://git-wip-us.apache.org/repos/asf/kudu/blob/41517080/src/kudu/tools/tool_action_fs.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tools/tool_action_fs.cc b/src/kudu/tools/tool_action_fs.cc
index 16e1de5..57adf22 100644
--- a/src/kudu/tools/tool_action_fs.cc
+++ b/src/kudu/tools/tool_action_fs.cc
@@ -45,6 +45,8 @@
 #include "kudu/common/types.h"
 #include "kudu/fs/block_id.h"
 #include "kudu/fs/block_manager.h"
+#include "kudu/fs/data_dirs.h"
+#include "kudu/fs/fs.pb.h"
 #include "kudu/fs/fs_manager.h"
 #include "kudu/fs/fs_report.h"
 #include "kudu/gutil/gscoped_ptr.h"
@@ -73,6 +75,7 @@
 #include "kudu/util/slice.h"
 #include "kudu/util/status.h"
 
+DECLARE_bool(force);
 DECLARE_bool(print_meta);
 DECLARE_string(columns);
 
@@ -104,6 +107,7 @@ using cfile::CFileIterator;
 using cfile::CFileReader;
 using cfile::ReaderOptions;
 using fs::BlockDeletionTransaction;
+using fs::ConsistencyCheckBehavior;
 using fs::FsReport;
 using fs::ReadableBlock;
 using std::cout;
@@ -314,10 +318,49 @@ Status DumpFsTree(const RunnerContext& /*context*/) {
   return Status::OK();
 }
 
+Status CheckForTabletsThatWillFailWithUpdate() {
+  FsManagerOpts opts;
+  opts.read_only = true;
+  opts.consistency_check = ConsistencyCheckBehavior::IGNORE_INCONSISTENCY;
+  FsManager fs(Env::Default(), std::move(opts));
+  RETURN_NOT_OK(fs.Open());
+
+  vector<string> tablet_ids;
+  RETURN_NOT_OK(fs.ListTabletIds(&tablet_ids));
+  for (const auto& t : tablet_ids) {
+    scoped_refptr<TabletMetadata> meta;
+    RETURN_NOT_OK(TabletMetadata::Load(&fs, t, &meta));
+    DataDirGroupPB group;
+    RETURN_NOT_OK_PREPEND(
+        fs.dd_manager()->GetDataDirGroupPB(t, &group),
+        "at least one tablet is configured to use removed data directory. "
+        "Retry with --force to override this");
+  }
+  return Status::OK();
+}
+
 Status Update(const RunnerContext& /*context*/) {
   Env* env = Env::Default();
+
+  // First, ensure that if we're removing a data directory, no existing tablets
+  // are configured to use it. We approximate this by creating an FsManager
+  // that reflects the new data directory configuration, loading all tablet
+  // metadata, and retrieving all tablets' data dir groups. If a data dir group
+  // cannot be retrieved, it is assumed that it's because the tablet is
+  // configured to use a data directory that's now missing.
+  //
+  // If the user specifies --force, we assume they know what they're doing and
+  // skip this check.
+  Status s = CheckForTabletsThatWillFailWithUpdate();
+  if (FLAGS_force) {
+    WARN_NOT_OK(s, "continuing due to --force");
+  } else {
+    RETURN_NOT_OK_PREPEND(s, "cannot update data directories");
+  }
+
+  // Now perform the update.
   FsManagerOpts opts;
-  opts.update_on_disk = true;
+  opts.consistency_check = ConsistencyCheckBehavior::UPDATE_ON_DISK;
   FsManager fs(env, std::move(opts));
   return fs.Open();
 }
@@ -824,7 +867,12 @@ unique_ptr<Mode> BuildFsMode() {
   unique_ptr<Action> update =
       ActionBuilder("update_dirs", &Update)
       .Description("Updates the set of data directories in an existing Kudu filesystem")
-      .ExtraDescription("Cannot currently be used to remove data directories")
+      .ExtraDescription("If a data directory is in use by a tablet and is "
+          "removed, the operation will fail unless --force is also used")
+      .AddOptionalParameter("force", boost::none, string("If true, permits "
+          "the removal of a data directory that is configured for use by "
+          "existing tablets. Those tablets will fail the next time the server "
+          "is started"))
       .AddOptionalParameter("fs_data_dirs")
       .AddOptionalParameter("fs_metadata_dir")
       .AddOptionalParameter("fs_wal_dir")
@@ -861,10 +909,10 @@ unique_ptr<Mode> BuildFsMode() {
   return ModeBuilder("fs")
       .Description("Operate on a local Kudu filesystem")
       .AddMode(BuildFsDumpMode())
-      .AddAction(std::move(update))
       .AddAction(std::move(check))
       .AddAction(std::move(format))
       .AddAction(std::move(list))
+      .AddAction(std::move(update))
       .Build();
 }