Posted to commits@mesos.apache.org by ji...@apache.org on 2017/04/12 19:31:53 UTC

[2/3] mesos git commit: Lazily unmount persistent volumes in MesosContainerizer.

Lazily unmount persistent volumes in MesosContainerizer.

Use MNT_DETACH when unmounting persistent volumes in the Linux
filesystem isolator to work around incorrect handling of container
destroy failures. Currently, if isolator cleanup returns a failure,
the slave still treats the container as terminated and schedules the
cleanup of the container's sandbox. If the mount has not been removed
from the sandbox (e.g., due to EBUSY), the data in the persistent
volume is incorrectly deleted. With MNT_DETACH, the mount point in
the sandbox is removed immediately. See MESOS-7366 for more details.

Review: https://reviews.apache.org/r/58278


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/d72f3e13
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/d72f3e13
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/d72f3e13

Branch: refs/heads/1.0.x
Commit: d72f3e138ee0866d1d457f68315414a7af5a047e
Parents: f5721b5
Author: Jie Yu <yu...@gmail.com>
Authored: Fri Apr 7 16:33:53 2017 -0700
Committer: Jie Yu <yu...@gmail.com>
Committed: Wed Apr 12 12:31:34 2017 -0700

----------------------------------------------------------------------
 .../mesos/isolators/filesystem/linux.cpp        | 22 ++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/d72f3e13/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
----------------------------------------------------------------------
diff --git a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
index 31aa3e7..e63304e 100644
--- a/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
+++ b/src/slave/containerizer/mesos/isolators/filesystem/linux.cpp
@@ -706,6 +706,8 @@ Future<Nothing> LinuxFilesystemIsolatorProcess::cleanup(
     return Failure("Failed to get mount table: " + table.error());
   }
 
+  vector<string> unmountErrors;
+
   // Reverse unmount order to handle nested mount points.
   foreach (const fs::MountInfoTable::Entry& entry,
            adaptor::reverse(table->entries)) {
@@ -716,15 +718,31 @@ Future<Nothing> LinuxFilesystemIsolatorProcess::cleanup(
       LOG(INFO) << "Unmounting volume '" << entry.target
                 << "' for container " << containerId;
 
-      Try<Nothing> unmount = fs::unmount(entry.target);
+      // TODO(jieyu): Use MNT_DETACH here to workaround an issue of
+      // incorrect handling of container destroy failures. Currently,
+      // if isolator cleanup returns a failure, the slave will treat
+      // the container as terminated, and will schedule the cleanup of
+      // the container's sandbox. Since the mount hasn't been removed
+      // in the sandbox, that'll result in data in the persistent
+      // volume being incorrectly deleted. Use MNT_DETACH here so that
+      // the mount point in the sandbox will be removed immediately.
+      // See MESOS-7366 for more details.
+      Try<Nothing> unmount = fs::unmount(entry.target, MNT_DETACH);
       if (unmount.isError()) {
-        return Failure(
+        // NOTE: Instead of short circuit, we try to perform as many
+        // unmount as possible. We'll accumulate the errors together
+        // in the end.
+        unmountErrors.push_back(
             "Failed to unmount volume '" + entry.target +
             "': " + unmount.error());
       }
     }
   }
 
+  if (!unmountErrors.empty()) {
+    return Failure(strings::join(", ", unmountErrors));
+  }
+
   return Nothing();
 }
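
The two ideas in this change (a lazy unmount, and collecting errors
instead of returning on the first one) can be sketched outside Mesos
roughly as follows. This is a minimal illustration, not the Mesos
implementation: lazyUnmount, unmountAll and joinErrors are hypothetical
names, the empty-string-means-success convention is an assumption made
to keep the sketch short, and plain umount2(2) stands in for Mesos'
fs::unmount wrapper. The unmount function is a parameter so the
accumulation logic can be exercised without root privileges.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <vector>

#include <sys/mount.h> // umount2() and MNT_DETACH (Linux only).

// Hypothetical helper: lazily unmounts `target`. With MNT_DETACH the
// mount is detached from the mount table right away, and the kernel
// finishes the unmount once the mount is no longer busy.
// Returns an empty string on success, or the errno message on failure.
std::string lazyUnmount(const std::string& target)
{
  if (::umount2(target.c_str(), MNT_DETACH) != 0) {
    return std::strerror(errno);
  }
  return "";
}

// Hypothetical helper: attempts to unmount every target, visiting them
// in reverse order so that nested mount points go first. A failure does
// not stop the loop; all error messages are collected and returned.
std::vector<std::string> unmountAll(
    const std::vector<std::string>& targets,
    const std::function<std::string(const std::string&)>& unmount =
      lazyUnmount)
{
  assert(unmount); // The callable must not be empty.

  std::vector<std::string> errors;
  for (auto it = targets.rbegin(); it != targets.rend(); ++it) {
    const std::string error = unmount(*it);
    if (!error.empty()) {
      errors.push_back(
          "Failed to unmount volume '" + *it + "': " + error);
    }
  }
  return errors;
}

// Hypothetical stand-in for strings::join: concatenates `parts` with
// `separator` between consecutive elements.
std::string joinErrors(
    const std::string& separator,
    const std::vector<std::string>& parts)
{
  std::string result;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    if (i > 0) {
      result += separator;
    }
    result += parts[i];
  }
  return result;
}
```

A caller would treat a non-empty result of unmountAll as the overall
failure and report joinErrors(", ", errors), mirroring the single
Failure returned at the end of cleanup() in the patch.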