You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mesos.apache.org by gi...@apache.org on 2018/09/01 08:14:34 UTC

[mesos] branch 1.6.x updated (d2a344c -> 81ebab5)

This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a change to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


    from d2a344c  Added MESOS-9170 to 1.6.2 CHANGELOG.
     new 01c9859  Made overlay backend destroy more robust.
     new ede59ed  Made aufs backend destroy more robust.
     new a1a229b  Made bind backend destroy more robust.
     new 363bb84  Made copy backend destroy more robust.
     new 81ebab5  Added MESOS-9196 to 1.6.2 CHANGELOG.

The 5 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG                                            |  1 +
 .../mesos/provisioner/backends/aufs.cpp              | 20 +++++++++++++++-----
 .../mesos/provisioner/backends/bind.cpp              |  7 +++++--
 .../mesos/provisioner/backends/copy.cpp              | 12 +++++++++---
 .../mesos/provisioner/backends/overlay.cpp           | 20 +++++++++++++++-----
 5 files changed, 45 insertions(+), 15 deletions(-)


[mesos] 04/05: Made copy backend destroy more robust.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 363bb8467f5c25c77fe68dbd11ebec7f1cf1b350
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:48 2018 -0700

    Made copy backend destroy more robust.
    
    Do not return hard failure if `rm -rf` failed. It's also possible that
    it fails if some other processes are accessing the files there. The
    cleanup will be re-attempted during provisioner destroy and agent
    recovery.
    
    Review: https://reviews.apache.org/r/68597/
---
 src/slave/containerizer/mesos/provisioner/backends/copy.cpp | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/copy.cpp b/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
index c48e93f..f6b0ece 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
@@ -327,9 +327,15 @@ Future<bool> CopyBackendProcess::destroy(const string& rootfs)
     .then([](const Option<int>& status) -> Future<bool> {
       if (status.isNone()) {
         return Failure("Failed to reap subprocess to destroy rootfs");
-      } else if (status.get() != 0) {
-        return Failure("Failed to destroy rootfs, exit status: " +
-                       WSTRINGIFY(status.get()));
+      }
+
+      if (status.get() != 0) {
+        // It's possible that `rm -rf` will fail if some other
+        // programs are accessing the files.  No need to return a hard
+        // failure here because the directory will be removed later
+        // and re-attempted on agent recovery.
+        LOG(ERROR) << "Failed to destroy rootfs, exit status: "
+                   << WSTRINGIFY(status.get());
       }
 
       return true;


[mesos] 02/05: Made aufs backend destroy more robust.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit ede59ed2b7c33c176d2e5a66d91d8c02203d96a6
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:41 2018 -0700

    Made aufs backend destroy more robust.
    
    Use MNT_DETACH for `unmount` so that if there are still processes
    holding files or directories in the rootfs (e.g., regular filesystem
    indexing), the unmount will still be successful. The kernel will cleanup
    the mount when the number of references reach zero. Also, do not return
    hard failure if `rmdir` failed. It's also possible that `rmdir` returns
    EBUSY. See more details in MESOS-9196.
    
    Review: https://reviews.apache.org/r/68595/
---
 .../mesos/provisioner/backends/aufs.cpp              | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/aufs.cpp b/src/slave/containerizer/mesos/provisioner/backends/aufs.cpp
index c2cdd93..2eba552 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/aufs.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/aufs.cpp
@@ -251,8 +251,11 @@ Future<bool> AufsBackendProcess::destroy(
 
   foreach (const fs::MountInfoTable::Entry& entry, mountTable->entries) {
     if (entry.target == rootfs) {
-      // NOTE: This would fail if the rootfs is still in use.
-      Try<Nothing> unmount = fs::unmount(entry.target);
+      // NOTE: Use MNT_DETACH here so that if there are still
+      // processes holding files or directories in the rootfs, the
+      // unmount will still be successful. The kernel will cleanup the
+      // mount when the number of references reach zero.
+      Try<Nothing> unmount = fs::unmount(entry.target, MNT_DETACH);
       if (unmount.isError()) {
         return Failure(
             "Failed to destroy aufs-mounted rootfs '" + rootfs + "': " +
@@ -261,9 +264,16 @@ Future<bool> AufsBackendProcess::destroy(
 
       Try<Nothing> rmdir = os::rmdir(rootfs);
       if (rmdir.isError()) {
-        return Failure(
-            "Failed to remove rootfs mount point '" + rootfs + "': " +
-            rmdir.error());
+        // NOTE: Due to the use of MNT_DETACH above, it's possible
+        // that `rmdir` will fail with EBUSY if some other mounts in
+        // other mount namespaces are still on this mount point on
+        // some old kernel (https://lwn.net/Articles/570338/). No need
+        // to return a hard failure here because the directory will be
+        // removed later and re-attempted on agent recovery.
+        //
+        // TODO(jieyu): Consider only ignore EBUSY error.
+        LOG(ERROR) << "Failed to remove rootfs mount point "
+                   << "'" << rootfs << "': " << rmdir.error();
       }
 
       // Clean up tempDir used for image layer links.


[mesos] 05/05: Added MESOS-9196 to 1.6.2 CHANGELOG.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 81ebab5a539c1e38a36be882ecfa23c4bed7cf3a
Author: Gilbert Song <so...@gmail.com>
AuthorDate: Sat Sep 1 01:07:39 2018 -0700

    Added MESOS-9196 to 1.6.2 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 373db28..ae1ef2c 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -14,6 +14,7 @@ Release Notes - Mesos - Version 1.6.2 (WIP)
   * [MESOS-9146] - Agent has a fragile burn-in 5s authentication timeout.
   * [MESOS-9147] - Agent and scheduler driver authentication retry backoff time could overflow.
   * [MESOS-9170] - Zookeeper doesn't compile with newer gcc due to format error.
+  * [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
 
 
 Release Notes - Mesos - Version 1.6.1


[mesos] 03/05: Made bind backend destroy more robust.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit a1a229b381d5c50a9f182ef3727a72e5ad7b0823
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:45 2018 -0700

    Made bind backend destroy more robust.
    
    Use MNT_DETACH for `unmount` so that if there are still processes
    holding files or directories in the rootfs (e.g., regular filesystem
    indexing), the unmount will still be successful. The kernel will cleanup
    the mount when the number of references reach zero. See more details in
    MESOS-9196.
    
    Review: https://reviews.apache.org/r/68596/
---
 src/slave/containerizer/mesos/provisioner/backends/bind.cpp | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/bind.cpp b/src/slave/containerizer/mesos/provisioner/backends/bind.cpp
index 6ad97f9..7d564dc 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/bind.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/bind.cpp
@@ -195,8 +195,11 @@ Future<bool> BindBackendProcess::destroy(const string& rootfs)
     // to check `strings::startsWith(entry.target, rootfs)` here to
     // unmount all nested mounts.
     if (entry.target == rootfs) {
-      // NOTE: This would fail if the rootfs is still in use.
-      Try<Nothing> unmount = fs::unmount(entry.target);
+      // NOTE: Use MNT_DETACH here so that if there are still
+      // processes holding files or directories in the rootfs, the
+      // unmount will still be successful. The kernel will cleanup the
+      // mount when the number of references reach zero.
+      Try<Nothing> unmount = fs::unmount(entry.target, MNT_DETACH);
       if (unmount.isError()) {
         return Failure(
             "Failed to destroy bind-mounted rootfs '" + rootfs + "': " +


[mesos] 01/05: Made overlay backend destroy more robust.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 01c98593fec1ae763ad496b03e11bfbca9fb50d0
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:36 2018 -0700

    Made overlay backend destroy more robust.
    
    Use MNT_DETACH for `unmount` so that if there are still processes
    holding files or directories in the rootfs (e.g., regular filesystem
    indexing), the unmount will still be successful. The kernel will cleanup
    the mount when the number of references reach zero. Also, do not return
    hard failure if `rmdir` failed. It's also possible that `rmdir` returns
    EBUSY. See more details in MESOS-9196.
    
    Review: https://reviews.apache.org/r/68594/
---
 .../mesos/provisioner/backends/overlay.cpp           | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp b/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
index e32b991..b12b5b2 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
@@ -252,8 +252,11 @@ Future<bool> OverlayBackendProcess::destroy(
 
   foreach (const fs::MountInfoTable::Entry& entry, mountTable->entries) {
     if (entry.target == rootfs) {
-      // NOTE: This would fail if the rootfs is still in use.
-      Try<Nothing> unmount = fs::unmount(entry.target);
+      // NOTE: Use MNT_DETACH here so that if there are still
+      // processes holding files or directories in the rootfs, the
+      // unmount will still be successful. The kernel will cleanup the
+      // mount when the number of references reach zero.
+      Try<Nothing> unmount = fs::unmount(entry.target, MNT_DETACH);
       if (unmount.isError()) {
         return Failure(
             "Failed to destroy overlay-mounted rootfs '" + rootfs + "': " +
@@ -262,9 +265,16 @@ Future<bool> OverlayBackendProcess::destroy(
 
       Try<Nothing> rmdir = os::rmdir(rootfs);
       if (rmdir.isError()) {
-        return Failure(
-            "Failed to remove rootfs mount point '" + rootfs + "': " +
-            rmdir.error());
+        // NOTE: Due to the use of MNT_DETACH above, it's possible
+        // that `rmdir` will fail with EBUSY if some other mounts in
+        // other mount namespaces are still on this mount point on
+        // some old kernel (https://lwn.net/Articles/570338/). No need
+        // to return a hard failure here because the directory will be
+        // removed later and re-attempted on agent recovery.
+        //
+        // TODO(jieyu): Consider only ignore EBUSY error.
+        LOG(ERROR) << "Failed to remove rootfs mount point "
+                   << "'" << rootfs << "': " << rmdir.error();
       }
 
       // Clean up tempDir used for image layer links.