You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mesos.apache.org by gi...@apache.org on 2018/09/01 08:13:59 UTC

[mesos] branch master updated (66728b6 -> 6f010b6)

This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git.


    from 66728b6  Moved MESOS-9185 to the 1.7.0 CHANGELOG.
     new 33710c0  Made overlay backend destroy more robust.
     new 8715f13  Made aufs backend destroy more robust.
     new 576a455  Made bind backend destroy more robust.
     new 3e4e780  Made copy backend destroy more robust.
     new f89dd15  Added a missing failure message in overlay backend.
     new ee8b917  Added MESOS-9196 to 1.7.0 CHANGELOG.
     new 1f3c4d5  Added MESOS-9196 to 1.6.2 CHANGELOG.
     new 6f010b6  Added MESOS-9196 to 1.5.2 CHANGELOG.

The 8 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG                                          |  3 +++
 .../mesos/provisioner/backends/aufs.cpp            | 20 +++++++++++++-----
 .../mesos/provisioner/backends/bind.cpp            |  7 +++++--
 .../mesos/provisioner/backends/copy.cpp            | 12 ++++++++---
 .../mesos/provisioner/backends/overlay.cpp         | 24 ++++++++++++++++------
 5 files changed, 50 insertions(+), 16 deletions(-)


[mesos] 05/08: Added a missing failure message in overlay backend.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f89dd15a119043832064ef9a5237a1ba173e7943
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:52 2018 -0700

    Added a missing failure message in overlay backend.
    
    Review: https://reviews.apache.org/r/68598/
---
 src/slave/containerizer/mesos/provisioner/backends/overlay.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp b/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
index b12b5b2..f2040cf 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
@@ -302,7 +302,9 @@ Future<bool> OverlayBackendProcess::destroy(
       if (realpath.isSome()) {
         Try<Nothing> rmdir = os::rmdir(realpath.get());
         if (rmdir.isError()) {
-          return Failure("");
+          return Failure(
+              "Failed to remove temporary directory for symlinks at "
+              "'" + realpath.get() + "': " + rmdir.error());
         }
 
         VLOG(1) << "Removed temporary directory '" << realpath.get()


[mesos] 04/08: Made copy backend destroy more robust.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 3e4e78009d94c15409735ecf4bc2941ec7523d6b
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:48 2018 -0700

    Made copy backend destroy more robust.
    
    Do not return hard failure if `rm -rf` failed. It's also possible that
    it fails if some other processes are accessing the files there. The
    cleanup will be re-attempted during provisioner destroy and agent
    recovery.
    
    Review: https://reviews.apache.org/r/68597/
---
 src/slave/containerizer/mesos/provisioner/backends/copy.cpp | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/copy.cpp b/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
index d3eb6b4..a3237a3 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
@@ -326,9 +326,15 @@ Future<bool> CopyBackendProcess::destroy(const string& rootfs)
     .then([](const Option<int>& status) -> Future<bool> {
       if (status.isNone()) {
         return Failure("Failed to reap subprocess to destroy rootfs");
-      } else if (status.get() != 0) {
-        return Failure("Failed to destroy rootfs, exit status: " +
-                       WSTRINGIFY(status.get()));
+      }
+
+      if (status.get() != 0) {
+        // It's possible that `rm -rf` will fail if some other
+        // programs are accessing the files.  No need to return a hard
+        // failure here because the directory will be removed later
+        // and re-attempted on agent recovery.
+        LOG(ERROR) << "Failed to destroy rootfs, exit status: "
+                   << WSTRINGIFY(status.get());
       }
 
       return true;


[mesos] 02/08: Made aufs backend destroy more robust.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 8715f135376719613276130f3e3af3b244ff06ee
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:41 2018 -0700

    Made aufs backend destroy more robust.
    
    Use MNT_DETACH for `unmount` so that if there are still processes
    holding files or directories in the rootfs (e.g., regular filesystem
    indexing), the unmount will still be successful. The kernel will cleanup
    the mount when the number of references reach zero. Also, do not return
    hard failure if `rmdir` failed. It's also possible that `rmdir` returns
    EBUSY. See more details in MESOS-9196.
    
    Review: https://reviews.apache.org/r/68595/
---
 .../mesos/provisioner/backends/aufs.cpp              | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/aufs.cpp b/src/slave/containerizer/mesos/provisioner/backends/aufs.cpp
index c2cdd93..2eba552 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/aufs.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/aufs.cpp
@@ -251,8 +251,11 @@ Future<bool> AufsBackendProcess::destroy(
 
   foreach (const fs::MountInfoTable::Entry& entry, mountTable->entries) {
     if (entry.target == rootfs) {
-      // NOTE: This would fail if the rootfs is still in use.
-      Try<Nothing> unmount = fs::unmount(entry.target);
+      // NOTE: Use MNT_DETACH here so that if there are still
+      // processes holding files or directories in the rootfs, the
+      // unmount will still be successful. The kernel will cleanup the
+      // mount when the number of references reach zero.
+      Try<Nothing> unmount = fs::unmount(entry.target, MNT_DETACH);
       if (unmount.isError()) {
         return Failure(
             "Failed to destroy aufs-mounted rootfs '" + rootfs + "': " +
@@ -261,9 +264,16 @@ Future<bool> AufsBackendProcess::destroy(
 
       Try<Nothing> rmdir = os::rmdir(rootfs);
       if (rmdir.isError()) {
-        return Failure(
-            "Failed to remove rootfs mount point '" + rootfs + "': " +
-            rmdir.error());
+        // NOTE: Due to the use of MNT_DETACH above, it's possible
+        // that `rmdir` will fail with EBUSY if some other mounts in
+        // other mount namespaces are still on this mount point on
+        // some old kernel (https://lwn.net/Articles/570338/). No need
+        // to return a hard failure here because the directory will be
+        // removed later and re-attempted on agent recovery.
+        //
+        // TODO(jieyu): Consider only ignore EBUSY error.
+        LOG(ERROR) << "Failed to remove rootfs mount point "
+                   << "'" << rootfs << "': " << rmdir.error();
       }
 
       // Clean up tempDir used for image layer links.


[mesos] 01/08: Made overlay backend destroy more robust.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 33710c0af5662c53ee8b309b4474cfa005eac2c6
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:36 2018 -0700

    Made overlay backend destroy more robust.
    
    Use MNT_DETACH for `unmount` so that if there are still processes
    holding files or directories in the rootfs (e.g., regular filesystem
    indexing), the unmount will still be successful. The kernel will cleanup
    the mount when the number of references reach zero. Also, do not return
    hard failure if `rmdir` failed. It's also possible that `rmdir` returns
    EBUSY. See more details in MESOS-9196.
    
    Review: https://reviews.apache.org/r/68594/
---
 .../mesos/provisioner/backends/overlay.cpp           | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp b/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
index e32b991..b12b5b2 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/overlay.cpp
@@ -252,8 +252,11 @@ Future<bool> OverlayBackendProcess::destroy(
 
   foreach (const fs::MountInfoTable::Entry& entry, mountTable->entries) {
     if (entry.target == rootfs) {
-      // NOTE: This would fail if the rootfs is still in use.
-      Try<Nothing> unmount = fs::unmount(entry.target);
+      // NOTE: Use MNT_DETACH here so that if there are still
+      // processes holding files or directories in the rootfs, the
+      // unmount will still be successful. The kernel will cleanup the
+      // mount when the number of references reach zero.
+      Try<Nothing> unmount = fs::unmount(entry.target, MNT_DETACH);
       if (unmount.isError()) {
         return Failure(
             "Failed to destroy overlay-mounted rootfs '" + rootfs + "': " +
@@ -262,9 +265,16 @@ Future<bool> OverlayBackendProcess::destroy(
 
       Try<Nothing> rmdir = os::rmdir(rootfs);
       if (rmdir.isError()) {
-        return Failure(
-            "Failed to remove rootfs mount point '" + rootfs + "': " +
-            rmdir.error());
+        // NOTE: Due to the use of MNT_DETACH above, it's possible
+        // that `rmdir` will fail with EBUSY if some other mounts in
+        // other mount namespaces are still on this mount point on
+        // some old kernel (https://lwn.net/Articles/570338/). No need
+        // to return a hard failure here because the directory will be
+        // removed later and re-attempted on agent recovery.
+        //
+        // TODO(jieyu): Consider only ignore EBUSY error.
+        LOG(ERROR) << "Failed to remove rootfs mount point "
+                   << "'" << rootfs << "': " << rmdir.error();
       }
 
       // Clean up tempDir used for image layer links.


[mesos] 06/08: Added MESOS-9196 to 1.7.0 CHANGELOG.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit ee8b9176c748d277d80fe207ecc42886ebf866df
Author: Gilbert Song <so...@gmail.com>
AuthorDate: Sat Sep 1 01:07:09 2018 -0700

    Added MESOS-9196 to 1.7.0 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index e3d8b5a..5f2b814 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -237,6 +237,7 @@ All Resolved Issues:
   * [MESOS-9177] - Mesos master segfaults when responding to /state requests.
   * [MESOS-9185] - An attempt to remove or destroy container in composing containerizer leads to segfault.
   * [MESOS-9193] - Mesos build fail with Clang 3.5.
+  * [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
 
 ** Epic
   * [MESOS-8564] - Port libprocess-tests suites to Windows


[mesos] 07/08: Added MESOS-9196 to 1.6.2 CHANGELOG.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 1f3c4d5d20e5bc041f4155748a40c8ed87e5fba9
Author: Gilbert Song <so...@gmail.com>
AuthorDate: Sat Sep 1 01:07:39 2018 -0700

    Added MESOS-9196 to 1.6.2 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 5f2b814..6b7e5e2 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -358,6 +358,7 @@ Release Notes - Mesos - Version 1.6.2 (WIP)
   * [MESOS-9146] - Agent has a fragile burn-in 5s authentication timeout.
   * [MESOS-9147] - Agent and scheduler driver authentication retry backoff time could overflow.
   * [MESOS-9170] - Zookeeper doesn't compile with newer gcc due to format error.
+  * [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
 
 
 Release Notes - Mesos - Version 1.6.1


[mesos] 08/08: Added MESOS-9196 to 1.5.2 CHANGELOG.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 6f010b6b61433fd783e5619ed0cada298b8fe30e
Author: Gilbert Song <so...@gmail.com>
AuthorDate: Sat Sep 1 01:08:05 2018 -0700

    Added MESOS-9196 to 1.5.2 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 6b7e5e2..3bdad5a 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -775,6 +775,7 @@ Release Notes - Mesos - Version 1.5.2 (WIP)
   * [MESOS-9146] - Agent has a fragile burn-in 5s authentication timeout.
   * [MESOS-9147] - Agent and scheduler driver authentication retry backoff time could overflow.
   * [MESOS-9170] - Zookeeper doesn't compile with newer gcc due to format error.
+  * [MESOS-9196] - Removing rootfs mounts may fail with EBUSY.
 
 
 Release Notes - Mesos - Version 1.5.1


[mesos] 03/08: Made bind backend destroy more robust.

Posted by gi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

gilbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 576a455149cfe5721a3447a100548dae69b6d49a
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Fri Aug 31 22:29:45 2018 -0700

    Made bind backend destroy more robust.
    
    Use MNT_DETACH for `unmount` so that if there are still processes
    holding files or directories in the rootfs (e.g., regular filesystem
    indexing), the unmount will still be successful. The kernel will cleanup
    the mount when the number of references reach zero. See more details in
    MESOS-9196.
    
    Review: https://reviews.apache.org/r/68596/
---
 src/slave/containerizer/mesos/provisioner/backends/bind.cpp | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/slave/containerizer/mesos/provisioner/backends/bind.cpp b/src/slave/containerizer/mesos/provisioner/backends/bind.cpp
index 6ad97f9..7d564dc 100644
--- a/src/slave/containerizer/mesos/provisioner/backends/bind.cpp
+++ b/src/slave/containerizer/mesos/provisioner/backends/bind.cpp
@@ -195,8 +195,11 @@ Future<bool> BindBackendProcess::destroy(const string& rootfs)
     // to check `strings::startsWith(entry.target, rootfs)` here to
     // unmount all nested mounts.
     if (entry.target == rootfs) {
-      // NOTE: This would fail if the rootfs is still in use.
-      Try<Nothing> unmount = fs::unmount(entry.target);
+      // NOTE: Use MNT_DETACH here so that if there are still
+      // processes holding files or directories in the rootfs, the
+      // unmount will still be successful. The kernel will cleanup the
+      // mount when the number of references reach zero.
+      Try<Nothing> unmount = fs::unmount(entry.target, MNT_DETACH);
       if (unmount.isError()) {
         return Failure(
             "Failed to destroy bind-mounted rootfs '" + rootfs + "': " +