Posted to commits@mesos.apache.org by ji...@apache.org on 2018/08/14 21:07:39 UTC

[mesos] branch 1.6.x updated (ab1606c -> 557086b)

This is an automated email from the ASF dual-hosted git repository.

jieyu pushed a change to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


    from ab1606c  Added MESOS-9127 to 1.6.2 CHANGELOG.
     new 2bd73af  Made CNI isolator cleanup more robust.
     new cb2962f  Used state::checkpoint instead in CNI isolator.
     new 557086b  Added MESOS-9142 to 1.6.2 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG                                          |  1 +
 .../mesos/isolators/network/cni/cni.cpp            | 51 ++++++++++++++++------
 2 files changed, 38 insertions(+), 14 deletions(-)


[mesos] 02/03: Used state::checkpoint instead in CNI isolator.

Posted by ji...@apache.org.

jieyu pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit cb2962f18c4bdf959ec80c43b1c9cf6928f88733
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Mon Aug 13 15:53:33 2018 -0700

    Used state::checkpoint instead in CNI isolator.
    
    This is to ensure all-or-nothing semantics. We don't want to deal with a
    partially written file in case the agent crashes.
    
    Review: https://reviews.apache.org/r/68334
    (cherry picked from commit 64400867a984794320cd38e43dc694a863128f9e)
---
 .../containerizer/mesos/isolators/network/cni/cni.cpp | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp b/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
index 72ee161..106f062 100644
--- a/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
+++ b/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
@@ -44,6 +44,8 @@
 #include "linux/fs.hpp"
 #include "linux/ns.hpp"
 
+#include "slave/state.hpp"
+
 namespace io = process::io;
 namespace paths = mesos::internal::slave::cni::paths;
 namespace spec = mesos::internal::slave::cni::spec;
@@ -1263,13 +1265,14 @@ Future<Nothing> NetworkCniIsolatorProcess::attach(
       containerId,
       networkName);
 
-  Try<Nothing> write =
-    os::write(networkConfigPath, stringify(networkConfigJSON.get()));
+  Try<Nothing> checkpoint = slave::state::checkpoint(
+      networkConfigPath,
+      stringify(networkConfigJSON.get()));
 
-  if (write.isError()) {
+  if (checkpoint.isError()) {
     return Failure(
         "Failed to checkpoint the CNI network configuration '" +
-        stringify(networkConfigJSON.get()) + "': " + write.error());
+        stringify(networkConfigJSON.get()) + "': " + checkpoint.error());
   }
 
   VLOG(1) << "Invoking CNI plugin '" << plugin.get()
@@ -1384,11 +1387,13 @@ Future<Nothing> NetworkCniIsolatorProcess::_attach(
       networkName,
       containerNetwork.ifName);
 
-  Try<Nothing> write = os::write(networkInfoPath, output.get());
-  if (write.isError()) {
+  Try<Nothing> checkpoint =
+    slave::state::checkpoint(networkInfoPath, output.get());
+
+  if (checkpoint.isError()) {
     return Failure(
         "Failed to checkpoint the output of CNI plugin '" +
-        output.get() + "': " + write.error());
+        output.get() + "': " + checkpoint.error());
   }
 
   containerNetwork.cniNetworkInfo = parse.get();
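
The all-or-nothing behavior this commit relies on is commonly obtained by writing to a temporary file and then renaming it over the target. The sketch below illustrates that general technique only; it is not the code of `slave::state::checkpoint`, and `atomicWrite` and the `.tmp` suffix are hypothetical names chosen for illustration. It assumes a POSIX filesystem, where a rename within one filesystem replaces the target atomically.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not the Mesos API): writes `data` to `path` so that a
// reader sees either the previous contents or the complete new contents.
inline bool atomicWrite(
    const fs::path& path,
    const std::string& data,
    std::string* error)
{
  assert(error != nullptr);

  // Keep the temporary file next to the target so that the rename below
  // stays within one filesystem.
  fs::path temp = path;
  temp += ".tmp";

  {
    std::ofstream out(temp, std::ios::binary | std::ios::trunc);
    if (!out) {
      *error = "Failed to open '" + temp.string() + "'";
      return false;
    }

    out << data;
    out.flush();

    if (!out) {
      *error = "Failed to write '" + temp.string() + "'";
      std::error_code ignored;
      fs::remove(temp, ignored);
      return false;
    }
  } // The stream is closed here, before the rename.

  // Publish the fully written file under its final name.
  std::error_code ec;
  fs::rename(temp, path, ec);
  if (ec) {
    *error = "Failed to rename '" + temp.string() + "': " + ec.message();
    std::error_code ignored;
    fs::remove(temp, ignored);
    return false;
  }

  return true;
}
```

A crash before the rename leaves at most a stray `.tmp` file, never a truncated target. The sketch does not call `fsync`, so it makes no claim about durability across power loss.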


[mesos] 01/03: Made CNI isolator cleanup more robust.

Posted by ji...@apache.org.

jieyu pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 2bd73af62268c573bf4374e29e4a09b678f97dff
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Mon Aug 13 15:51:30 2018 -0700

    Made CNI isolator cleanup more robust.
    
    If the container is destroyed while in isolator preparing state, the
    cleanup might fail due to missing files or directories. This patch makes
    the cleanup path in CNI isolator more robust so that the cleanup does
    not fail in those scenarios.
    
    Review: https://reviews.apache.org/r/68333
    (cherry picked from commit 3c79314e24592b6bd82457249746500d27c4e072)
---
 .../mesos/isolators/network/cni/cni.cpp            | 32 +++++++++++++++++-----
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp b/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
index c4b549f..72ee161 100644
--- a/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
+++ b/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
@@ -1545,14 +1545,16 @@ Future<Nothing> NetworkCniIsolatorProcess::_cleanup(
               << target << "' for container " << containerId;
   }
 
-  Try<Nothing> rmdir = os::rmdir(containerDir);
-  if (rmdir.isError()) {
-    return Failure(
-        "Failed to remove the container directory '" +
-        containerDir + "': " + rmdir.error());
-  }
+  if (os::exists(containerDir)) {
+    Try<Nothing> rmdir = os::rmdir(containerDir);
+    if (rmdir.isError()) {
+      return Failure(
+          "Failed to remove the container directory '" +
+          containerDir + "': " + rmdir.error());
+    }
 
-  LOG(INFO) << "Removed the container directory '" << containerDir << "'";
+    LOG(INFO) << "Removed the container directory '" << containerDir << "'";
+  }
 
   infos.erase(containerId);
 
@@ -1596,6 +1598,22 @@ Future<Nothing> NetworkCniIsolatorProcess::detach(
       containerId,
       networkName);
 
+  // There are two cases that the network config file might not exist
+  // when `detach` happens:
+  // (1) The container is destroyed when preparing. In that case, we
+  //     know that `attach` hasn't been called. Therefore, no need to
+  //     call `detach`.
+  // (2) The agent crashes when isolating, but before the network
+  //     config file is checkpointed. In that case, we also know that
+  //     CNI ADD command hasn't been called. Therefore, no need to
+  //     call `detach`.
+  if (!os::exists(networkConfigPath)) {
+    LOG(WARNING) << "Skip detach since network config file for container "
+                 << containerId << " and network name '" << networkName << "' "
+                 << "does not exist";
+    return Nothing();
+  }
+
   Try<JSON::Object> networkConfigJSON = getNetworkConfigJSON(
       networkName,
       networkConfigPath);
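
The two guards added in this patch follow a common idempotent-cleanup pattern: treat "already gone" as success rather than as an error, and skip undo steps whose corresponding setup never ran. The standalone sketch below shows that pattern with `std::filesystem` instead of the stout `os::` helpers. The names `removeIfExists`, `shouldDetach` and `CleanupResult` are hypothetical and do not appear in Mesos.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

enum class CleanupResult { Removed, AlreadyGone, Failed };

// Hypothetical helper: removes `dir` recursively, treating a missing
// directory as a successful no-op.
inline CleanupResult removeIfExists(const fs::path& dir, std::string* error)
{
  assert(error != nullptr);

  std::error_code ec;
  if (!fs::exists(dir, ec)) {
    if (ec) {
      *error = "Failed to stat '" + dir.string() + "': " + ec.message();
      return CleanupResult::Failed;
    }
    return CleanupResult::AlreadyGone;
  }

  fs::remove_all(dir, ec);
  if (ec) {
    *error = "Failed to remove '" + dir.string() + "': " + ec.message();
    return CleanupResult::Failed;
  }

  return CleanupResult::Removed;
}

// Hypothetical helper: the undo step is only needed if the file that is
// checkpointed before the setup step exists.
inline bool shouldDetach(const fs::path& networkConfigPath)
{
  std::error_code ec;
  return fs::exists(networkConfigPath, ec);
}
```

Calling `removeIfExists` twice on the same directory returns `Removed` and then `AlreadyGone`, so a retried or early cleanup does not fail.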


[mesos] 03/03: Added MESOS-9142 to 1.6.2 CHANGELOG.

Posted by ji...@apache.org.

jieyu pushed a commit to branch 1.6.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 557086b49232b74581c399d37edb83bcb6f948d3
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Tue Aug 14 14:05:44 2018 -0700

    Added MESOS-9142 to 1.6.2 CHANGELOG.
    
    (cherry picked from commit e3c4c8e5977ba391e52061bf81fedf5447dc9f5f)
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index ecef4ae..892f49d 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -6,6 +6,7 @@ Release Notes - Mesos - Version 1.6.2 (WIP)
   * [MESOS-8418] - mesos-agent high cpu usage because of numerous /proc/mounts reads.
   * [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
   * [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
+  * [MESOS-9142] - CNI detach might fail due to missing network config file.
 
 
 Release Notes - Mesos - Version 1.6.1