Posted to commits@mesos.apache.org by ji...@apache.org on 2018/08/14 21:07:59 UTC

[mesos] branch 1.5.x updated (3ce0e64 -> 852e27a)

This is an automated email from the ASF dual-hosted git repository.

jieyu pushed a change to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


    from 3ce0e64  Added MESOS-9127 to 1.5.2 CHANGELOG.
     new e50a4c0  Made CNI isolator cleanup more robust.
     new f6bef21  Used state::checkpoint instead in CNI isolator.
     new 852e27a  Added MESOS-9142 to 1.5.2 CHANGELOG.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG                                          |  1 +
 .../mesos/isolators/network/cni/cni.cpp            | 53 ++++++++++++++++------
 2 files changed, 39 insertions(+), 15 deletions(-)


[mesos] 01/03: Made CNI isolator cleanup more robust.

Posted by ji...@apache.org.

jieyu pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit e50a4c065e82448038762c917179ca1fb3a14ee4
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Mon Aug 13 15:51:30 2018 -0700

    Made CNI isolator cleanup more robust.
    
    If the container is destroyed while in the isolator preparing state,
    the cleanup might fail due to missing files or directories. This patch
    makes the cleanup path in the CNI isolator more robust so that the
    cleanup does not fail in those scenarios.
    
    Review: https://reviews.apache.org/r/68333
    (cherry picked from commit 3c79314e24592b6bd82457249746500d27c4e072)
---
 .../mesos/isolators/network/cni/cni.cpp            | 32 +++++++++++++++++-----
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp b/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
index af1d477..0de336d 100644
--- a/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
+++ b/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
@@ -1520,14 +1520,16 @@ Future<Nothing> NetworkCniIsolatorProcess::_cleanup(
               << target << "' for container " << containerId;
   }
 
-  Try<Nothing> rmdir = os::rmdir(containerDir);
-  if (rmdir.isError()) {
-    return Failure(
-        "Failed to remove the container directory '" +
-        containerDir + "': " + rmdir.error());
-  }
+  if (os::exists(containerDir)) {
+    Try<Nothing> rmdir = os::rmdir(containerDir);
+    if (rmdir.isError()) {
+      return Failure(
+          "Failed to remove the container directory '" +
+          containerDir + "': " + rmdir.error());
+    }
 
-  LOG(INFO) << "Removed the container directory '" << containerDir << "'";
+    LOG(INFO) << "Removed the container directory '" << containerDir << "'";
+  }
 
   infos.erase(containerId);
 
@@ -1571,6 +1573,22 @@ Future<Nothing> NetworkCniIsolatorProcess::detach(
       containerId.value(),
       networkName);
 
+  // There are two cases that the network config file might not exist
+  // when `detach` happens:
+  // (1) The container is destroyed when preparing. In that case, we
+  //     know that `attach` hasn't been called. Therefore, no need to
+  //     call `detach`.
+  // (2) The agent crashes when isolating, but before the network
+  //     config file is checkpointed. In that case, we also know that
+  //     CNI ADD command hasn't been called. Therefore, no need to
+  //     call `detach`.
+  if (!os::exists(networkConfigPath)) {
+    LOG(WARNING) << "Skip detach since network config file for container "
+                 << containerId << " and network name '" << networkName << "' "
+                 << "does not exist";
+    return Nothing();
+  }
+
   Try<JSON::Object> networkConfigJSON = getNetworkConfigJSON(
       networkName,
       networkConfigPath);
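
The pattern in the patch above (treat a missing directory or a missing network config file as "nothing to clean up" rather than as a failure) can be illustrated outside of Mesos. The following is only a minimal sketch using std::filesystem, not the isolator's code: `Result`, `DetachAction`, `removeDirIfExists` and `planDetach` are hypothetical names invented for this example, and Mesos itself uses `os::exists`, `os::rmdir`, `Try<Nothing>` and `Failure` as shown in the diff.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for the Try<Nothing>/Failure pair used in the diff.
struct Result {
  bool ok;
  std::string error;
};

// Hypothetical decision type: should the (expensive) detach step run at all?
enum class DetachAction { Skip, Run };

// Removes `dir` recursively. A directory that is already gone is treated
// as success, so calling this twice in a row does not fail.
Result removeDirIfExists(const fs::path& dir) {
  assert(!dir.empty());

  std::error_code ec;
  if (!fs::exists(dir, ec) && !ec) {
    return {true, ""};  // Nothing to remove.
  }

  fs::remove_all(dir, ec);
  if (ec) {
    return {false,
            "Failed to remove the container directory '" + dir.string() +
            "': " + ec.message()};
  }

  return {true, ""};
}

// Skips detach when the network config file was never written, which is
// the situation both cases in the patch comment describe. If the existence
// check itself errors out, we choose to run detach so that the underlying
// problem is reported there instead of being silently ignored.
DetachAction planDetach(const fs::path& networkConfigPath) {
  std::error_code ec;
  const bool present = fs::exists(networkConfigPath, ec);
  return (present || ec) ? DetachAction::Run : DetachAction::Skip;
}
```

Note that `std::filesystem::remove_all` already tolerates a missing path on its own; the explicit existence check is kept here only to mirror the structure of the patch, where `os::rmdir` is guarded by `os::exists`.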


[mesos] 02/03: Used state::checkpoint instead in CNI isolator.

Posted by ji...@apache.org.

jieyu pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f6bef21962d8b8baf644e857096c2d75f211dfe7
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Mon Aug 13 15:53:33 2018 -0700

    Used state::checkpoint instead in CNI isolator.
    
    This is to ensure all-or-nothing semantics. We don't want to deal with a
    partially written file in case the agent crashes.
    
    Review: https://reviews.apache.org/r/68334
    (cherry picked from commit 64400867a984794320cd38e43dc694a863128f9e)
---
 .../mesos/isolators/network/cni/cni.cpp             | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp b/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
index 0de336d..5eaab09 100644
--- a/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
+++ b/src/slave/containerizer/mesos/isolators/network/cni/cni.cpp
@@ -14,6 +14,8 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.
 
+#include "slave/containerizer/mesos/isolators/network/cni/cni.hpp"
+
 #include <iostream>
 #include <list>
 #include <set>
@@ -40,7 +42,7 @@
 #include "linux/fs.hpp"
 #include "linux/ns.hpp"
 
-#include "slave/containerizer/mesos/isolators/network/cni/cni.hpp"
+#include "slave/state.hpp"
 
 namespace io = process::io;
 namespace paths = mesos::internal::slave::cni::paths;
@@ -1241,13 +1243,14 @@ Future<Nothing> NetworkCniIsolatorProcess::attach(
       containerId.value(),
       networkName);
 
-  Try<Nothing> write =
-    os::write(networkConfigPath, stringify(networkConfigJSON.get()));
+  Try<Nothing> checkpoint = slave::state::checkpoint(
+      networkConfigPath,
+      stringify(networkConfigJSON.get()));
 
-  if (write.isError()) {
+  if (checkpoint.isError()) {
     return Failure(
         "Failed to checkpoint the CNI network configuration '" +
-        stringify(networkConfigJSON.get()) + "': " + write.error());
+        stringify(networkConfigJSON.get()) + "': " + checkpoint.error());
   }
 
   VLOG(1) << "Invoking CNI plugin '" << plugin.get()
@@ -1362,11 +1365,13 @@ Future<Nothing> NetworkCniIsolatorProcess::_attach(
       networkName,
       containerNetwork.ifName);
 
-  Try<Nothing> write = os::write(networkInfoPath, output.get());
-  if (write.isError()) {
+  Try<Nothing> checkpoint =
+    slave::state::checkpoint(networkInfoPath, output.get());
+
+  if (checkpoint.isError()) {
     return Failure(
         "Failed to checkpoint the output of CNI plugin '" +
-        output.get() + "': " + write.error());
+        output.get() + "': " + checkpoint.error());
   }
 
   containerNetwork.cniNetworkInfo = parse.get();
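
The commit message above asks for all-or-nothing semantics when writing the checkpoint files. A common way to get that property is to write the content to a temporary file in the same directory and then rename it over the target. The sketch below shows that general technique under stated assumptions; it is not the implementation of `slave::state::checkpoint`, and the function name `checkpointAtomically` and the ".tmp" suffix are hypothetical choices made for this example.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Writes `content` to `path` so that a reader sees either the previous
// file (or no file) or the complete new content, never a partial write.
// Returns an empty string on success and an error message otherwise.
//
// Assumptions: the temporary file lives next to the target so the rename
// stays on one filesystem, and rename replaces the target atomically (as
// it does on POSIX systems). This sketch does not fsync, so it does not
// promise durability across power loss.
std::string checkpointAtomically(
    const fs::path& path,
    const std::string& content) {
  assert(path.has_filename());

  fs::path temp = path;
  temp += ".tmp";  // Hypothetical suffix for the temporary file.

  {
    std::ofstream out(temp, std::ios::binary | std::ios::trunc);
    if (!out) {
      return "Failed to open temporary file '" + temp.string() + "'";
    }

    out << content;
    out.flush();

    if (!out) {
      out.close();
      std::error_code ignored;
      fs::remove(temp, ignored);  // Best-effort cleanup.
      return "Failed to write temporary file '" + temp.string() + "'";
    }
  }  // The stream is closed here, before the rename.

  std::error_code ec;
  fs::rename(temp, path, ec);
  if (ec) {
    std::error_code ignored;
    fs::remove(temp, ignored);  // Best-effort cleanup.
    return "Failed to rename '" + temp.string() + "' to '" +
           path.string() + "': " + ec.message();
  }

  return "";
}
```

With this approach, a crash before the rename leaves at most a stray temporary file behind, while the target path either does not exist or still holds its previous complete content. That matches the reasoning in the first patch, where a missing network config file is taken to mean that the CNI ADD command was never invoked.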


[mesos] 03/03: Added MESOS-9142 to 1.5.2 CHANGELOG.

Posted by ji...@apache.org.

jieyu pushed a commit to branch 1.5.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit 852e27afb36bdebd822b89722d13df6ee73f139d
Author: Jie Yu <yu...@gmail.com>
AuthorDate: Tue Aug 14 14:05:56 2018 -0700

    Added MESOS-9142 to 1.5.2 CHANGELOG.
    
    (cherry picked from commit c633bf64d08be31affbd72889478ae2a2c4fa258)
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 70a12fd..3724929 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -22,6 +22,7 @@ Release Notes - Mesos - Version 1.5.2 (WIP)
   * [MESOS-9049] - Agent GC could unmount a dangling persistent volume multiple times.
   * [MESOS-9125] - Port mapper CNI plugin might fail with "Resource temporarily unavailable"
   * [MESOS-9127] - Port mapper CNI plugin might deadlock iptables on the agent.
+  * [MESOS-9142] - CNI detach might fail due to missing network config file.
 
 
 Release Notes - Mesos - Version 1.5.1