Posted to commits@mesos.apache.org by ab...@apache.org on 2019/05/01 10:26:31 UTC

[mesos] branch 1.4.x updated (1a76202 -> f05058d)

This is an automated email from the ASF dual-hosted git repository.

abudnik pushed a change to branch 1.4.x
in repository https://gitbox.apache.org/repos/asf/mesos.git.


    from 1a76202  Added MESOS-9159 and MESOS-9675 to the 1.4.4 CHANGELOG.
     new ca65f99  Removed the duplicate pid check in Docker containerizer.
     new f05058d  Added MESOS-9695 to the 1.4.4 CHANGELOG.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGELOG                          |  1 +
 src/slave/containerizer/docker.cpp | 27 ++++++---------------------
 2 files changed, 7 insertions(+), 21 deletions(-)


[mesos] 02/02: Added MESOS-9695 to the 1.4.4 CHANGELOG.

Posted by ab...@apache.org.

abudnik pushed a commit to branch 1.4.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit f05058dad7219950fd9bdc2748ab9c9a79d6e7f1
Author: Andrei Budnik <ab...@mesosphere.com>
AuthorDate: Wed May 1 12:25:30 2019 +0200

    Added MESOS-9695 to the 1.4.4 CHANGELOG.
---
 CHANGELOG | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG b/CHANGELOG
index 0f32baa..0ce1715 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -4,6 +4,7 @@ Release Notes - Mesos - Version 1.4.4 (WIP)
 
 ** Bug
   * [MESOS-9507] - Agent could not recover due to empty docker volume checkpointed files.
+  * [MESOS-9695] - Remove the duplicate pid check in Docker containerizer
 
 ** Improvement:
   * [MESOS-9159] - Support Foreign URLs in docker registry puller.


[mesos] 01/02: Removed the duplicate pid check in Docker containerizer.

Posted by ab...@apache.org.

abudnik pushed a commit to branch 1.4.x
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit ca65f991727494b9b78e64a19306231ac004289f
Author: Qian Zhang <zh...@gmail.com>
AuthorDate: Tue Apr 30 13:23:26 2019 +0200

    Removed the duplicate pid check in Docker containerizer.
    
    Review: https://reviews.apache.org/r/70561/
---
 src/slave/containerizer/docker.cpp | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)
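
Note on the removed check: the diff below deletes a map from container ID
to executor pid that `_recover` filled in while iterating over recovered
executors, failing recovery as soon as a pid was seen a second time. The
following is a minimal standalone sketch of that logic, not the Mesos code
itself. It assumes standard containers in place of Mesos's `hashmap`, plain
strings in place of `ContainerID`, and the hypothetical helper names
`containsValue` and `firstDuplicatePid`.

```cpp
#include <string>
#include <sys/types.h>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for `hashmap::containsValue` as used in the removed code:
// a linear scan over the values recorded so far.
static bool containsValue(
    const std::unordered_map<std::string, pid_t>& pids,
    pid_t pid)
{
  for (const auto& entry : pids) {
    if (entry.second == pid) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: walks (container ID, pid) pairs in recovery order
// and returns the ID of the first container whose pid was already recorded
// for an earlier container. Returns an empty string if all pids differ.
std::string firstDuplicatePid(
    const std::vector<std::pair<std::string, pid_t>>& executors)
{
  std::unordered_map<std::string, pid_t> pids;

  for (const auto& executor : executors) {
    if (containsValue(pids, executor.second)) {
      // The removed code returned a `Failure` from `_recover` here.
      return executor.first;
    }
    pids[executor.first] = executor.second;
  }

  return "";
}
```

In this sketch, passing {"c1", 100}, {"c2", 200}, {"c3", 100} reports "c3",
which corresponds to the point where the removed code aborted recovery with
a "Detected duplicate pid" failure.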

diff --git a/src/slave/containerizer/docker.cpp b/src/slave/containerizer/docker.cpp
index 31a47f0..97cd75e 100644
--- a/src/slave/containerizer/docker.cpp
+++ b/src/slave/containerizer/docker.cpp
@@ -925,10 +925,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
       }
     }
 
-    // Collection of pids that we've started reaping in order to
-    // detect very unlikely duplicate scenario (see below).
-    hashmap<ContainerID, pid_t> pids;
-
     foreachvalue (const FrameworkState& framework, state->frameworks) {
       foreachvalue (const ExecutorState& executor, framework.executors) {
         if (executor.info.isNone()) {
@@ -1007,9 +1003,12 @@ Future<Nothing> DockerContainerizerProcess::_recover(
 
         // Only reap the executor process if the executor can be connected
         // otherwise just set `container->status` to `None()`. This is to
-        // avoid reaping an irrelevant process, e.g., after the agent host is
-        // rebooted, the executor pid happens to be reused by another process.
-        // See MESOS-8125 for details.
+        // avoid reaping an irrelevant process, e.g., agent process is stopped
+        // for a long time, and during this time executor terminates and its
+        // pid happens to be reused by another irrelevant process. When agent
+        // is restarted, it still considers this executor not complete (i.e.,
+        // `run->completed` is false), so we would reap the irrelevant process
+        // if we do not check whether that process can be connected.
         // Note that if both the pid and the port of the executor are reused
         // by another process or two processes respectively after the agent
         // host reboots we will still reap an irrelevant process, but that
@@ -1045,20 +1044,6 @@ Future<Nothing> DockerContainerizerProcess::_recover(
         container->status.future().get()
           .onAny(defer(self(), &Self::reaped, containerId));
 
-        if (pids.containsValue(pid)) {
-          // This should (almost) never occur. There is the
-          // possibility that a new executor is launched with the same
-          // pid as one that just exited (highly unlikely) and the
-          // slave dies after the new executor is launched but before
-          // it hears about the termination of the earlier executor
-          // (also unlikely).
-          return Failure(
-              "Detected duplicate pid " + stringify(pid) +
-              " for container " + stringify(containerId));
-        }
-
-        pids.put(containerId, pid);
-
         const string sandboxDirectory = paths::getExecutorRunPath(
             flags.work_dir,
             state->id,