Posted to commits@mesos.apache.org by bm...@apache.org on 2015/12/10 03:49:16 UTC

[1/2] mesos git commit: Fixed a message dropping bug in the health checker.

Repository: mesos
Updated Branches:
  refs/heads/master 2e7df0868 -> 90771c45b


Fixed a message dropping bug in the health checker.

Much like in the command executor, we need to sleep after we send
the final message in the health checker. Otherwise, we may exit
before libprocess is able to finish sending the message over the
local network.

This led to the following issues:
https://issues.apache.org/jira/browse/MESOS-1613
https://issues.apache.org/jira/browse/MESOS-4106

Review: https://reviews.apache.org/r/41178
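
Illustration (not part of the commit): the race described above is
generic to any layer where send() only enqueues a message and a
background thread performs the write. The sketch below reproduces that
shape using only the C++ standard library. It does not use libprocess,
and AsyncSender, flush() and sendThenFlush() are hypothetical names
chosen for this example. It shows one deterministic alternative to a
fixed sleep, namely blocking until the queue has drained. Whether
libprocess offers an equivalent hook is not something this commit
states.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an asynchronous message layer: send() only
// enqueues, and a background thread performs the delivery later.
class AsyncSender {
public:
  AsyncSender() : worker_([this] { run(); }) {}

  ~AsyncSender() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    wake_.notify_all();
    worker_.join();
    // The worker only returns once the queue is empty.
    assert(pending_.empty());
  }

  // Returns immediately; the message may not have been delivered yet.
  void send(const std::string& message) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending_.push(message);
    }
    wake_.notify_all();
  }

  // Blocks until every queued message has been delivered.
  void flush() {
    std::unique_lock<std::mutex> lock(mutex_);
    drained_.wait(lock, [this] { return pending_.empty() && !busy_; });
  }

  // Number of messages delivered so far.
  std::size_t delivered() {
    std::lock_guard<std::mutex> lock(mutex_);
    return delivered_.size();
  }

private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      wake_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
      if (pending_.empty()) {
        return;  // Asked to stop and nothing is left to deliver.
      }
      std::string message = pending_.front();
      pending_.pop();
      busy_ = true;

      // Simulate a slow socket write outside the lock.
      lock.unlock();
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
      lock.lock();

      delivered_.push_back(message);
      busy_ = false;
      drained_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable wake_;
  std::condition_variable drained_;
  std::queue<std::string> pending_;
  std::vector<std::string> delivered_;
  bool stopping_ = false;
  bool busy_ = false;
  std::thread worker_;  // Declared last so it starts after the other members exist.
};

// Sends one message and waits for delivery before reporting the count.
std::size_t sendThenFlush() {
  AsyncSender sender;
  sender.send("TaskHealthStatus");
  sender.flush();
  return sender.delivered();
}
```

Without the flush() call, delivered() could be read while the worker is
still inside its simulated write, which is the same window the
os::sleep(Seconds(1)) in this commit is meant to cover.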


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/7aa7957f
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/7aa7957f
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/7aa7957f

Branch: refs/heads/master
Commit: 7aa7957fb9a5bce5a9d8ae5a2560d1bde97d1274
Parents: 2e7df08
Author: Benjamin Mahler <be...@gmail.com>
Authored: Wed Dec 9 17:43:27 2015 -0800
Committer: Benjamin Mahler <be...@gmail.com>
Committed: Wed Dec 9 18:47:49 2015 -0800

----------------------------------------------------------------------
 src/health-check/main.cpp        | 5 +++++
 src/tests/health_check_tests.cpp | 3 +--
 2 files changed, 6 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/7aa7957f/src/health-check/main.cpp
----------------------------------------------------------------------
diff --git a/src/health-check/main.cpp b/src/health-check/main.cpp
index 83ee38c..0beaed5 100644
--- a/src/health-check/main.cpp
+++ b/src/health-check/main.cpp
@@ -113,6 +113,11 @@ private:
     send(executor, taskHealthStatus);
 
     if (killTask) {
+      // This is a hack to ensure the message is sent to the
+      // executor before we exit the process. Without this,
+      // we may exit before libprocess has sent the data over
+      // the socket. See MESOS-4111.
+      os::sleep(Seconds(1));
       promise.fail(message);
     } else {
       reschedule();

http://git-wip-us.apache.org/repos/asf/mesos/blob/7aa7957f/src/tests/health_check_tests.cpp
----------------------------------------------------------------------
diff --git a/src/tests/health_check_tests.cpp b/src/tests/health_check_tests.cpp
index b1454b0..48f5bfb 100644
--- a/src/tests/health_check_tests.cpp
+++ b/src/tests/health_check_tests.cpp
@@ -630,8 +630,7 @@ TEST_F(HealthCheckTest, ROOT_DOCKER_DockerHealthStatusChange)
 
 
 // Testing killing task after number of consecutive failures.
-// Temporarily disabled due to MESOS-1613.
-TEST_F(HealthCheckTest, DISABLED_ConsecutiveFailures)
+TEST_F(HealthCheckTest, ConsecutiveFailures)
 {
   Try<PID<Master> > master = StartMaster();
   ASSERT_SOME(master);


[2/2] mesos git commit: Added a reference to MESOS-4111 in the command executor sleep hack.

Posted by bm...@apache.org.
Added a reference to MESOS-4111 in the command executor sleep hack.


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/90771c45
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/90771c45
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/90771c45

Branch: refs/heads/master
Commit: 90771c45b38bddd92e00ecc299d2de818a70eb27
Parents: 7aa7957
Author: Benjamin Mahler <be...@gmail.com>
Authored: Wed Dec 9 18:48:53 2015 -0800
Committer: Benjamin Mahler <be...@gmail.com>
Committed: Wed Dec 9 18:48:53 2015 -0800

----------------------------------------------------------------------
 src/launcher/executor.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/90771c45/src/launcher/executor.cpp
----------------------------------------------------------------------
diff --git a/src/launcher/executor.cpp b/src/launcher/executor.cpp
index f90ea01..09e7de6 100644
--- a/src/launcher/executor.cpp
+++ b/src/launcher/executor.cpp
@@ -524,8 +524,10 @@ private:
 
     driver->sendStatusUpdate(taskStatus);
 
-    // A hack for now ... but we need to wait until the status update
-    // is sent to the slave before we shut ourselves down.
+    // This is a hack to ensure the message is sent to the
+    // slave before we exit the process. Without this, we
+    // may exit before libprocess has sent the data over
+    // the socket. See MESOS-4111.
     os::sleep(Seconds(1));
     driver->stop();
   }