Posted to commits@mesos.apache.org by jo...@apache.org on 2018/06/04 18:45:14 UTC

mesos git commit: Handled race condition when removing maintenance windows.

Repository: mesos
Updated Branches:
  refs/heads/master 351829123 -> 52660fe6a


Handled race condition when removing maintenance windows.

When executing the `Master::inverseOffer()` callback, it
could happen that the maintenance window the inverse offer
referred to had already been removed by a concurrent call
to the maintenance endpoint of Mesos.

In this case, we must not send out an inverse offer, because
having outstanding inverse offers for an agent without
any scheduled maintenance window will lead to a crash in
the allocator when attempting to remove this offer.
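
The guard described above can be sketched in isolation as follows.
This is a simplified illustration, not the Mesos code: `Machine`,
`Agent`, `hasUnavailability`, and `agentsToOffer` are hypothetical
stand-ins for the master's bookkeeping, and plain strings replace
the real ID types. It only shows the idea of re-checking, at the
time the callback runs, that the agent is active and that its
machine still has an unavailability scheduled.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the master's per-machine state.
struct Machine {
  bool hasUnavailability;  // True while a maintenance window is scheduled.
};

// Hypothetical stand-in for an agent known to the master.
struct Agent {
  std::string id;
  std::string machineId;
  bool active;
};

// Returns the IDs of agents for which an inverse offer should still be
// sent. Agents are skipped if they are inactive, or if their machine is
// unknown or no longer has an unavailability, which can happen when the
// window was removed after the callback was dispatched.
std::vector<std::string> agentsToOffer(
    const std::unordered_map<std::string, Machine>& machines,
    const std::vector<Agent>& agents)
{
  std::vector<std::string> result;

  for (const Agent& agent : agents) {
    if (!agent.active) {
      continue;  // Agent was deactivated or disconnected.
    }

    auto it = machines.find(agent.machineId);
    if (it == machines.end() || !it->second.hasUnavailability) {
      continue;  // Unavailability was revoked; drop the inverse offer.
    }

    result.push_back(agent.id);
  }

  return result;
}
```

In this sketch, dropping the offer at the last moment means no
outstanding inverse offer is ever recorded for an agent without a
maintenance window, which is the state the commit message identifies
as leading to the allocator crash.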

Review: https://reviews.apache.org/r/67403/


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/52660fe6
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/52660fe6
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/52660fe6

Branch: refs/heads/master
Commit: 52660fe6ac7a205e168f3e03cddbff6e7c0de813
Parents: 3518291
Author: Benno Evers <be...@mesosphere.com>
Authored: Mon Jun 4 11:29:49 2018 -0700
Committer: Joseph Wu <jo...@apache.org>
Committed: Mon Jun 4 11:29:49 2018 -0700

----------------------------------------------------------------------
 src/master/master.cpp | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/52660fe6/src/master/master.cpp
----------------------------------------------------------------------
diff --git a/src/master/master.cpp b/src/master/master.cpp
index f778e48..5db5a8d 100644
--- a/src/master/master.cpp
+++ b/src/master/master.cpp
@@ -9456,8 +9456,20 @@ void Master::inverseOffer(
     // before the slave was deactivated in the allocator.
     if (!slave->active) {
       LOG(INFO)
-        << "Master ignoring inverse offers because agent " << *slave
-        << " is " << (slave->connected ? "deactivated" : "disconnected");
+        << "Master ignoring inverse offers to framework " << *framework
+        << " because agent " << *slave << " is "
+        << (slave->connected ? "deactivated" : "disconnected");
+
+      continue;
+    }
+
+    // This could happen if the allocator dispatched `Master::inverseOffer`
+    // before the unavailability was removed in the master.
+    if (!machines.contains(slave->machineId) ||
+        !machines.at(slave->machineId).info.has_unavailability()) {
+      LOG(INFO)
+        << "Master dropping inverse offers to framework " << *framework
+        << " because agent " << *slave << " had its unavailability revoked.";
 
       continue;
     }