Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2017/08/05 02:55:06 UTC

Review Request 61445: Fixed a bug in the agent where a kill task is dropped.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61445/
-----------------------------------------------------------

Review request for mesos, Anand Mazumdar and Vinod Kone.


Bugs: MESOS-7863
    https://issues.apache.org/jira/browse/MESOS-7863


Repository: mesos


Description
-------

Currently there is an assumption that when a pending task is killed,
the framework will still be stored in the agent. However, this
assumption can be violated in two cases:

  (1) Another pending task was killed and we removed the framework
      in 'Slave::run' thinking it was idle, because pending tasks
      were empty (we remove from pending tasks when processing the
      kill).

  (2) The last executor terminated without tasks to send terminal
      updates for, or the last terminated executor received its
      last acknowledgement. At this point, we remove the framework,
      thinking there are no pending tasks, because the killed task
      was already removed from pending.

The fix is to leave tasks as pending but mark that they have been
killed. This fixes both cases.
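
For illustration only, the idea can be sketched as below. This is not
the code in the patch: 'Framework', 'killPendingTask' and
'resumeLaunch' are hypothetical, heavily simplified stand-ins for the
agent's real bookkeeping, and the "idle" check is an assumed
simplification of the condition under which the framework is removed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified model of the agent's per-framework state.
struct Framework {
  // Task ID -> true if a kill arrived while the task was still pending.
  std::map<std::string, bool> pending;
  int executors = 0;

  // Assumed removal condition: nothing pending and no executors left.
  bool idle() const { return pending.empty() && executors == 0; }
};

// Kill path: mark the pending task as killed instead of erasing it, so
// the framework does not look idle while the launch is still in flight.
// Returns false if the task is not pending.
bool killPendingTask(Framework& framework, const std::string& taskId) {
  auto it = framework.pending.find(taskId);
  if (it == framework.pending.end()) {
    return false;
  }
  it->second = true;
  return true;
}

// Launch path: the only place a task leaves 'pending'. Returns true if
// the task should still be launched, false if it was killed meanwhile.
bool resumeLaunch(Framework& framework, const std::string& taskId) {
  auto it = framework.pending.find(taskId);
  assert(it != framework.pending.end());
  const bool killed = it->second;
  framework.pending.erase(it);
  return !killed;
}
```

In this sketch a kill never shrinks 'pending', so a concurrent idle
check cannot remove the framework before the launch path has seen the
kill; the launch path alone decides when the entry goes away.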


Diffs
-----

  src/slave/slave.hpp 1fe93dab1b2bef24721cc1bcffebe1b259e96d79 
  src/slave/slave.cpp 7381530515f86faf4c3e8f82bcd9483f6cf0498b 


Diff: https://reviews.apache.org/r/61445/diff/1/


Testing
-------

Ran make check. A test is added in the subsequent patch.


Thanks,

Benjamin Mahler