Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2017/08/05 02:55:06 UTC
Review Request 61445: Fixed a bug in the agent where a kill task is
dropped.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61445/
-----------------------------------------------------------
Review request for mesos, Anand Mazumdar and Vinod Kone.
Bugs: MESOS-7863
https://issues.apache.org/jira/browse/MESOS-7863
Repository: mesos
Description
-------
Currently there is an assumption that when a pending task is killed,
the framework will still be stored in the agent. However, this
assumption can be violated in two cases:
(1) Another pending task was killed, and 'Slave::run' removed the
framework because it looked idle: the pending tasks were empty,
since processing a kill removes the task from pending.
(2) The last executor terminated without tasks to send terminal
updates for, or the last terminated executor received its
last acknowledgement. At this point, we remove the framework
because it appears to have no pending tasks, since the killed
task was already removed from pending.
The fix is to leave tasks as pending but mark that they have been
killed. This fixes both cases.
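To illustrate the idea, here is a minimal sketch, not the actual patch. The types `PendingTask`, `Framework` and `Agent`, and the functions `killTask` and `runTask`, are hypothetical simplifications and do not match the real code in src/slave/slave.cpp. The point it shows is that a killed task stays in the pending map with a flag set, so the idle check keeps the framework alive until the run path has seen the kill.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a task the agent has accepted but not yet launched.
struct PendingTask {
  bool killed = false;  // set by a kill instead of erasing the entry
};

// Hypothetical, heavily simplified per-framework state.
struct Framework {
  std::map<std::string, PendingTask> pending;  // task ID -> pending task
  int executors = 0;                           // live executors

  // Killed-but-pending tasks still count, so the framework is not idle
  // until the run path has consumed them.
  bool idle() const { return pending.empty() && executors == 0; }
};

// Hypothetical stand-in for the agent.
struct Agent {
  std::map<std::string, Framework> frameworks;

  // Processing a kill: mark the pending task, do not remove it.
  void killTask(const std::string& frameworkId, const std::string& taskId) {
    auto f = frameworks.find(frameworkId);
    if (f == frameworks.end()) return;

    auto t = f->second.pending.find(taskId);
    if (t != f->second.pending.end()) t->second.killed = true;
  }

  // Continuation of the run path: returns true if the task should launch.
  // The task leaves pending only here, so the framework is still stored
  // when a killed task is finally handled.
  bool runTask(const std::string& frameworkId, const std::string& taskId) {
    auto f = frameworks.find(frameworkId);
    if (f == frameworks.end()) return false;

    auto t = f->second.pending.find(taskId);
    if (t == f->second.pending.end()) return false;

    const bool killed = t->second.killed;
    f->second.pending.erase(t);

    // Remove the framework only once nothing is pending any more.
    if (killed && f->second.idle()) frameworks.erase(f);

    return !killed;
  }
};

// Usage: two pending tasks are killed before either reaches the run path.
inline void example() {
  Agent agent;
  agent.frameworks["f1"].pending["t1"] = PendingTask();
  agent.frameworks["f1"].pending["t2"] = PendingTask();

  agent.killTask("f1", "t1");
  agent.killTask("f1", "t2");

  assert(!agent.runTask("f1", "t1"));         // killed, not launched
  assert(agent.frameworks.count("f1") == 1);  // t2 is still pending
  assert(!agent.runTask("f1", "t2"));
  assert(agent.frameworks.count("f1") == 0);  // now idle, removed
}
```

In this sketch, case (1) corresponds to the first `runTask` call leaving the framework in place because `t2` is still pending. Case (2) is covered by the same `idle()` check, assuming the executor termination path consults it before removing the framework.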
Diffs
-----
src/slave/slave.hpp 1fe93dab1b2bef24721cc1bcffebe1b259e96d79
src/slave/slave.cpp 7381530515f86faf4c3e8f82bcd9483f6cf0498b
Diff: https://reviews.apache.org/r/61445/diff/1/
Testing
-------
Ran make check; a test is added in the subsequent patch.
Thanks,
Benjamin Mahler