Posted to dev@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2013/10/22 10:45:42 UTC

[jira] [Created] (MESOS-759) The cgroups TaskKiller should skip freezing the cgroup if it is already empty.

Benjamin Mahler created MESOS-759:
-------------------------------------

             Summary: The cgroups TaskKiller should skip freezing the cgroup if it is already empty.
                 Key: MESOS-759
                 URL: https://issues.apache.org/jira/browse/MESOS-759
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 0.14.1, 0.14.0, 0.13.0
            Reporter: Benjamin Mahler
            Assignee: Vinod Kone
            Priority: Critical
             Fix For: 0.15.0


The current TasksKiller code always freezes the cgroup when trying to kill the tasks in the cgroup:

  void killTasks() {
    // Chain together the steps needed to kill the tasks. Note that we
    // ignore the return values of freeze, kill, and thaw because,
    // provided there are no errors, we'll just retry the chain as
    // long as tasks still exist.
    chain = kill(SIGSTOP)                        // Send stop signal to all tasks.
      .then(defer(self(), &Self::kill, SIGKILL)) // Now send kill signal.
      .then(defer(self(), &Self::empty))         // Wait until cgroup is empty.
      .then(defer(self(), &Self::freeze))        // Freeze cgroup.
      .then(defer(self(), &Self::kill, SIGKILL)) // Send kill signal to any remaining tasks.
      .then(defer(self(), &Self::thaw))          // Thaw cgroup to deliver signals.
      .then(defer(self(), &Self::empty));        // Wait until cgroup is empty.


The TasksKiller should avoid freezing the cgroup when it is already empty. We've seen instances where the cgroup is unfreezable; because we retry this procedure upon failure, the killer then enters a loop, repeatedly attempting to freeze the cgroup.
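
As a rough illustration of the proposed behavior (not the actual fix), here is a minimal, self-contained sketch that models one kill attempt without libprocess. FakeCgroup, its fields, and this standalone killTasks are all hypothetical stand-ins and not Mesos APIs; the only point is the early return that skips the freeze step once the cgroup is empty.

```cpp
#include <cassert>  // Only needed for the usage checks described below.
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a cgroup; none of these names are Mesos APIs.
struct FakeCgroup {
  std::set<int> pids;      // Tasks currently in the cgroup.
  std::set<int> stubborn;  // Tasks that survive the first SIGKILL
                           // (assumed to be a subset of pids).
  bool freezable = true;   // false models the "unfreezable" cgroup.
};

// One kill attempt. Appends each step taken to `steps` and returns true
// if the cgroup ended up empty, false if the caller would have to retry.
bool killTasks(FakeCgroup& cgroup, std::vector<std::string>& steps) {
  steps.push_back("kill(SIGSTOP)");
  steps.push_back("kill(SIGKILL)");
  cgroup.pids = cgroup.stubborn;  // In this toy model only these remain.

  // The proposed check: nothing left to kill, so skip the freezer.
  if (cgroup.pids.empty()) {
    return true;
  }

  steps.push_back("freeze");
  if (!cgroup.freezable) {
    return false;  // Freeze failed; the real code retries the procedure.
  }

  steps.push_back("kill(SIGKILL)");
  cgroup.pids.clear();  // Toy model: frozen tasks are always killed.
  steps.push_back("thaw");
  return true;
}
```

In this model, an unfreezable cgroup whose tasks all die on the first SIGKILL completes in a single attempt and never reaches the freeze step. Only a cgroup that is both unfreezable and still non-empty reports a failure to the caller.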



--
This message was sent by Atlassian JIRA
(v6.1#6144)