Posted to issues@mesos.apache.org by "Ian Downes (JIRA)" <ji...@apache.org> on 2015/02/27 21:50:04 UTC

[jira] [Created] (MESOS-2421) Processes can be stuck in D state and block destroy

Ian Downes created MESOS-2421:
---------------------------------

             Summary: Processes can be stuck in D state and block destroy
                 Key: MESOS-2421
                 URL: https://issues.apache.org/jira/browse/MESOS-2421
             Project: Mesos
          Issue Type: Bug
          Components: isolation
    Affects Versions: 0.21.1
         Environment: CentOS, 3.10 kernel
            Reporter: Ian Downes


We've observed processes getting stuck in D state (uninterruptible sleep) when using the cpu isolator. This prevents the MesosContainerizer launcher from killing all container processes and blocks destroying the container. 

It appears to be a kernel scheduler bug: the processes can be unstuck by modifying cpu.cfs_quota_us for the cpu cgroup. Doing so seems to let the processes run, receive the kill signal, and exit.

We should implement this workaround in the launcher destroy path when processes are observed to be in D state.
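
A rough sketch of what that workaround could look like, written in Python for brevity rather than as launcher code. It assumes a cgroup v1 cpu hierarchy, reads process state from /proc/<pid>/stat, and "modifies" the quota by writing -1 (no limit) to cpu.cfs_quota_us. The function names, the proc_root parameter, and the choice of -1 are illustrative assumptions, not part of Mesos or of this report.

```python
import os


def process_state(pid, proc_root="/proc"):
    """Return the one-letter state from <proc_root>/<pid>/stat, or None if gone."""
    try:
        with open(os.path.join(proc_root, str(pid), "stat")) as f:
            stat = f.read()
    except OSError:
        return None  # process already exited
    # The comm field is parenthesized and may itself contain spaces or
    # parentheses, so the state is the first token after the last ')'.
    return stat[stat.rindex(")") + 1:].split()[0]


def pids_in_d_state(cgroup_dir, proc_root="/proc"):
    """List pids in the cgroup that are in uninterruptible sleep (state D)."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        pids = [int(line) for line in f if line.strip()]
    return [pid for pid in pids if process_state(pid, proc_root) == "D"]


def lift_cfs_quota(cgroup_dir):
    """Write -1 to cpu.cfs_quota_us (assumed modification) and return the old value."""
    path = os.path.join(cgroup_dir, "cpu.cfs_quota_us")
    with open(path) as f:
        original = f.read().strip()
    with open(path, "w") as f:
        f.write("-1\n")
    return original


def unstick_for_destroy(cgroup_dir, proc_root="/proc"):
    """If any process is in D state, change the quota; return the stuck pids."""
    stuck = pids_in_d_state(cgroup_dir, proc_root)
    if stuck:
        lift_cfs_quota(cgroup_dir)
    return stuck
```

In a destroy path this would be called after the kill signal has been sent and some processes remain, for example unstick_for_destroy("/sys/fs/cgroup/cpu/mesos/<container-id>"), where the path is a placeholder for wherever the container's cpu cgroup is mounted. Since the container is being destroyed, the sketch does not restore the original quota.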


