Posted to dev@mesos.apache.org by "brian wickman (JIRA)" <ji...@apache.org> on 2012/10/04 23:29:47 UTC

[jira] [Created] (MESOS-287) have cgroups isolation module oversubscribe in order to killTask rather than OOM

brian wickman created MESOS-287:
-----------------------------------

             Summary: have cgroups isolation module oversubscribe in order to killTask rather than OOM
                 Key: MESOS-287
                 URL: https://issues.apache.org/jira/browse/MESOS-287
             Project: Mesos
          Issue Type: Improvement
          Components: isolation
            Reporter: brian wickman


Right now, if the cgroup memory limit is set to exactly the memory specified in the ExecutorInfo, a task that exceeds it will probably be killed by the kernel OOM killer. That kill is out of your control and leaves your application in a state that's hard to diagnose.

What would be ideal is to set the memory limit of the cgroup to (1 + epsilon) * requested memory, where epsilon starts at something like 0.2 but varies continuously with the resources allocated on the box.  Epsilon might need to go to 0 as the box becomes heavily subscribed.
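
As a rough illustration of the arithmetic, here is a minimal Python sketch. The function names, the 0.2 ceiling as a default, and the linear taper are assumptions made for the example; the proposal only says that epsilon should vary with allocation and reach 0 on a heavily subscribed box.

```python
def epsilon_for(allocated_fraction, max_epsilon=0.2):
    """Hypothetical policy: headroom shrinks linearly as the box fills up.

    allocated_fraction is the share of the box's memory already handed
    out to executors, from 0.0 (empty) to 1.0 (fully subscribed).
    """
    if not 0.0 <= allocated_fraction <= 1.0:
        raise ValueError("allocated_fraction must be between 0 and 1")
    return max_epsilon * (1.0 - allocated_fraction)


def cgroup_limit_bytes(requested_bytes, allocated_fraction, max_epsilon=0.2):
    """Return (1 + epsilon) * requested memory, rounded to whole bytes."""
    epsilon = epsilon_for(allocated_fraction, max_epsilon)
    return round(requested_bytes * (1.0 + epsilon))


if __name__ == "__main__":
    requested = 512 * 1024 * 1024  # memory from the ExecutorInfo, in bytes
    for fraction in (0.0, 0.5, 1.0):
        print(fraction, cgroup_limit_bytes(requested, fraction))
```

On a fully subscribed box this sketch falls back to today's behavior, with the cgroup limit equal to the requested memory.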

The higher the epsilon, however, the higher the chance you observe memory overuse prior to the OOM kill. That allows you to invoke a killTask, which gives the executor the opportunity to run cleanup routines and the like.
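
The decision this headroom enables could look roughly like the following Python sketch. The function and the action labels are hypothetical and are not part of the Mesos isolation module; the sketch assumes the module periodically samples the cgroup's memory usage (for example from memory.usage_in_bytes under cgroups v1).

```python
def action_for(usage_bytes, requested_bytes, cgroup_limit_bytes):
    """Classify one memory usage sample (hypothetical helper).

    requested_bytes    -- memory specified in the ExecutorInfo
    cgroup_limit_bytes -- padded limit, (1 + epsilon) * requested_bytes
    """
    if cgroup_limit_bytes < requested_bytes:
        raise ValueError("cgroup limit must not be below the requested memory")
    if usage_bytes <= requested_bytes:
        return "ok"
    if usage_bytes < cgroup_limit_bytes:
        # Overuse seen inside the headroom: there is still time to
        # invoke killTask so the executor can run its cleanup.
        return "kill_task"
    # Usage reached the hard limit; the kernel OOM killer takes over.
    return "oom_imminent"


if __name__ == "__main__":
    requested, limit = 1000, 1200  # epsilon = 0.2
    for usage in (900, 1100, 1200):
        print(usage, action_for(usage, requested, limit))
```

With epsilon at 0 the middle band disappears, so overuse can no longer be observed before the kernel acts.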
