Posted to dev@mesos.apache.org by "Lin Zhao (JIRA)" <ji...@apache.org> on 2014/01/23 17:10:40 UTC

[jira] [Created] (MESOS-941) CPU limit not correctly set when executor launched with no memory resource

Lin Zhao created MESOS-941:
------------------------------

             Summary: CPU limit not correctly set when executor launched with no memory resource
                 Key: MESOS-941
                 URL: https://issues.apache.org/jira/browse/MESOS-941
             Project: Mesos
          Issue Type: Bug
          Components: slave
            Reporter: Lin Zhao


When a framework sets the memory resource only on its tasks, and none at the executor level, the slave fails to apply the memory control needed to limit the executor's memory usage. The executor process can then use more resident memory than the tasks specify.

Example framework: https://gist.github.com/lin-zhao/8544495. This framework was tested with Mesos 0.14.2 on CentOS 5.

According to Benjamin Mahler:

What's happening is that you're launching an executor with no resources. Consequently, before we fork, we attempt to update the memory control, but we don't call the memory handler because the executor has no memory resources:

I0121 19:39:01.660071  8566 cgroups_isolator.cpp:516] Launching default (/home/lin/test-executor) in /tmp/mesos/slaves/201312032357-3645772810-5050-2033-0/frameworks/201401171812-2907575306-5050-19011-0020/executors/default/runs/8bc2ab10-8988-4b22-afa2-3433bbedc3ed with resources  for framework 201401171812-2907575306-5050-19011-0020 in cgroup mesos/framework_201401171812-2907575306-5050-19011-0020_executor_default_tag_8bc2ab10-8988-4b22-afa2-3433bbedc3ed
I0121 19:39:01.663082  8566 cgroups_isolator.cpp:709] Changing cgroup controls for executor default of framework 201401171812-2907575306-5050-19011-0020 with resources 
I0121 19:39:01.667129  8566 cgroups_isolator.cpp:1163] Started listening for OOM events for executor default of framework 201401171812-2907575306-5050-19011-0020
I0121 19:39:01.681857  8566 cgroups_isolator.cpp:568] Forked executor at = 27609

Then, later, when we are updating the resources for your 128MB task, we set the soft limit, but we don't set the hard limit because the following buggy check is not satisfied:

  // Determine whether to set the hard limit. If this is the first
  // time (info->pid.isNone()), or we're raising the existing limit,
  // then we can update the hard limit safely. Otherwise, if we need
  // to decrease 'memory.limit_in_bytes' we may induce an OOM if too
  // much memory is in use. As a result, we only update the soft
  // limit when the memory reservation is being reduced. This is
  // probably okay if the machine has available resources.
  // TODO(benh): Introduce a MemoryWatcherProcess which monitors the
  // discrepancy between usage and soft limit and introduces a
  // "manual oom" if necessary.
  if (info->pid.isNone() || limit > currentLimit.get()) {
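The interaction described above can be replayed with a minimal, self-contained sketch. This is not the Mesos implementation: CgroupInfo, kUnlimited, updateMemory and replayReportedSequence are hypothetical names introduced for illustration only. The sketch also assumes that the cgroup's current hard limit is still the kernel's effectively unlimited default at the time of the task update, which the report does not state explicitly.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the isolator's per-executor state.
struct CgroupInfo {
  bool forked;         // plays the role of !info->pid.isNone()
  uint64_t hardLimit;  // models memory.limit_in_bytes
  uint64_t softLimit;  // models memory.soft_limit_in_bytes
};

// Assumption: a fresh cgroup starts with an effectively unlimited value.
const uint64_t kUnlimited =
    static_cast<uint64_t>(std::numeric_limits<int64_t>::max());

// Applies the same condition as the quoted check.
// Returns true if the hard limit was updated.
bool updateMemory(CgroupInfo& info, uint64_t limit) {
  info.softLimit = limit;  // the soft limit is always updated
  if (!info.forked || limit > info.hardLimit) {
    info.hardLimit = limit;
    return true;
  }
  return false;
}

// Replays the reported sequence and returns the resulting state.
CgroupInfo replayReportedSequence() {
  CgroupInfo info = {false, kUnlimited, kUnlimited};

  // Launch: the executor has no memory resource, so the memory
  // handler is never invoked and both limits keep their defaults.
  info.forked = true;  // the executor has now been forked

  // Later: the first task arrives with 128MB of memory.
  updateMemory(info, 128ULL * 1024 * 1024);
  return info;
}
```

Under these assumptions, replayReportedSequence() leaves softLimit at 128MB while hardLimit stays at kUnlimited: the executor is already forked, and 128MB is not greater than the default limit, so neither branch of the condition holds.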


