Posted to issues@mesos.apache.org by "haosdent (JIRA)" <ji...@apache.org> on 2016/10/19 17:59:58 UTC

[jira] [Updated] (MESOS-6414) Task cleanup fails when the containers includes cgroups not owned by Mesos

     [ https://issues.apache.org/jira/browse/MESOS-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

haosdent updated MESOS-6414:
----------------------------
    Description: 
Currently, if we launch a docker container in the Mesos containerizer, a race may occur
between the docker daemon and the Mesos containerizer during cgroups operations.
For example, when a docker container running in the Mesos containerizer exits due to OOM,
the Mesos containerizer would destroy the following hierarchies:

{code}
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>/<docker-cgroup>
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>
{code}

But the docker daemon may destroy 

{code}
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>/<docker-cgroup>
{code}

at the same time.

If the docker daemon destroys the hierarchy first, then the Mesos containerizer would
fail during {{CgroupsIsolatorProcess::cleanup}} because it could not find that hierarchy
when destroying it.
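
One way to make such a cleanup tolerant of the race is to treat "already removed" as success rather than as an error. The following is only a minimal sketch in Python, not the Mesos implementation or the fix adopted for this issue; the function name {{remove_cgroups}} and its return value are hypothetical, and it assumes the caller passes the cgroup directories deepest first.

```python
import os
from typing import Iterable, List


def remove_cgroups(paths: Iterable[str]) -> List[str]:
    """Remove each directory in `paths` (hypothetical helper, deepest first).

    Returns the paths that had already disappeared when we tried to
    remove them, so the caller can log them instead of failing.
    """
    already_gone = []
    for path in paths:
        try:
            os.rmdir(path)
        except FileNotFoundError:
            # Another process (for example the docker daemon) removed this
            # directory first. Treat that as success and keep going so the
            # parent directory is still cleaned up.
            already_gone.append(path)
    return already_gone
```

Called with the nested docker cgroup path first and the Mesos cgroup path second, this would still remove the parent even if the nested directory vanished in the meantime. Any other error (for example a cgroup that still contains processes) is left to propagate.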

  was:
If a Mesos task is launched in a cgroup outside of the context of Mesos, Mesos is unaware of that cgroup created in the task context.

When the Mesos task terminates, Mesos tries to clean up all cgroups within the top-level cgroup it knows about. If the cgroup created in the task context exists when LinuxLauncherProcess::destroy() is called but is cleaned up by the container before we do a freeze(), thaw(), or remove(), the call fails at that stage, leading to an incomplete cleanup of the container.


> Task cleanup fails when the containers includes cgroups not owned by Mesos
> --------------------------------------------------------------------------
>
>                 Key: MESOS-6414
>                 URL: https://issues.apache.org/jira/browse/MESOS-6414
>             Project: Mesos
>          Issue Type: Bug
>          Components: cgroups
>            Reporter: Anindya Sinha
>            Assignee: Anindya Sinha
>            Priority: Minor


