Posted to issues@mesos.apache.org by "James Peach (JIRA)" <ji...@apache.org> on 2017/09/25 21:09:00 UTC

[jira] [Comment Edited] (MESOS-3009) Reproduce systemd cgroup behavior

    [ https://issues.apache.org/jira/browse/MESOS-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16179773#comment-16179773 ] 

James Peach edited comment on MESOS-3009 at 9/25/17 9:08 PM:
-------------------------------------------------------------

With {{systemd-233}}, I see systemd removing the memory cgroup directory, which breaks the Mesos agent:
{noformat}
systemd(kernel.function("SyS_rmdir@fs/namei.c:3936")): /sys/fs/cgroup/memory
 0x7f7559be3c47 : rmdir+0x7/0x30 [/usr/lib64/libc-2.25.so]
 0x7f755b2fa169 : cg_trim+0x109/0x1f0 [/usr/lib/systemd/libsystemd-shared-233.so]
 0x7f755b2fc280 : cg_create_everywhere+0xa0/0xb0 [/usr/lib/systemd/libsystemd-shared-233.so]
 0x55d79d8bf861 : unit_realize_cgroup_now.lto_priv.582+0x101/0x23b0 [/usr/lib/systemd/systemd]
 0x55d79d8bfc88 : unit_realize_cgroup_now.lto_priv.582+0x528/0x23b0 [/usr/lib/systemd/systemd]
 0x55d79d8bfc88 : unit_realize_cgroup_now.lto_priv.582+0x528/0x23b0 [/usr/lib/systemd/systemd]
 0x55d79d8c1ce2 : unit_realize_cgroup+0x1d2/0x200 [/usr/lib/systemd/systemd]
 0x55d79d8a1daa : slice_start.lto_priv.202+0x2a/0x90 [/usr/lib/systemd/systemd]
 0x55d79d8b898c : job_perform_on_unit.lto_priv.583+0x5fc/0x6d0 [/usr/lib/systemd/systemd]
 0x55d79d85e3a8 : manager_dispatch_run_queue+0x258/0x640 [/usr/lib/systemd/systemd]
 0x7f755b33f8ca : source_dispatch+0x14a/0x380 [/usr/lib/systemd/libsystemd-shared-233.so]
 0x7f755b33fbca : sd_event_dispatch+0xca/0x1d0 [/usr/lib/systemd/libsystemd-shared-233.so]
 0x7f755b341007 : sd_event_run+0x77/0x200 [/usr/lib/systemd/libsystemd-shared-233.so]
 0x55d79d853384 : manager_loop+0x605/0x676 [/usr/lib/systemd/systemd]
 0x55d79d85b2b6 : main+0x39b6/0x4710 [/usr/lib/systemd/systemd]
 0x7f7559b0250a : __libc_start_main+0xea/0x1c0 [/usr/lib64/libc-2.25.so]
 0x55d79d85c05a : _start+0x2a/0x30 [/usr/lib/systemd/systemd]
{noformat}
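
A minimal sketch of how one might check for this condition from the outside, assuming cgroup v1 with the memory controller mounted at {{/sys/fs/cgroup/memory}} (the path in the trace above). It reads the {{hierarchy-ID:controller-list:cgroup-path}} lines of {{/proc/<pid>/cgroup}} and tests whether the corresponding directory still exists. The function names are hypothetical and not part of Mesos or systemd:

```python
import os


def memory_cgroup_dir(proc_cgroup_text, mount_root="/sys/fs/cgroup"):
    """Return the expected cgroup v1 directory for the memory controller.

    proc_cgroup_text is the content of /proc/<pid>/cgroup. mount_root is an
    assumption: the usual mount point of the cgroup v1 hierarchies.
    Returns None if no line lists the memory controller.
    """
    for line in proc_cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        if "memory" in parts[1].split(","):
            relative = parts[2].lstrip("/")
            return os.path.normpath(os.path.join(mount_root, "memory", relative))
    return None


def memory_cgroup_missing(pid="self"):
    """Return True if the process's memory cgroup directory no longer exists."""
    with open(f"/proc/{pid}/cgroup") as f:
        path = memory_cgroup_dir(f.read())
    return path is not None and not os.path.isdir(path)
```

Calling {{memory_cgroup_missing(pid)}} for the agent's pid before and after starting a slice unit would be one way to observe the removal, under the assumptions above.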


> Reproduce systemd cgroup behavior 
> ----------------------------------
>
>                 Key: MESOS-3009
>                 URL: https://issues.apache.org/jira/browse/MESOS-3009
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Artem Harutyunyan
>            Assignee: Joris Van Remoortere
>              Labels: mesosphere
>
> It has been noticed before that systemd reorganizes the cgroup hierarchy created by the Mesos slave. Because of this, Mesos is no longer able to find the cgroup, and there is also a chance of undoing the isolation that the Mesos slave puts in place.


