Posted to issues@mesos.apache.org by "Chengwei Yang (JIRA)" <ji...@apache.org> on 2015/09/22 10:45:04 UTC

[jira] [Comment Edited] (MESOS-3488) /sys/fs/cgroup/memory/mesos missing after running a while

    [ https://issues.apache.org/jira/browse/MESOS-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14902211#comment-14902211 ] 

Chengwei Yang edited comment on MESOS-3488 at 9/22/15 8:44 AM:
---------------------------------------------------------------

I set up auditd and added an auditctl watch on /sys/fs/cgroup/memory/mesos, as shown below:

# auditctl -l
-w /sys/fs/cgroup/memory/mesos -p rwxa

Then I left mesos-slave running for about 1 hour.

When I checked again, /sys/fs/cgroup/memory/mesos was gone, and I saw the logs below in /var/log/audit/audit.log:

type=ANOM_PROMISCUOUS msg=audit(1442908163.322:924): dev=vethff752e0 prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
type=SYSCALL msg=audit(1442908165.127:925): arch=c000003e syscall=84 success=yes exit=0 a0=7f59840088c8 a1=7f5993e65730 a2=0 a3=0 items=2 ppid=1 pid=12682 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="mesos-slave" exe="/usr/local/sbin/mesos-slave" key=(null)
type=CWD msg=audit(1442908165.127:925):  cwd="/"
type=PATH msg=audit(1442908165.127:925): item=0 name="/sys/fs/cgroup/memory/mesos/" inode=112987 dev=00:18 mode=040755 ouid=0 ogid=0 rdev=00:00 objtype=PARENT
type=PATH msg=audit(1442908165.127:925): item=1 name="/sys/fs/cgroup/memory/mesos/be0aa404-c348-491b-ae58-518f306716ef" inode=121617 dev=00:18 mode=040755 ouid=0 ogid=0 rdev=00:00 objtype=DELETE
type=USER_ACCT msg=audit(1442908201.173:926): pid=1352 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_localuser acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=USER_ACCT msg=audit(1442908201.173:927): pid=1353 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_localuser acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_ACQ msg=audit(1442908201.173:928): pid=1352 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=USER_ACCT msg=audit(1442908201.173:929): pid=1354 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_access,pam_unix,pam_localuser acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=LOGIN msg=audit(1442908201.173:930): pid=1352 uid=0 old-auid=4294967295 auid=0 old-ses=4294967295 ses=291 res=1
type=CRED_ACQ msg=audit(1442908201.174:931): pid=1353 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_ACQ msg=audit(1442908201.174:932): pid=1354 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=LOGIN msg=audit(1442908201.174:933): pid=1353 uid=0 old-auid=4294967295 auid=0 old-ses=4294967295 ses=292 res=1
type=LOGIN msg=audit(1442908201.174:934): pid=1354 uid=0 old-auid=4294967295 auid=0 old-ses=4294967295 ses=293 res=1
type=CONFIG_CHANGE msg=audit(1442908201.201:935): auid=4294967295 ses=4294967295 op="updated rules" path="/sys/fs/cgroup/memory/mesos" key=(null) list=4 res=1
type=CONFIG_CHANGE msg=audit(1442908201.201:936): auid=4294967295 ses=4294967295 op="updated rules" path="/sys/fs/cgroup/memory/mesos" key=(null) list=4 res=1
type=USER_START msg=audit(1442908201.209:937): pid=1352 uid=0 auid=0 ses=291 msg='op=PAM:session_open grantors=pam_loginuid,pam_keyinit,pam_limits,pam_systemd acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=USER_START msg=audit(1442908201.209:938): pid=1354 uid=0 auid=0 ses=293 msg='op=PAM:session_open grantors=pam_loginuid,pam_keyinit,pam_limits,pam_systemd acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=CRED_REFR msg=audit(1442908201.210:939): pid=1352 uid=0 auid=0 ses=291 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'


I didn't find any DELETE for /sys/fs/cgroup/memory/mesos itself in the log, and I don't understand what the last two "CONFIG_CHANGE" records for /sys/fs/cgroup/memory/mesos mean. I also checked that *auditctl* still shows the rule.
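To double-check that no DELETE record was missed, the log can be scanned mechanically. Below is a minimal Python sketch, assuming the record layout seen in the excerpt above (space-separated key=value fields and a shared msg=audit(timestamp:serial) stamp per event). The helper names `parse_record` and `find_deletes` are made up for illustration and are not part of auditd or Mesos. For reference, syscall=84 on arch=c000003e (x86_64) is rmdir, so event 925 is mesos-slave removing a container cgroup under the watched directory, not the directory itself.

```python
import re
from collections import defaultdict

# All records belonging to one audit event share the same
# "msg=audit(<timestamp>:<serial>)" stamp; the serial is used as the event ID.
EVENT_RE = re.compile(r"msg=audit\((\d+\.\d+):(\d+)\)")
# key=value fields; the value is either double-quoted or a run of non-space characters.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_record(line):
    """Parse one audit.log line into a dict of fields, or None if it is not a record."""
    stamp = EVENT_RE.search(line)
    if stamp is None:
        return None
    fields = {key: value.strip('"') for key, value in FIELD_RE.findall(line)}
    fields["event"] = stamp.group(2)
    return fields


def find_deletes(lines):
    """Return (event, comm, path) for every PATH record with objtype=DELETE.

    comm is taken from the SYSCALL record of the same event, if there is one.
    """
    events = defaultdict(list)
    for line in lines:
        record = parse_record(line)
        if record is not None:
            events[record["event"]].append(record)

    deletes = []
    for event, records in events.items():
        comm = next(
            (r.get("comm") for r in records if r.get("type") == "SYSCALL"), None
        )
        for r in records:
            if r.get("type") == "PATH" and r.get("objtype") == "DELETE":
                deletes.append((event, comm, r.get("name")))
    return deletes
```

Passing the lines of /var/log/audit/audit.log to `find_deletes` lists every deletion seen by the watch together with the command that issued it. If /sys/fs/cgroup/memory/mesos itself never shows up as a deleted path, then the removal was not recorded by this rule.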


> /sys/fs/cgroup/memory/mesos missing after running a while
> ---------------------------------------------------------
>
>                 Key: MESOS-3488
>                 URL: https://issues.apache.org/jira/browse/MESOS-3488
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>         Environment: mesos 0.21.0 on CentOS 7.1
>            Reporter: Chengwei Yang
>
> I set up Mesos 0.21.0 on CentOS 7.1 using the mesos-0.21.0 rpm downloaded from Mesosphere.
> At first it works fine and jobs finish correctly; however, after running for a while, all tasks go to LOST.
> There is nothing in the **sandbox**, and I see the following in mesos-slave.ERROR:
> ```
> E0922 14:02:31.329264  8336 slave.cpp:2787] Container '865c9a1c-3abe-4263-921e-8d0be2f0a56d' for executor '21bf1400-6456-45e5-8f28-fe5ade7e7bfd' of framework '20150401-105258-3755085578-5050-14676-0000' failed to start: Failed to prepare isolator: Failed to create directory '/sys/fs/cgroup/memory/mesos/865c9a1c-3abe-4263-921e-8d0be2f0a56d': No such file or directory
> ```
> I checked that /sys/fs/cgroup/cpu/mesos does exist while /sys/fs/cgroup/memory/mesos is missing.
> Since Mesos works fine at first (and I checked the source code: if creating /sys/fs/cgroup/memory/mesos fails at mesos-slave startup, it logs that error), I'm curious when, and by what, /sys/fs/cgroup/memory/mesos was removed.
> Has anyone seen this issue before?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)