Posted to issues@mesos.apache.org by "Shane da Silva (JIRA)" <ji...@apache.org> on 2016/05/24 00:10:12 UTC

[jira] [Commented] (MESOS-4248) mesos slave can't start in CentOS-7 docker container

    [ https://issues.apache.org/jira/browse/MESOS-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297391#comment-15297391 ] 

Shane da Silva commented on MESOS-4248:
---------------------------------------

FWIW, we can't reproduce this issue on Mesos 0.26.0, but we do hit it on 0.27.2 and 0.28.1.

It would be great if someone could review Yubao's patch and consider merging it, since being able to run Mesos in a container is convenient for integration testing. For example, in the Chef ecosystem it is common practice to use test-kitchen with kitchen-docker to quickly spin up pseudo-"VMs".

 [~liuyb]: did you by chance ever find a workaround for this issue?

> mesos slave can't start in CentOS-7 docker container
> ----------------------------------------------------
>
>                 Key: MESOS-4248
>                 URL: https://issues.apache.org/jira/browse/MESOS-4248
>             Project: Mesos
>          Issue Type: Bug
>          Components: slave
>    Affects Versions: 0.26.0
>         Environment: My host OS is Debian Jessie; the container OS is CentOS 7.2.
> {code}
> # cat /etc/system-release
> CentOS Linux release 7.2.1511 (Core) 
> # rpm -qa |grep mesos
> mesosphere-zookeeper-3.4.6-0.1.20141204175332.centos7.x86_64
> mesosphere-el-repo-7-1.noarch
> mesos-0.26.0-0.2.145.centos701406.x86_64
> $ docker version
> Client:
>  Version:      1.9.1
>  API version:  1.21
>  Go version:   go1.4.2
>  Git commit:   a34a1d5
>  Built:        Fri Nov 20 12:59:02 UTC 2015
>  OS/Arch:      linux/amd64
> Server:
>  Version:      1.9.1
>  API version:  1.21
>  Go version:   go1.4.2
>  Git commit:   a34a1d5
>  Built:        Fri Nov 20 12:59:02 UTC 2015
>  OS/Arch:      linux/amd64
> {code}
>            Reporter: Yubao Liu
>
> // See the "Environment" field above for the software versions involved.
> "systemctl start mesos-slave" fails to start mesos-slave:
> {code}
> # journalctl -u mesos-slave
> ....
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Started Mesos Slave.
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Starting Mesos Slave...
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210180 12838 logging.cpp:172] INFO level logging started!
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210603 12838 main.cpp:190] Build: 2015-12-16 23:06:16 by root
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210625 12838 main.cpp:192] Version: 0.26.0
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210634 12838 main.cpp:195] Git tag: 0.26.0
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210644 12838 main.cpp:199] Git SHA: d3717e5c4d1bf4fca5c41cd7ea54fae489028faa
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.210765 12838 containerizer.cpp:142] Using isolation: posix/cpu,posix/mem,filesystem/posix
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.215638 12838 linux_launcher.cpp:103] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.220279 12838 systemd.cpp:128] systemd version `219` detected
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: I1224 10:35:25.227017 12838 systemd.cpp:210] Started systemd slice `mesos_executors.slice`
> Dec 24 10:35:25 mesos-slave1 mesos-slave[12845]: Failed to create a containerizer: Could not create MesosContainerizer: Failed to create launcher: Failed to locate systemd cgroups hierarchy: does not exist
> Dec 24 10:35:25 mesos-slave1 systemd[1]: mesos-slave.service: main process exited, code=exited, status=1/FAILURE
> Dec 24 10:35:25 mesos-slave1 systemd[1]: Unit mesos-slave.service entered failed state.
> Dec 24 10:35:25 mesos-slave1 systemd[1]: mesos-slave.service failed.
> {code}
> I used strace to debug it. mesos-slave tried to access "/sys/fs/cgroup/systemd/mesos_executors.slice", but the directory is actually at "/sys/fs/cgroup/systemd/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope/mesos_executors.slice/". mesos-slave should check "/proc/self/cgroup" to find those intermediate directories:
> {code}
> # cat /proc/self/cgroup 
> 8:perf_event:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 7:blkio:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 6:net_cls,net_prio:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 5:freezer:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 4:devices:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 3:cpu,cpuacct:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 2:cpuset:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> 1:name=systemd:/system.slice/docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope
> {code}
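
The lookup suggested in the description above can be sketched roughly as follows. This is only an illustration, not the Mesos implementation and not Yubao's patch: the function names `systemd_cgroup_path` and `slice_directory` are hypothetical, and the sketch assumes the cgroup v1 layout shown in the "/proc/self/cgroup" output, with the `name=systemd` hierarchy mounted at "/sys/fs/cgroup/systemd".

```python
import os


def systemd_cgroup_path(cgroup_text):
    """Return this process's path in the name=systemd hierarchy, or None.

    cgroup_text is the content of /proc/self/cgroup, where each line has
    the form "<id>:<controllers>:<path>".
    """
    for line in cgroup_text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) == 3 and "name=systemd" in parts[1].split(","):
            return parts[2]
    return None


def slice_directory(cgroup_text,
                    slice_name="mesos_executors.slice",
                    root="/sys/fs/cgroup/systemd"):
    """Build the directory where the slice would appear for this process.

    root is an assumed mount point for the systemd hierarchy; the
    intermediate directories come from /proc/self/cgroup.
    """
    relative = systemd_cgroup_path(cgroup_text)
    if relative is None:
        return None
    return os.path.join(root, relative.lstrip("/"), slice_name)


# Sample input copied from the name=systemd line in the report above.
sample = (
    "1:name=systemd:/system.slice/"
    "docker-45875efce9019375cd0c5b29bb1a12275fb6033293f9bf3d97d774a1e5d4de52.scope\n"
)
print(slice_directory(sample))
```

Inside the container, the same function could be fed the real content of "/proc/self/cgroup" instead of the sample string. Outside a container, where the systemd path is typically shorter, the result falls back to a directory closer to the hierarchy root.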



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)