Posted to user@mesos.apache.org by Tinco Andringa <ma...@tinco.nl> on 2014/12/12 10:55:40 UTC

DockerContainerizer error on two slaves

Hi, I'm provisioning a mesos cluster and on two of my machines I get the
following error when starting mesos-slave:

root@web1:~# /usr/local/sbin/mesos-slave --master=zk://localhost:2181/mesos \
    --log_dir=/var/log/mesos --isolation=cgroups/cpu,cgroups/mem \
    --containerizers=docker,mesos --executor_registration_timeout=5mins \
    --work_dir=/var/run/work
I1212 10:46:30.782308 32590 logging.cpp:172] INFO level logging started!
I1212 10:46:30.782580 32590 main.cpp:142] Build: 2014-11-22 05:29:13 by root
I1212 10:46:30.782615 32590 main.cpp:144] Version: 0.21.0
I1212 10:46:30.782640 32590 main.cpp:147] Git tag: 0.21.0
I1212 10:46:30.782665 32590 main.cpp:151] Git SHA: ab8fa655d34e8e15a4290422df38a18db1c09b5b
Failed to create a containerizer: Could not create DockerContainerizer:
Failed to find a mounted cgroups hierarchy for the 'cpu' subsystem; you
probably need to mount cgroups manually!

I have four machines in total. On two of them, db1 and db2, everything runs
fine and the slaves get added to the cluster. On web1 and web2 it fails and
they don't appear in the cluster. I run the exact same command on each of
the machines. Mesos-master runs fine on web1, web2 and db1.

Obviously there's some difference between the web and db machines, but I'm
really unclear on what that difference is specifically. Most of my chef
scripts are run on both types of machine; there's just some extra webserver
stuff on the web machines, and some extra db stuff on the db machines. Db2
is the only node that doesn't run zookeeper or mesos-master.

Any hints or tips to get closer to the root of the problem would be much
appreciated. I'm not afraid to dive into the source a little if necessary.

Kind regards,
Tinco

Re: DockerContainerizer error on two slaves

Posted by Tim Chen <ti...@mesosphere.io>.
Hi Tinco,

What OS/environment are you running mesos-slave on? You might need to
enable cgroups if they are not enabled/mounted by default.

Tim

On Tue, Dec 16, 2014 at 11:26 AM, Ian Downes <id...@twitter.com> wrote:
>
> Can you also please post the output of these commands for a working and a
> non-working host?
>
> $ cat /proc/cgroups
>
> $ cat /proc/mounts
>
> Are you running inside a Docker or systemd container?

Re: DockerContainerizer error on two slaves

Posted by Ian Downes <id...@twitter.com>.
Can you also please post the output of these commands for a working and a
non-working host?

$ cat /proc/cgroups

$ cat /proc/mounts

Are you running inside a Docker or systemd container?
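
A minimal sketch of how the two files above could be checked in one go,
assuming a cgroup v1 layout. The function names are hypothetical helpers
for illustration and are not part of Mesos. Running it on a working and a
non-working host should show whether the 'cpu' subsystem is enabled in the
kernel but simply not mounted:

```shell
#!/bin/sh
# Hypothetical helpers (not part of Mesos); assume cgroup v1.

# subsystem_enabled NAME [FILE]: succeed if NAME is listed as enabled in
# /proc/cgroups (columns: subsys_name hierarchy num_cgroups enabled).
subsystem_enabled() {
    awk -v s="$1" '$1 == s && $4 == 1 { found = 1 } END { exit !found }' \
        "${2:-/proc/cgroups}"
}

# subsystem_mount NAME [FILE]: print the mount point of the first cgroup
# hierarchy in /proc/mounts that lists NAME among its mount options.
subsystem_mount() {
    awk -v s="$1" '$3 == "cgroup" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) if (opts[i] == s) { print $2; exit }
    }' "${2:-/proc/mounts}"
}

# check_subsystem NAME [CGROUPS_FILE] [MOUNTS_FILE]: one-line summary.
check_subsystem() {
    if ! subsystem_enabled "$1" "${2:-/proc/cgroups}"; then
        echo "$1: not enabled in kernel"
    elif m=$(subsystem_mount "$1" "${3:-/proc/mounts}") && [ -n "$m" ]; then
        echo "$1: mounted at $m"
    else
        echo "$1: enabled but not mounted"
    fi
}
```

For example, `check_subsystem cpu; check_subsystem memory` could be run on
db1 and on web1 and the two outputs compared.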

On Tue, Dec 16, 2014 at 11:22 AM, Benjamin Mahler <benjamin.mahler@gmail.com> wrote:
>
> +Tim Chen (please chime in if I'm missing something)
>
> Sorry for the delay. From a quick glance it looks like the
> DockerContainerizer is a bit less liberal in setting up cgroups if
> they are not mounted on the machine. I'm curious, if you remove "docker"
> from the containerizers flag, does it work?
>
> Otherwise, you can try mounting the cgroups manually, as suggested by the
> error message.
>
> Feel free to file a ticket to capture this!
>
> Hope this helps,
> Ben

Re: DockerContainerizer error on two slaves

Posted by Benjamin Mahler <be...@gmail.com>.
+Tim Chen (please chime in if I'm missing something)

Sorry for the delay. From a quick glance it looks like the
DockerContainerizer is a bit less liberal in setting up cgroups if
they are not mounted on the machine. I'm curious, if you remove "docker"
from the containerizers flag, does it work?

Otherwise, you can try mounting the cgroups manually, as suggested by the
error message.
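
As a rough sketch of what mounting manually could look like on a cgroup v1
system: the helper below only prints the commands so they can be reviewed
before being run as root. The function name and the /sys/fs/cgroup root are
assumptions for illustration, not something taken from the Mesos source:

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos); assumes cgroup v1.
# mount_cgroup_cmds SUBSYSTEM [ROOT]: print, without running them, the
# commands that would mount SUBSYSTEM as its own hierarchy under ROOT.
# ROOT itself is assumed to exist already (commonly a tmpfs mounted at
# /sys/fs/cgroup).
mount_cgroup_cmds() {
    root="${2:-/sys/fs/cgroup}"
    echo "mkdir -p $root/$1"
    echo "mount -t cgroup -o $1 cgroup $root/$1"
}
```

Calling `mount_cgroup_cmds cpu` and `mount_cgroup_cmds memory` prints the
commands for the two subsystems behind the isolation flags used in this
thread. Once the output looks right for the host it can be piped to `sh`
as root.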

Feel free to file a ticket to capture this!

Hope this helps,
Ben
