Posted to issues@mesos.apache.org by "Andreas Chalupa (JIRA)" <ji...@apache.org> on 2015/10/14 16:24:05 UTC
[jira] [Created] (MESOS-3730) Docker containers won't start on a set of mesos slaves
Andreas Chalupa created MESOS-3730:
--------------------------------------
Summary: Docker containers won't start on a set of mesos slaves
Key: MESOS-3730
URL: https://issues.apache.org/jira/browse/MESOS-3730
Project: Mesos
Issue Type: Bug
Components: slave
Affects Versions: 0.25.0
Environment: CentOS 7
Reporter: Andreas Chalupa
We have 3 nodes that we've designated to run 'data' containers. These are stateful containers that share a volume with the slave host machine. On two different test beds we have now seen these slaves get into a state where they can't start any containers. The stderr of the containers shows this error:
mesos-docker-executor: /tmp/mesos-build/mesos-repo/3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:110: T& Option<T>::get() [with T = std::basic_string<char>]: Assertion `isSome()' failed.
*** Aborted at 1444832114 (unix time) try "date -d @1444832114" if you are using GNU date ***
PC: @ 0x7fc02694a5d7 __GI_raise
*** SIGABRT (@0x4a7b) received by PID 19067 (TID 0x7fc02913b8c0) from PID 19067; stack trace: ***
@ 0x7fc027504130 (unknown)
@ 0x7fc02694a5d7 __GI_raise
@ 0x7fc02694bcc8 __GI_abort
@ 0x7fc026943546 __assert_fail_base
@ 0x7fc0269435f2 __GI___assert_fail
@ 0x4166b2 Option<>::get()
@ 0x417725 main
@ 0x7fc026936af5 __libc_start_main
@ 0x417875 (unknown)
I don't know how to interpret this error or what it might mean.
Once a slave is in this state it stays broken and no containers can be relaunched on it. I don't think it clears up until we reboot the host machine (perhaps restarting just the slave or Docker would be enough?)
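A note on reading the trace: the assertion text says that get() was called on an Option<std::string> that held no value, and the frames show that call coming directly from main of mesos-docker-executor. The sketch below is a simplified Python stand-in for that behaviour, not stout's actual code. The class and its method names (Option, is_some, get, get_or_else) are hypothetical, and it makes no claim about which value was missing in this case.

```python
class Option:
    """Simplified stand-in for an optional value (hypothetical, not stout's code)."""

    _NONE = object()  # sentinel meaning "no value was set"

    def __init__(self, value=_NONE):
        self._value = value

    def is_some(self):
        # True only when a value was supplied at construction.
        return self._value is not Option._NONE

    def get(self):
        # Mirrors the failed check in the trace: get() requires a value.
        assert self.is_some(), "isSome()"
        return self._value

    def get_or_else(self, default):
        # Safe access: fall back to a default instead of asserting.
        return self._value if self.is_some() else default


if __name__ == "__main__":
    present = Option("some-value")
    missing = Option()

    print(present.get())               # value is set, so get() succeeds
    print(missing.get_or_else("n/a"))  # safe access with a fallback

    try:
        missing.get()                  # no value set: the assertion fails
    except AssertionError as err:
        print("Assertion failed:", err)
```

Read this way, the abort suggests the executor expected some string value to be set at startup and it was not. Which value that was cannot be determined from the trace alone.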