Posted to issues@mesos.apache.org by "Andreas Chalupa (JIRA)" <ji...@apache.org> on 2015/10/14 16:30:05 UTC
[jira] [Updated] (MESOS-3730) Docker containers won't start on a set of mesos slaves
[ https://issues.apache.org/jira/browse/MESOS-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Chalupa updated MESOS-3730:
-----------------------------------
Attachment: slaveLogs.zip
Logs from one of the slaves that exhibits the problem
> Docker containers won't start on a set of mesos slaves
> ------------------------------------------------------
>
> Key: MESOS-3730
> URL: https://issues.apache.org/jira/browse/MESOS-3730
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Affects Versions: 0.25.0
> Environment: CentOS 7
> Reporter: Andreas Chalupa
> Attachments: slaveLogs.zip
>
>
> We have 3 nodes that we've designated to run 'data' containers. These are stateful containers that share a volume with the slave host machine. On two different test beds we have now seen these slaves get into a state where they can't start any containers. The stderr of the containers shows this error:
> mesos-docker-executor: /tmp/mesos-build/mesos-repo/3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:110: T& Option<T>::get() [with T = std::basic_string<char>]: Assertion `isSome()' failed.
> *** Aborted at 1444832114 (unix time) try "date -d @1444832114" if you are using GNU date ***
> PC: @ 0x7fc02694a5d7 __GI_raise
> *** SIGABRT (@0x4a7b) received by PID 19067 (TID 0x7fc02913b8c0) from PID 19067; stack trace: ***
> @ 0x7fc027504130 (unknown)
> @ 0x7fc02694a5d7 __GI_raise
> @ 0x7fc02694bcc8 __GI_abort
> @ 0x7fc026943546 __assert_fail_base
> @ 0x7fc0269435f2 __GI___assert_fail
> @ 0x4166b2 Option<>::get()
> @ 0x417725 main
> @ 0x7fc026936af5 __libc_start_main
> @ 0x417875 (unknown)
> I don't know how to interpret this error or what it might mean.
> Once a slave is in this state, it stays broken and no containers can be reloaded. I don't think it clears up until we reboot the host machine (perhaps restarting just the slave or Docker would be enough?)
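For readers trying to interpret the assertion: the trace shows main in mesos-docker-executor calling Option<>::get() with T = std::basic_string<char>, and the `isSome()' assertion failing, which suggests a string value that main expected to be set was absent. Which value was missing cannot be determined from the trace alone. The sketch below is a simplified, hypothetical stand-in for an optional type, not the actual stout header; the names `valueOr`, `lookup`, and the keys used in the usage note are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for an Option<T> type.
// This is NOT the real stout/option.hpp; it only illustrates the failure mode.
template <typename T>
class Option {
public:
  static Option<T> none() { return Option<T>(); }
  static Option<T> some(const T& t) { return Option<T>(t); }

  bool isSome() const { return has_; }
  bool isNone() const { return !has_; }

  // Same kind of check as in the log: calling get() on an empty Option
  // trips the assertion and the process aborts with SIGABRT
  // (unless assertions are compiled out with NDEBUG).
  const T& get() const {
    assert(isSome());
    return value_;
  }

  // Hypothetical helper: return the stored value, or a fallback when empty.
  T valueOr(const T& fallback) const { return has_ ? value_ : fallback; }

private:
  Option() : has_(false), value_() {}
  explicit Option(const T& t) : has_(true), value_(t) {}

  bool has_;
  T value_;
};

// Hypothetical lookup: returns none() when the key is absent, so the caller
// has to check isSome() before calling get().
Option<std::string> lookup(const std::map<std::string, std::string>& values,
                           const std::string& key) {
  std::map<std::string, std::string>::const_iterator it = values.find(key);
  if (it == values.end()) {
    return Option<std::string>::none();
  }
  return Option<std::string>::some(it->second);
}
```

With a map that contains only a made-up key "docker", lookup(values, "docker").get() returns the stored string, while lookup(values, "container").get() would abort in the same way as the trace above. Checking isSome() first, or using a fallback such as valueOr, avoids the abort.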
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)