Posted to issues@mesos.apache.org by "Avinash Sridharan (JIRA)" <ji...@apache.org> on 2016/04/04 22:22:25 UTC

[jira] [Commented] (MESOS-5113) `network/cni` isolator crashes when launched without the --network_cni_plugins_dir flag

    [ https://issues.apache.org/jira/browse/MESOS-5113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224975#comment-15224975 ] 

Avinash Sridharan commented on MESOS-5113:
------------------------------------------

We are seeing this crash because the `MesosContainerizer` calls `recover` on all isolators on startup. The `recover` method uses `rootDir` without verifying whether it is set. Neither `rootDir` nor `pluginDir` is set when the operator does not specify the --network_cni_plugins_dir flag, so we need to check the values of these "optional" variables before using them in the `network/cni` isolator.
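
A rough sketch of the kind of check meant here, not the actual patch: the `Option` class below is a minimal stand-in for stout's `Option<T>` (only `isSome()`, `isNone()` and `get()`), and `CniIsolatorSketch`, its `recover()` signature and the bool return value are hypothetical names made up for illustration. The idea is simply to return early from `recover` when the optional members were never set, so `get()` is not reached with an empty `Option`.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Minimal stand-in for stout's Option<T>; an assumption for this sketch,
// not the real class. Only the members used below are modeled.
template <typename T>
class Option
{
public:
  Option() : some(false), value() {}
  Option(const T& t) : some(true), value(t) {}

  bool isSome() const { return some; }
  bool isNone() const { return !some; }

  const T& get() const
  {
    // Same kind of assertion as the `isSome()` failure reported in the
    // issue's backtrace.
    assert(isSome());
    return value;
  }

private:
  bool some;
  T value;
};

// Hypothetical stand-in for the isolator; the real `recover` takes the
// container states and orphans and returns a Future.
struct CniIsolatorSketch
{
  Option<std::string> rootDir;
  Option<std::string> pluginDir;

  // Returns false if recovery was skipped because the isolator was never
  // configured, true otherwise.
  bool recover() const
  {
    if (rootDir.isNone() || pluginDir.isNone()) {
      // The operator did not pass the flag, so there is nothing to recover.
      return false;
    }

    // Safe: both values were checked above.
    std::cout << "Recovering from " << rootDir.get()
              << " with plugins in " << pluginDir.get() << std::endl;
    return true;
  }
};
```

With a guard like this, an isolator created without the flag would skip recovery instead of aborting in `Option::get()`.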

> `network/cni` isolator crashes when launched without the --network_cni_plugins_dir flag
> ---------------------------------------------------------------------------------------
>
>                 Key: MESOS-5113
>                 URL: https://issues.apache.org/jira/browse/MESOS-5113
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>         Environment: linux
>            Reporter: Avinash Sridharan
>            Assignee: Avinash Sridharan
>
> If we start the agent with --isolation='network/cni' but do not specify the --network_cni_plugins_dir flag, the agent crashes with the following stack dump:
> 0x00007ffff2324cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> 56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> (gdb) bt
> #0  0x00007ffff2324cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x00007ffff23280d8 in __GI_abort () at abort.c:89
> #2  0x00007ffff231db86 in __assert_fail_base (fmt=0x7ffff246e830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x451f5c "isSome()",
>     file=file@entry=0x451f65 "../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp", line=line@entry=111,
>     function=function@entry=0x45294a "const T &Option<std::basic_string<char> >::get() const & [T = std::basic_string<char>]") at assert.c:92
> #3  0x00007ffff231dc32 in __GI___assert_fail (assertion=0x451f5c "isSome()", file=0x451f65 "../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp", line=111,
>     function=0x45294a "const T &Option<std::basic_string<char> >::get() const & [T = std::basic_string<char>]") at assert.c:101
> #4  0x0000000000432c0d in Option<std::string>::get() const & (this=0x6c1ea8) at ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:111
> Python Exception <class 'IndexError'> list index out of range:
> #5  0x00007ffff63ef7cc in mesos::internal::slave::NetworkCniIsolatorProcess::recover (this=0x6c1e70, states=empty std::list, orphans=...) at ../../src/slave/containerizer/mesos/isolators/network/cni/cni.cpp:331
> #6  0x00007ffff60cddd8 in operator() (this=0x7fffc0001e00, process=0x6c1ef8) at ../../3rdparty/libprocess/include/process/dispatch.hpp:239
> #7  0x00007ffff60cd972 in std::_Function_handler<void (process::ProcessBase*), process::Future<Nothing> process::dispatch<Nothing, mesos::internal::slave::MesosIsolatorProcess, std::list<mesos::slave::ContainerState, std::allocator<mesos::slave::ContainerState> > const&, hashset<mesos::ContainerID, std::hash<mesos::ContainerID>, std::equal_to<mesos::ContainerID> > const&, std::list<mesos::slave::ContainerState, std::allocator<mesos::slave::ContainerState> >, hashset<mesos::ContainerID, std::hash<mesos::ContainerID>, std::equal_to<mesos::ContainerID> > >(process::PID<mesos::internal::slave::MesosIsolatorProcess> const&, process::Future<Nothing> (mesos::internal::slave::MesosIsolatorProcess::*)(std::list<mesos::slave::ContainerState, std::allocator<mesos::slave::ContainerState> > const&, hashset<mesos::ContainerID, std::hash<mesos::ContainerID>, std::equal_to<mesos::ContainerID> > const&), std::list<mesos::slave::ContainerState, std::allocator<mesos::slave::ContainerState> >, hashset<mesos::ContainerID, std::hash<mesos::ContainerID>, std::equal_to<mesos::ContainerID> >)::{lambda(process::ProcessBase*)#1}>::_M_invoke(std::_Any_data const&, process::ProcessBase*) (__functor=..., __args=0x6c1ef8) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/functional:2071
> #8  0x00007ffff6a6bf38 in std::function<void (process::ProcessBase*)>::operator()(process::ProcessBase*) const (this=0x7fffc0001d70, __args=0x6c1ef8)
>     at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/functional:2471
> #9  0x00007ffff6a561b4 in process::ProcessBase::visit (this=0x6c1ef8, event=...) at ../../../3rdparty/libprocess/src/process.cpp:3130
> #10 0x00007ffff6aac5fe in process::DispatchEvent::visit (this=0x7fffc0001570, visitor=0x6c1ef8) at ../../../3rdparty/libprocess/include/process/event.hpp:161
> #11 0x00007ffff55e9c91 in process::ProcessBase::serve (this=0x6c1ef8, event=...) at ../../3rdparty/libprocess/include/process/process.hpp:82
> #12 0x00007ffff6a53ed4 in process::ProcessManager::resume (this=0x67cca0, process=0x6c1ef8) at ../../../3rdparty/libprocess/src/process.cpp:2570
> #13 0x00007ffff6a5bff5 in operator() (this=0x697d70, joining=...) at ../../../3rdparty/libprocess/src/process.cpp:2218
> #14 0x00007ffff6a5bf33 in std::_Bind<process::ProcessManager::init_threads()::$_1 (std::reference_wrapper<std::atomic_bool const>)>::__call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) (this=0x697d70,
>     __args=<unknown type in /home/vagrant/mesosphere/mesos/build/src/.libs/libmesos-0.29.0.so, CU 0x45bb552, DIE 0x469efe5>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/functional:1295
> #15 0x00007ffff6a5bee6 in std::_Bind<process::ProcessManager::init_threads()::$_1 (std::reference_wrapper<std::atomic_bool const>)>::operator()<, void>() (this=0x697d70)
>     at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/functional:1353
> #16 0x00007ffff6a5be95 in std::_Bind_simple<std::_Bind<process::ProcessManager::init_threads()::$_1 (std::reference_wrapper<std::atomic_bool const>)> ()>::_M_invoke<>(std::_Index_tuple<>) (this=0x697d70)
>     at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/functional:1731
> #17 0x00007ffff6a5be65 in std::_Bind_simple<std::_Bind<process::ProcessManager::init_threads()::$_1 (std::reference_wrapper<std::atomic_bool const>)> ()>::operator()() (this=0x697d70)
>     at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/functional:1720
> #18 0x00007ffff6a5be3c in std::thread::_Impl<std::_Bind_simple<std::_Bind<process::ProcessManager::init_threads()::$_1 (std::reference_wrapper<std::atomic_bool const>)> ()> >::_M_run() (this=0x697d58)
>     at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/thread:115
> #19 0x00007ffff2b98a60 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #20 0x00007ffff26bb182 in start_thread (arg=0x7fffeb92d700) at pthread_create.c:312
> #21 0x00007ffff23e847d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> (gdb) frame 4
> #4  0x0000000000432c0d in Option<std::string>::get() const & (this=0x6c1ea8) at ../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp:111



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)