Posted to issues@mesos.apache.org by "Yan Xu (JIRA)" <ji...@apache.org> on 2017/09/01 20:28:00 UTC

[jira] [Commented] (MESOS-7921) process::EventQueue sometimes crashes

    [ https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151096#comment-16151096 ] 

Yan Xu commented on MESOS-7921:
-------------------------------

New failure on ASF CI: https://lists.apache.org/thread.html/bf6cacef549f0822814914b32e281a55ce32a02232bef5070cce512c@%3Cbuilds.mesos.apache.org%3E

The crash is similar to the one posted in the JIRA description.
{noformat:title=}
*** Aborted at 1504241455 (unix time) try "date -d @1504241455" if you are using GNU date ***
I0901 04:50:55.571101 779 registrar.cpp:424] Successfully recovered registrar
I0901 04:50:55.571496 779 master.cpp:1804] Recovered 0 agents from the registry (129B); allowing 10mins for agents to re-register
I0901 04:50:55.571521 793 hierarchical.cpp:209] Skipping recovery of hierarchical allocator: nothing to recover
PC: @ 0x2b4f0af34c80 process::EventQueue::Consumer::empty()
*** SIGSEGV (@0x8) received by PID 773 (TID 0x2b4f17caa700) from PID 8; stack trace: ***
@ 0x2b4f0f452330 (unknown)
@ 0x2b4f0af34c80 process::EventQueue::Consumer::empty()
@ 0x2b4f0af18c20 process::ProcessManager::resume()
@ 0x2b4f0af27a71 process::ProcessManager::init_threads()::$_9::operator()()
@ 0x2b4f0af279b5 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
@ 0x2b4f0af27985 std::_Bind_simple<>::operator()()
@ 0x2b4f0af2795c std::thread::_Impl<>::_M_run()
@ 0x2b4f0f711a60 (unknown)
@ 0x2b4f0f44a184 start_thread
@ 0x2b4f0ff7dffd (unknown)
{noformat}

> process::EventQueue sometimes crashes
> -------------------------------------
>
>                 Key: MESOS-7921
>                 URL: https://issues.apache.org/jira/browse/MESOS-7921
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>    Affects Versions: 1.4.0
>         Environment: autotools,gcc,--verbose,GLOG_v=1 MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details: https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
>            Reporter: Yan Xu
>            Priority: Blocker
>         Attachments: FetcherCacheTest.CachedCustomOutputFileWithSubdirectory.log.txt, MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt
>
>
> The following segfault was found on [ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/] in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}}, but it is flaky and also shows up in other tests and environments (with or without --enable-lock-free-event-queue).
> {noformat: title=Configuration}
> ./bootstrap '&&' ./configure --verbose '&&' make -j6 distcheck
> {noformat}
> {noformat:title=}
> *** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are using GNU date ***
> PC: @     0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> *** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack trace: ***
>     @     0x2b9e29d26330 (unknown)
>     @     0x2b9e2581caa0 process::EventQueue::Consumer::empty()
>     @     0x2b9e25800a40 process::ProcessManager::resume()
>     @     0x2b9e2580f891 process::ProcessManager::init_threads()::$_9::operator()()
>     @     0x2b9e2580f7d5 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
>     @     0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
>     @     0x2b9e2580f77c std::thread::_Impl<>::_M_run()
>     @     0x2b9e29fe5a60 (unknown)
>     @     0x2b9e29d1e184 start_thread
>     @     0x2b9e2a851ffd (unknown)
> make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
> {noformat}
> A builds@mesos.apache.org query shows many such instances: https://lists.apache.org/list.html?builds@mesos.apache.org:lte=1M:process%3A%3AEventQueue%3A%3AConsumer%3A%3Aempty


