Posted to issues@mesos.apache.org by "Andrew Ruef (JIRA)" <ji...@apache.org> on 2018/06/23 13:39:00 UTC

[jira] [Created] (MESOS-9024) Mesos master segfaults with stack overflow under load

Andrew Ruef created MESOS-9024:
----------------------------------

             Summary: Mesos master segfaults with stack overflow under load
                 Key: MESOS-9024
                 URL: https://issues.apache.org/jira/browse/MESOS-9024
             Project: Mesos
          Issue Type: Bug
          Components: libprocess, master
    Affects Versions: 1.6.0
         Environment: Ubuntu 16.04.4 
            Reporter: Andrew Ruef
         Attachments: stack.txt.gz

When running Mesos in non-HA mode on a small cluster under load, the master reliably segfaults due to some state it has worked itself into. The segfault appears to be a stack overflow: the call stack of the crashing thread is 72662 frames deep. The root of the stack appears to be in libprocess.

I've attached a gzip-compressed stack backtrace, since the uncompressed backtrace is too large to attach to this issue. The crash happens fairly reliably when I run jobs, but it can take many hours or days for Mesos to work itself back into this state.
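
With a backtrace this deep, it may help to count which functions repeat most often, since the recursing frames should dominate the totals. The following is only a rough sketch, not part of the original report: it assumes the attachment contains gdb-style frame lines (for example "#12 0x... in ns::func (args) at file.cpp:34"), and the helper names (frame_functions, summarize, summarize_gzip) are hypothetical. The regex is deliberately simple and will truncate names that contain " (" inside template arguments.

```python
import gzip
import re
from collections import Counter

# ASSUMPTION: frames look like gdb output, e.g.
#   "#12 0x00007f... in ns::func (args) at file.cpp:34"
# The address part is optional; the name is taken up to the first " (".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(.+?)\s+\(")


def frame_functions(lines):
    """Yield the function name from each line that looks like a frame."""
    for line in lines:
        match = FRAME_RE.match(line)
        if match:
            yield match.group(2)


def summarize(lines, top=10):
    """Return (total frame count, list of the most frequent function names)."""
    counts = Counter(frame_functions(lines))
    return sum(counts.values()), counts.most_common(top)


def summarize_gzip(path, top=10):
    """Summarize a gzip-compressed backtrace file without unpacking it to disk."""
    with gzip.open(path, "rt", errors="replace") as handle:
        return summarize(handle, top)
```

Calling summarize_gzip("stack.txt.gz") would return the total number of frames found and the ten most frequent function names, which should point at the cycle in libprocess if the format assumption holds.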


