Posted to issues@mesos.apache.org by "Andrei Budnik (JIRA)" <ji...@apache.org> on 2018/06/23 13:54:00 UTC

[jira] [Commented] (MESOS-9024) Mesos master segfaults with stack overflow under load

    [ https://issues.apache.org/jira/browse/MESOS-9024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16521101#comment-16521101 ] 

Andrei Budnik commented on MESOS-9024:
--------------------------------------

Could you please add the repeating part of the stack trace to the description?

> Mesos master segfaults with stack overflow under load
> -----------------------------------------------------
>
>                 Key: MESOS-9024
>                 URL: https://issues.apache.org/jira/browse/MESOS-9024
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess, master
>    Affects Versions: 1.6.0
>         Environment: Ubuntu 16.04.4 
>            Reporter: Andrew Ruef
>            Priority: Major
>         Attachments: stack.txt.gz
>
>
> When Mesos runs in non-HA mode on a small cluster under load, the master reliably segfaults because of some state it has worked itself into. The segfault appears to be a stack overflow: the call stack in the crashing thread has 72662 frames. The root of the stack appears to be in libprocess.
> I've attached a gzip-compressed stack backtrace, since the uncompressed backtrace is too large to attach to this issue. The crash happens fairly reliably when running jobs, but it can take many hours or days for Mesos to work itself back into this state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)