Posted to user@mesos.apache.org by Jord Sonneveld <jo...@moz.com> on 2015/06/24 20:30:45 UTC

oom kills lead up to cpu lockup

This issue has been harassing our cluster.  Any thoughts would be
appreciated.

I've attached the syslog showing the issue.  I've cut out several sections
of oom kills of node processes; however, I've left in the last two.
Previous oom kill messages are similar.

These oom kills occur before the system locks up, as can be seen in the
attached log message.

What on earth is going on?  https://issues.apache.org/jira/browse/MESOS-662
describes kernel hangs due to OOM situations.  I'm not saying this is
happening again; I am wondering if there is another issue that is causing
a lockup.

Thoughts?

Re: oom kills lead up to cpu lockup

Posted by zhou weitao <zh...@gmail.com>.
Thanks for sharing.

2015-06-26 4:40 GMT+08:00 Jord Sonneveld <jo...@moz.com>:

> Upgrading kernel to 3.19.0-21-generic seems to have resolved this.
>
> On Wed, Jun 24, 2015 at 12:06 PM, Jord Sonneveld <jo...@moz.com> wrote:
>
>> Could it be:
>> http://blog.nitrous.io/2014/03/10/stability-and-a-linux-oom-killer-bug.html
>>
>
>

Re: oom kills lead up to cpu lockup

Posted by Jord Sonneveld <jo...@moz.com>.
Upgrading kernel to 3.19.0-21-generic seems to have resolved this.

On Wed, Jun 24, 2015 at 12:06 PM, Jord Sonneveld <jo...@moz.com> wrote:

> Could it be:
> http://blog.nitrous.io/2014/03/10/stability-and-a-linux-oom-killer-bug.html
>

Re: oom kills lead up to cpu lockup

Posted by Jord Sonneveld <jo...@moz.com>.
Could it be:
http://blog.nitrous.io/2014/03/10/stability-and-a-linux-oom-killer-bug.html