Posted to issues@mesos.apache.org by "Ian Downes (JIRA)" <ji...@apache.org> on 2014/11/14 00:25:34 UTC

[jira] [Created] (MESOS-2105) Reliably report OOM even if the executor exits normally

Ian Downes created MESOS-2105:
---------------------------------

             Summary: Reliably report OOM even if the executor exits normally
                 Key: MESOS-2105
                 URL: https://issues.apache.org/jira/browse/MESOS-2105
             Project: Mesos
          Issue Type: Improvement
          Components: isolation
    Affects Versions: 0.20.0
            Reporter: Ian Downes


Container OOMs are asynchronously reported by the kernel and the following sequence can occur:
1) Container OOMs
2) Kernel chooses to kill the task
3) Executor notices, reports TASK_FAILED, then exits
4) MesosContainerizer sees executor exit, *doesn't check for an OOM*, and destroys the container
5) Memory isolator may or may not have seen the OOM event but the container is destroyed anyway.

The task is reported as failed, but the report does not include the cause (the OOM).

Suggestion: always check whether an OOM has occurred, even if the executor exits normally.
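
A minimal sketch of the suggested ordering, written in Python for brevity rather than as Mesos code. Every name here (Termination, on_executor_exit, oom_occurred, destroy) is hypothetical and is not part of the Mesos API. It only illustrates asking about an OOM before the container is destroyed, whatever the executor's exit status was. How the real memory isolator would answer that question reliably, given that the kernel reports OOMs asynchronously, is left open.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Termination:
    """Hypothetical record of why a container ended (not a Mesos type)."""
    exit_status: int
    reason: Optional[str] = None


def on_executor_exit(
    container_id: str,
    exit_status: int,
    oom_occurred: Callable[[str], bool],  # assumed hook into the memory isolator
    destroy: Callable[[str], None],       # assumed hook that tears the container down
) -> Termination:
    """Handle an executor exit, checking for an OOM before destroying."""
    # Check unconditionally: a zero exit status does not rule out an OOM,
    # because the executor may have reported TASK_FAILED and exited cleanly.
    reason = "Memory limit exceeded" if oom_occurred(container_id) else None

    # Destroy only after the check, so the OOM state is still available.
    destroy(container_id)

    return Termination(exit_status=exit_status, reason=reason)
```

With this ordering, a container whose executor exited with status 0 after the kernel killed its task would still carry the OOM reason in its termination record.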



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)