Posted to issues@mesos.apache.org by "Ian Downes (JIRA)" <ji...@apache.org> on 2014/11/14 00:25:34 UTC
[jira] [Created] (MESOS-2105) Reliably report OOM even if the executor exits normally
Ian Downes created MESOS-2105:
---------------------------------
Summary: Reliably report OOM even if the executor exits normally
Key: MESOS-2105
URL: https://issues.apache.org/jira/browse/MESOS-2105
Project: Mesos
Issue Type: Improvement
Components: isolation
Affects Versions: 0.20.0
Reporter: Ian Downes
Container OOMs are reported asynchronously by the kernel, so the following sequence can occur:
1) Container OOMs
2) Kernel chooses to kill the task
3) Executor notices, reports TASK_FAILED, then exits
4) MesosContainerizer sees executor exit, *doesn't check for an OOM*, and destroys the container
5) Memory isolator may or may not have seen the OOM event but the container is destroyed anyway.
The task is reported as failed, but the report does not include the cause (the OOM).
Suggestion: always check whether an OOM has occurred, even if the executor exits normally.
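
The suggested behaviour can be sketched roughly as follows. This is an illustrative Python model only, not Mesos code: `Termination`, `on_executor_exit`, and the `oom_occurred` callback are hypothetical names, and the callback stands in for whatever check the memory isolator would perform. The point is that the OOM check runs on every exit path, including exit status 0.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Termination:
    """Hypothetical record of why a container ended."""
    exit_status: int
    killed_by_oom: bool
    message: str


def on_executor_exit(exit_status: int,
                     oom_occurred: Callable[[], bool]) -> Termination:
    """Build the termination report when the executor exits.

    `oom_occurred` is an assumed callback standing in for the memory
    isolator's check. It is consulted unconditionally, before the
    container is destroyed, so a normal exit cannot hide an OOM.
    """
    oom = oom_occurred()  # always check, even if exit_status == 0
    if oom:
        message = "Container exceeded its memory limit (OOM)"
    elif exit_status == 0:
        message = "Executor exited normally"
    else:
        message = f"Executor exited with status {exit_status}"
    return Termination(exit_status, oom, message)


if __name__ == "__main__":
    # Steps 1-3 above: the task was OOM-killed, yet the executor exits with 0.
    print(on_executor_exit(0, lambda: True).message)
```

In this sketch the cause is attached to the report regardless of the executor's exit status, which is the gap described in steps 4 and 5.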