Posted to issues@mesos.apache.org by "Anand Mazumdar (JIRA)" <ji...@apache.org> on 2017/01/10 21:34:58 UTC

[jira] [Commented] (MESOS-6848) The default executor does not exit if a single task pod fails.

    [ https://issues.apache.org/jira/browse/MESOS-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15816257#comment-15816257 ] 

Anand Mazumdar commented on MESOS-6848:
---------------------------------------

Keeping the issue open to backport the fix to the 1.1.x branch.

{noformat}
commit 3efcd33f440c7e56c137bfb7cd953ee35e4b3aa5
Author: Anand Mazumdar <an...@apache.org>
Date:   Tue Jan 10 13:08:03 2017 -0800

    Fixed a bug in the default executor around not committing suicide.

    This bug is only observed when the task group contains a single task.
    The default executor was not committing suicide when this single task
    used to exit with a non-zero status code as per the default restart
    policy.

    Review: https://reviews.apache.org/r/55157/
{noformat}

> The default executor does not exit if a single task pod fails.
> --------------------------------------------------------------
>
>                 Key: MESOS-6848
>                 URL: https://issues.apache.org/jira/browse/MESOS-6848
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Anand Mazumdar
>            Assignee: Anand Mazumdar
>            Priority: Blocker
>
> If a task group has a single task and it exits with a non-zero exit code, the default executor does not commit suicide.
> This mostly happens because we invoke {{shutdown()}} in {{waited()}} when we notice the termination of a single container here: https://github.com/apache/mesos/blob/master/src/launcher/default_executor.cpp#L666
> but then return early after executing all the kill calls here: https://github.com/apache/mesos/blob/master/src/launcher/default_executor.cpp#L751
> However, when there is just one task in the task group, this never results in {{__shutdown}} being called, so the executor never commits suicide.
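
The control flow described above can be sketched with a small, simplified model. This is not the Mesos source: the class {{ExecutorModel}}, the {{guarded}} flag, the {{terminated()}} accessor and the name {{finalShutdown}} (standing in for {{__shutdown}}) are all hypothetical, and the model assumes that {{__shutdown}} is only reached from {{waited()}} once a container killed by {{shutdown()}} has terminated. The guard shown is one conceivable remedy and is not necessarily what the committed fix does.

```cpp
#include <set>
#include <string>

// Hypothetical, simplified model of the shutdown path described in the issue.
// Names loosely mirror the issue text; the bodies are illustrative only.
class ExecutorModel {
public:
  ExecutorModel(const std::set<std::string>& containers, bool guarded)
    : containers_(containers), guarded_(guarded) {}

  // Invoked when a container terminates (stands in for waited()).
  void waited(const std::string& id, int exitStatus) {
    containers_.erase(id);

    if (shuttingDown_) {
      // A kill issued by shutdown() has completed.
      if (containers_.empty()) {
        finalShutdown();
      }
      return;
    }

    // Assumed policy: a non-zero exit takes down the whole task group.
    if (exitStatus != 0) {
      shutdown();
    }
  }

  bool terminated() const { return terminated_; }

private:
  // Stands in for shutdown(): kill whatever is still running.
  void shutdown() {
    shuttingDown_ = true;

    // Hypothetical guard: nothing left to kill, so finish immediately.
    if (guarded_ && containers_.empty()) {
      finalShutdown();
      return;
    }

    // Each kill is modeled as an immediate termination callback.
    // With a single task this loop has nothing to iterate over, so no
    // further waited() call occurs and finalShutdown() is never reached.
    const std::set<std::string> remaining = containers_;
    for (const std::string& id : remaining) {
      waited(id, 9);
    }
  }

  // Stands in for __shutdown(): the executor terminates itself.
  void finalShutdown() { terminated_ = true; }

  std::set<std::string> containers_;
  bool guarded_;
  bool shuttingDown_ = false;
  bool terminated_ = false;
};
```

In this model, constructing {{ExecutorModel}} with one container and {{guarded}} set to false, then calling {{waited()}} with a non-zero status, leaves {{terminated()}} false. That matches the behavior reported here. With two containers, or with the guard enabled, {{terminated()}} becomes true.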



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)