Posted to issues@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2017/04/18 22:25:41 UTC

[jira] [Commented] (MESOS-3420) Resolve shutdown semantics for Machine/Down

    [ https://issues.apache.org/jira/browse/MESOS-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973631#comment-15973631 ] 

Vinod Kone commented on MESOS-3420:
-----------------------------------

cc [~anandmazumdar]

> Resolve shutdown semantics for Machine/Down
> -------------------------------------------
>
>                 Key: MESOS-3420
>                 URL: https://issues.apache.org/jira/browse/MESOS-3420
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Joris Van Remoortere
>              Labels: maintenance, mesosphere
>
> When an operator uses the {{machine/down}} endpoint, the master sends a shutdown message to the agent.
> We need to discuss and resolve the semantics we want for how operators and frameworks learn when their tasks are terminated.
> One option is to explicitly remove the agent from the master, which would send the {{TASK_LOST}} updates and {{SlaveLostMessage}} directly from the master. The concern is that during a network partition, or if the agent was down at the time, these tasks could still be running.
> This is a general problem related to task lifetimes being dissociated from the lifetime of the agent.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)