Posted to issues@mesos.apache.org by "Cody Maloney (JIRA)" <ji...@apache.org> on 2016/08/24 21:04:20 UTC

[jira] [Commented] (MESOS-6078) Add a agent teardown endpoint

    [ https://issues.apache.org/jira/browse/MESOS-6078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435683#comment-15435683 ] 

Cody Maloney commented on MESOS-6078:
-------------------------------------

For reference on the API for this: it needs to be simple enough to trigger from a button in a web UI (i.e. a single, simple HTTP request).

> Add a agent teardown endpoint
> -----------------------------
>
>                 Key: MESOS-6078
>                 URL: https://issues.apache.org/jira/browse/MESOS-6078
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>    Affects Versions: 1.0.0, 1.0.1
>            Reporter: Cody Maloney
>            Assignee: Michael Park
>              Labels: mesosphere
>
> Currently, when a whole agent machine is unexpectedly terminated for good (e.g. AWS terminates the instance without warning), it has to go through the Mesos slave removal rate limit before it's gone.
> If a couple of agents, or a whole rack, go down in a cluster of thousands of agents, this can become a problem.
> If the agent can be shut down "cleanly", everything would get rescheduled, but once the agent is gone, there is currently no good way for an administrator to indicate that the node is gone, that its tasks are lost, and that they should be rescheduled (if appropriate) as soon as possible.
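
To put a rough number on the rate-limit concern above, here is a minimal sketch that estimates how long it could take for a batch of dead agents to be removed one by one. It assumes the limit is written in a "(number of agents)/(duration)" form such as 1/10mins; the function names, the supported units, and the 40-agent rack are illustrative assumptions, not anything taken from Mesos itself.

```python
import re

# Seconds per unit; only a few duration suffixes are handled in this sketch.
_UNIT_SECONDS = {"secs": 1, "mins": 60, "hrs": 3600}


def seconds_per_removal(rate_limit: str) -> float:
    """Average seconds between removals for a limit like '1/10mins'."""
    match = re.fullmatch(r"(\d+)/(\d+)(secs|mins|hrs)", rate_limit)
    if match is None:
        raise ValueError(f"unrecognized rate limit: {rate_limit!r}")
    count, amount, unit = match.groups()
    if int(count) == 0:
        raise ValueError("rate limit must allow at least one removal")
    return int(amount) * _UNIT_SECONDS[unit] / int(count)


def drain_time_seconds(lost_agents: int, rate_limit: str) -> float:
    """Rough estimate of the time to remove `lost_agents` dead agents."""
    return lost_agents * seconds_per_removal(rate_limit)


if __name__ == "__main__":
    # Hypothetical example: a 40-agent rack lost, limit of 1 agent per 10 minutes.
    hours = drain_time_seconds(40, "1/10mins") / 3600
    print(f"about {hours:.1f} hours until the last agent is removed")
```

Under these assumptions the tasks on the last agent of the rack would wait several hours before being marked lost, which is the delay an explicit teardown request would avoid.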



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)