Posted to issues@mesos.apache.org by "Adam B (JIRA)" <ji...@apache.org> on 2017/04/14 08:25:41 UTC

[jira] [Updated] (MESOS-6828) Consider ways for frameworks to ignore offers with an Unavailability

     [ https://issues.apache.org/jira/browse/MESOS-6828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6828:
--------------------------
    Labels: maintenance mesosphere  (was: maintenance)

> Consider ways for frameworks to ignore offers with an Unavailability
> --------------------------------------------------------------------
>
>                 Key: MESOS-6828
>                 URL: https://issues.apache.org/jira/browse/MESOS-6828
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Joris Van Remoortere
>            Assignee: Artem Harutyunyan
>              Labels: maintenance, mesosphere
>
> Because maintenance primitives in Mesos are opt-in, cluster administrators are left with a gap whenever frameworks have not opted in.
> An example case:
> - Cluster with reasonable churn (tasks terminate naturally)
> - Operator specifies maintenance schedule
> Ideally, *even* in a world where none of the frameworks had opted in to maintenance primitives, the operator would have some way of preventing frameworks from scheduling further work on agents in the schedule. The natural termination of the tasks in the cluster would then allow the nodes to drain gracefully, after which the operator could perform maintenance.
> Two options have been discussed so far:
> # Provide a capability for frameworks to automatically filter offers with an {{Unavailability}} set.
> #* Pro: Finer-grained control. Allows other frameworks to keep scheduling short-lived tasks that can complete before the Unavailability.
> #* Con: All frameworks have to be updated. Consider exposing this as an environment variable read by the scheduler driver, so that legacy frameworks can use it.
> # Provide a flag on the master to filter all offers with an {{Unavailability}} set.
> #* Pro: Immediately actionable / usable.
> #* Con: Coarse-grained. Some frameworks may lose efficiency.
> #* Con: *Dangerous*: planning out a multi-day maintenance schedule for an entire cluster will prevent any frameworks from scheduling further work, potentially stalling the cluster.
> Action Items: Provide further context for each option and consider others. We need to ensure we have something immediately consumable by users to fill the gap until maintenance primitives are the norm. We also need to ensure we prevent dangerous scenarios like the Con listed for option #2.
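
As an illustration of option 1 above, the following is a minimal sketch of how a framework-side filter might behave. It is not Mesos code: the {{Offer}} and {{Unavailability}} classes are simplified stand-ins for the corresponding protobuf messages, and {{split_offers}}, {{now_ns}} and {{task_runtime_ns}} are hypothetical names introduced only for this example. It assumes an unavailability is described by a start time in nanoseconds, and that a framework may optionally state how long its tasks run so that short-lived work can still land on an agent before maintenance begins.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Unavailability:
    """Simplified stand-in for a maintenance window (not the Mesos protobuf)."""
    start_ns: int                      # start of the window, ns since epoch
    duration_ns: Optional[int] = None  # None means the window is open-ended


@dataclass
class Offer:
    """Simplified stand-in for a resource offer (not the Mesos protobuf)."""
    offer_id: str
    unavailability: Optional[Unavailability] = None


def split_offers(
    offers: List[Offer],
    now_ns: int,
    task_runtime_ns: Optional[int] = None,
) -> Tuple[List[Offer], List[Offer]]:
    """Partition offers into (usable, declined).

    Hypothetical helper. With task_runtime_ns=None, every offer that carries
    an unavailability is declined. With a runtime given, an offer is kept
    only if a task started now would finish before the window starts.
    """
    usable: List[Offer] = []
    declined: List[Offer] = []
    for offer in offers:
        window = offer.unavailability
        if window is None:
            # No maintenance scheduled for this agent.
            usable.append(offer)
        elif task_runtime_ns is not None and now_ns + task_runtime_ns <= window.start_ns:
            # Short-lived task completes before maintenance begins.
            usable.append(offer)
        else:
            declined.append(offer)
    return usable, declined
```

Calling {{split_offers(offers, now_ns)}} without a runtime gives the coarse behaviour (decline anything with an unavailability), while passing {{task_runtime_ns}} gives the finer-grained behaviour listed as the Pro of option 1. A master-side flag as in option 2 would apply the coarse rule to every framework at once, which is where the stalling risk comes from.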



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)