Posted to issues@mesos.apache.org by "Huadong Liu (JIRA)" <ji...@apache.org> on 2016/09/21 21:47:20 UTC

[jira] [Created] (MESOS-6221) Ability to post maintenance/schedule with better granularity

Huadong Liu created MESOS-6221:
----------------------------------

Summary: Ability to post maintenance/schedule with better granularity
Key: MESOS-6221
URL: https://issues.apache.org/jira/browse/MESOS-6221
Project: Mesos
Issue Type: Improvement
Components: HTTP API
Reporter: Huadong Liu

Currently, the maintenance schedule can only be updated at cluster granularity: "To update the maintenance schedule, the operator should first read the current schedule, make any necessary changes, and then post the modified schedule." http://mesos.apache.org/documentation/latest/maintenance/
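
To make the read/modify/post cycle concrete, here is a minimal sketch of the "make any necessary changes" step only. The helper name add_window and the sample hosts are made up for illustration, and the JSON field names (windows, machine_ids, unavailability) are my reading of the maintenance documentation, so treat them as assumptions. The GET and POST calls themselves are left out.

```python
import copy


def add_window(schedule, hostname, ip, start_ns, duration_ns):
    """Return a copy of `schedule` with one extra maintenance window.

    Hypothetical helper: the caller is still expected to GET the whole
    schedule first and POST the whole result back afterwards.
    """
    updated = copy.deepcopy(schedule)
    updated.setdefault("windows", []).append({
        "machine_ids": [{"hostname": hostname, "ip": ip}],
        "unavailability": {
            "start": {"nanoseconds": start_ns},
            "duration": {"nanoseconds": duration_ns},
        },
    })
    return updated


# Schedule as it might look after a GET (sample data, not real output).
current = {
    "windows": [{
        "machine_ids": [{"hostname": "host-a", "ip": "10.0.0.1"}],
        "unavailability": {
            "start": {"nanoseconds": 1474495200000000000},
            "duration": {"nanoseconds": 3600000000000},
        },
    }]
}

# The body that would then be POSTed back in full.
updated = add_window(current, "host-b", "10.0.0.2",
                     1474498800000000000, 3600000000000)
```

Note that the posted body carries every window the operator saw at read time, not just the new one, which is what makes the update cluster-wide.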

In contrast, the machine/down and machine/up endpoints operate at host granularity. A single host or a set of hosts can be moved to DOWN mode or UP mode once the schedule exists.

Requiring a GET of the current schedule before POSTing an updated schedule can create races when machine/up and maintenance/schedule updates are issued from different hosts or processes. For example:

1. The Mesos master has host A in maintenance DOWN mode.
2. Process p1 tries to bring host A UP.
3. Process p2 gets the current schedule (possibly before p1's request has taken effect) and then posts it back with host B appended.
4. The Mesos master may end up with both A and B in maintenance DRAIN mode, although the desired result is to have only B in DRAIN mode.
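
The interleaving above can be reproduced with a toy in-memory model. This is only a sketch: ToyMaster and its methods are hypothetical stand-ins and not the Mesos API, and the model assumes that machine/up removes the host from the schedule and that a posted schedule replaces the old one wholesale.

```python
class ToyMaster:
    """Toy stand-in for the master's maintenance state (not Mesos code)."""

    def __init__(self, down=()):
        self.down = set(down)        # hosts currently in DOWN mode
        self.schedule = set(down)    # hosts listed in the schedule

    def get_schedule(self):
        return set(self.schedule)    # snapshot, as a GET would return

    def post_schedule(self, hosts):
        self.schedule = set(hosts)   # whole schedule is replaced

    def machine_up(self, host):
        self.down.discard(host)
        self.schedule.discard(host)  # assumption: UP drops the host

    def draining(self):
        # Scheduled but not DOWN is treated as DRAIN in this model.
        return self.schedule - self.down


# Racy interleaving: p2 reads before p1's UP, posts after it.
racy = ToyMaster(down={"A"})
snapshot = racy.get_schedule()         # p2: GET (still contains A)
racy.machine_up("A")                   # p1: UP host A
racy.post_schedule(snapshot | {"B"})   # p2: POST stale schedule plus B
racy_result = sorted(racy.draining())  # ['A', 'B']

# Serialized order: p1 finishes before p2 reads.
serial = ToyMaster(down={"A"})
serial.machine_up("A")
serial.post_schedule(serial.get_schedule() | {"B"})
serial_result = sorted(serial.draining())  # ['B']
```

In this model the only difference between the two runs is when p2 takes its snapshot, which is exactly the window that a host-granularity update would close.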

I cannot find a document that explains why the maintenance schedule has to be updated at cluster granularity. Although the problem can be worked around with external synchronization, the ability to update the maintenance schedule at host granularity seems like a better choice.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)