
Posted to issues@mesos.apache.org by "Aaron Bell (JIRA)" <ji...@apache.org> on 2015/10/27 00:26:27 UTC

[jira] [Commented] (MESOS-3374) Improve High Availability documentation

    [ https://issues.apache.org/jira/browse/MESOS-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975332#comment-14975332 ] 

Aaron Bell commented on MESOS-3374:
-----------------------------------

We likely want to link the HA docs to http://mesos.apache.org/documentation/latest/operational-guide/, which provides the other half of the story (it explains the fail-fast behaviour and the need for supervisors).

> Improve High Availability documentation
> ---------------------------------------
>
>                 Key: MESOS-3374
>                 URL: https://issues.apache.org/jira/browse/MESOS-3374
>             Project: Mesos
>          Issue Type: Documentation
>          Components: documentation
>    Affects Versions: 0.23.0
>            Reporter: Aaron Bell
>            Priority: Minor
>
> This [Call Me Maybe article|https://aphyr.com/posts/326-call-me-maybe-chronos] used the Jepsen tool to evaluate Chronos running on Mesos. It uncovered bug MESOS-3280.
> Action: Improve documentation at http://mesos.apache.org/documentation/latest/high-availability to include 'good practice' patterns or recommendations.
> For example:
> - We RECOMMEND running ZooKeeper co-located with Mesos masters.
> -- This reduces the set of network partitions to worry about.
> -- This means you’re going to have 1 ZK node for every Mesos master.
> -- This is NOT technically required. If you have a different ZK deployment architecture you're free to use it.
> - _More ideas_
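
As a rough illustration of the co-location recommendation above, here is a minimal Python sketch that derives the `--zk` and `--quorum` flags for `mesos-master` when each master host also runs a ZooKeeper node. The host names, the default ZooKeeper port 2181, the `/mesos` znode path, and the helper function names are assumptions made for the example and are not taken from this issue.

```python
def zk_url(master_hosts, zk_port=2181, znode="/mesos"):
    """Build a zk:// URL, assuming one ZooKeeper node runs on every master host."""
    if not master_hosts:
        raise ValueError("need at least one master host")
    servers = ",".join(f"{host}:{zk_port}" for host in master_hosts)
    return f"zk://{servers}{znode}"


def quorum_size(num_masters):
    """Smallest strict majority of the masters."""
    return num_masters // 2 + 1


def master_flags(master_hosts):
    """Flags to pass to every mesos-master in the (hypothetical) cluster."""
    return [
        f"--zk={zk_url(master_hosts)}",
        f"--quorum={quorum_size(len(master_hosts))}",
    ]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    print(" ".join(master_flags(["master1", "master2", "master3"])))
```

With co-location, the same host list drives both the ZooKeeper ensemble and the master quorum, so the two stay in step when a master is added or removed.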


