Posted to dev@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2013/11/12 23:13:19 UTC

[jira] [Updated] (MESOS-770) Rate control and randomization of Replicated Log catching-up

     [ https://issues.apache.org/jira/browse/MESOS-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-770:
----------------------------------

    Component/s: replicated log

> Rate control and randomization of Replicated Log catching-up
> ------------------------------------------------------------
>
>                 Key: MESOS-770
>                 URL: https://issues.apache.org/jira/browse/MESOS-770
>             Project: Mesos
>          Issue Type: Improvement
>          Components: replicated log
>            Reporter: Yan Xu
>
> When the log is catching up, either while recovering or after a coordinator failover, the Paxos protocol is run on multiple positions (possibly the entire log) concurrently. Too much concurrency can negatively affect the network, and the problem may be exacerbated by contention between multiple recovering replicas and the coordinator.
> Rate control limits the number of positions on which a proposer (recoverer or coordinator) seeks consensus at any one time. We can process a fixed-size batch of positions each time.
> Randomly picking the positions in each batch reduces the chance that multiple proposers contend for the same position at the same time, which causes conflicts and retries.
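
A minimal sketch of the batching and randomization idea described above, written in Python for brevity rather than in Mesos's own C++. All names here (randomized_batches, catch_up, run_paxos, batch_size) are hypothetical and do not come from the Mesos code base; the sketch only shows how positions might be shuffled and then handed out in bounded batches, not how the replicated log actually implements it.

```python
import random
from typing import Callable, Iterable, Iterator, List, Optional


def randomized_batches(
    positions: Iterable[int],
    batch_size: int,
    rng: Optional[random.Random] = None,
) -> Iterator[List[int]]:
    """Yield every position exactly once, in random order,
    at most batch_size positions per batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if rng is None:
        rng = random.Random()
    shuffled = list(positions)
    # Each proposer shuffles independently, so two proposers are
    # unlikely to work on the same position at the same moment.
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]


def catch_up(
    positions: Iterable[int],
    batch_size: int,
    run_paxos: Callable[[int], None],
    rng: Optional[random.Random] = None,
) -> None:
    """Drive a hypothetical run_paxos callback one batch at a time.

    Assumption: a real proposer would run the positions inside a batch
    concurrently and wait for the whole batch before starting the next.
    This sketch runs them sequentially to stay self-contained.
    """
    for batch in randomized_batches(positions, batch_size, rng):
        for position in batch:
            run_paxos(position)
```

With this shape, batch_size is the rate-control knob: it bounds how many positions a single proposer has in flight, while the per-proposer shuffle spreads different proposers across different positions.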



--
This message was sent by Atlassian JIRA
(v6.1#6144)