Posted to dev@zookeeper.apache.org by "Flavio Junqueira (JIRA)" <ji...@apache.org> on 2014/03/30 14:15:16 UTC

[jira] [Commented] (ZOOKEEPER-1814) Reduction of waiting time during Fast Leader Election

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954660#comment-13954660 ] 

Flavio Junqueira commented on ZOOKEEPER-1814:
---------------------------------------------

It is fine if you want to make this configurable, but I'd say that we need to do a better job of explaining the impact of changing it. For example, we could start by saying that servers keep sending batches of messages until they are able to make progress with leader election. We increase the time between batches so that we don't send messages unnecessarily, and we keep increasing it until it reaches a threshold, which is what this parameter sets.

Actually, the choice of 60s was somewhat arbitrary. Is there a reason for having a different threshold?
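
To make the effect of the threshold concrete, here is a minimal Java sketch of a capped doubling schedule like the one described in the comment above. It is not the actual ZooKeeper code: the class and method names are hypothetical, and the 200 ms starting interval is an assumption chosen for illustration. Only the 60 s cap comes from this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are hypothetical, not ZooKeeper's API.
class NotificationBackoffSketch {

    // Double the current wait, but never exceed the configured cap.
    // The multiplication is done in long so doubling cannot overflow int.
    static int nextInterval(int currentMs, int maxMs) {
        long doubled = 2L * currentMs;
        return (int) Math.min(doubled, maxMs);
    }

    // Waits (in ms) between successive batches, starting from initialMs.
    static List<Integer> schedule(int initialMs, int maxMs, int rounds) {
        List<Integer> waits = new ArrayList<>();
        int wait = initialMs;
        for (int i = 0; i < rounds; i++) {
            waits.add(wait);
            wait = nextInterval(wait, maxMs);
        }
        return waits;
    }

    public static void main(String[] args) {
        // Assumed 200 ms start with the 60 s cap discussed in this issue.
        System.out.println(schedule(200, 60_000, 11));
        // Same start with a hypothetical lower cap of 5 s.
        System.out.println(schedule(200, 5_000, 11));
    }
}
```

Under these assumptions, the wait between batches reaches the 60 s cap after nine doublings, so a server that has backed off that far may wait up to a minute before it sends again. A lower cap bounds that wait, at the cost of sending messages more often while no progress is possible.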

> Reduction of waiting time during Fast Leader Election
> -----------------------------------------------------
>
>                 Key: ZOOKEEPER-1814
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1814
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.4.5, 3.5.0
>            Reporter: Daniel Peon
>            Assignee: Daniel Peon
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch, ZOOKEEPER-1814.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Fast leader election can take a long time because of the exponential backoff, which is currently capped at 60 seconds.
> It would be useful to make this parameter configurable, for example for a server shutdown.
> Otherwise, election sometimes takes too long, and a test failure has been detected when executing org.apache.zookeeper.server.quorum.QuorumPeerMainTest.
> This test case waits up to 30 seconds, which is less than the 60 seconds that leader election can spend waiting at the moment of shutting down.
> Because of this test failure, the issue was filed as a possible bug.


