Posted to dev@zookeeper.apache.org by "Flavio Junqueira (JIRA)" <ji...@apache.org> on 2012/06/20 15:58:42 UTC

[jira] [Commented] (ZOOKEEPER-1492) leader cannot switch to LOOKING state when lost the majority

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397517#comment-13397517 ] 

Flavio Junqueira commented on ZOOKEEPER-1492:
---------------------------------------------

Thanks for reporting this issue. I actually think that this has been pointed out in a different form here: ZOOKEEPER-1113.
                
> leader cannot switch to LOOKING state when lost the majority
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1492
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1492
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.4.3
>         Environment: eclipse linux
>            Reporter: gaoxiao
>            Priority: Critical
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When a follower leaves the cluster and the remaining voting members can no longer form a majority, the leader should leave the LEADING state and enter the LOOKING state. If the ensemble contains observers, however, the leader stays in LEADING and clients cannot use the cluster.
> For example, with this server configuration:
> server.1=z1:2888:3888
> server.2=z2:2888:3888
> server.3=z3:2888:3888:observer
> Initially servers 1, 2, and 3 are all running and everything works; server 2 is the leader. If server 1 is then stopped, server 2 does not leave the LEADING state, and clients cannot connect to the cluster.
> I think the problem is in the lead() method of Leader.java, lines 388-407:
>     syncedSet.add(self.getId());
>     synchronized (learners) {
>         for (LearnerHandler f : learners) {
>             if (f.synced()) {
>                 syncedCount++;
>                 syncedSet.add(f.getSid());
>             }
>             f.ping();
>         }
>     }
>     if (!tickSkip && !self.getQuorumVerifier().containsQuorum(syncedSet)) {
>     //if (!tickSkip && syncedCount < self.quorumPeers.size() / 2) {
>         // Lost quorum, shutdown
>         // TODO: message is wrong unless majority quorums used
>         shutdown("Only " + syncedCount + " followers, need "
>                 + (self.getVotingView().size() / 2));
>         // make sure the order is the same!
>         // the leader goes to looking
>         return;
>     }
> The code adds every synced learner, observers included, to syncedSet. I think only followers should be added to syncedSet at this point, so that the 'containsQuorum' method can correctly determine whether a majority is still present.
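
The effect described in the report can be illustrated with a small standalone sketch. It is not ZooKeeper code: QuorumSketch, containsMajority, and votersOnly are hypothetical names, and the sketch assumes a majority check that only compares the size of the synced set against half the number of voting members. Under that assumption, counting the observer (server 3) hides the loss of follower 1, while restricting the set to the voting view exposes it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; these names are not part of ZooKeeper.
public class QuorumSketch {

    /** Assumed size-only majority test: more than half of the voters are counted. */
    public static boolean containsMajority(Set<Long> synced, int votingMembers) {
        return synced.size() > votingMembers / 2;
    }

    /** Returns a copy of the synced ids restricted to the voting view. */
    public static Set<Long> votersOnly(Set<Long> synced, Set<Long> votingView) {
        Set<Long> result = new HashSet<>(synced);
        result.retainAll(votingView);
        return result;
    }

    public static void main(String[] args) {
        // Servers 1 and 2 vote; server 3 is an observer.
        Set<Long> votingView = new HashSet<>(Arrays.asList(1L, 2L));
        // Follower 1 has stopped: only leader 2 and observer 3 are synced.
        Set<Long> synced = new HashSet<>(Arrays.asList(2L, 3L));

        // Observer counted: 2 > 1, so the quorum loss goes unnoticed.
        System.out.println(containsMajority(synced, votingView.size()));

        // Observer filtered out: 1 > 1 is false, so the quorum loss is detected.
        System.out.println(
                containsMajority(votersOnly(synced, votingView), votingView.size()));
    }
}
```

Whether the filtering belongs in the leader loop or inside the quorum verifier is a design choice this sketch does not settle; it only shows why the two synced sets give different answers.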
