Posted to dev@zookeeper.apache.org by lvfangmin <gi...@git.apache.org> on 2018/12/02 05:18:04 UTC

[GitHub] zookeeper pull request #703: [ZOOKEEPER-1818] Correctly handle potential inc...

Github user lvfangmin commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/703#discussion_r238085234
  
    --- Diff: zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java ---
    @@ -1989,6 +1989,38 @@ private boolean updateVote(long designatedLeader, long zxid){
         /**
          * Updates leader election info to avoid inconsistencies when
          * a new server tries to join the ensemble.
    +     *
    +     * Here is the inconsistency scenario we try to solve by updating the peer 
    +     * epoch after following leader:
    +     *
    +     * Let's say we have an ensemble with 3 servers z1, z2 and z3.
    +     *
    +     * 1. z1 and z2 were following z3 with peer epoch 0xb8; the new epoch is
    +     *    0xb9, i.e. the current accepted epoch on disk.
    +     * 2. z2 gets restarted and uses 0xb9 as its peer epoch after loading
    +     *    the current accepted epoch from disk.
    +     * 3. z2 receives notifications from z1 and z3, which are following z3
    +     *    with epoch 0xb8, so it starts following z3 again with peer epoch 0xb8.
    +     * 4. Before z2 successfully connects to z3, z3 gets restarted with the new
    +     *    epoch 0xb9.
    +     * 5. z2 retries for a few rounds (default 5s) before giving up;
    +     *    meanwhile it reports z3 as leader.
    +     * 6. z1 restarts and starts looking with peer epoch 0xb9.
    +     * 7. z1 votes for z3, and z3 is elected leader again with peer epoch 0xb9.
    +     * 8. z2 successfully connects to z3 before giving up, but with peer
    +     *    epoch 0xb8.
    +     * 9. z1 gets restarted and looks for a leader with peer epoch 0xba, but
    +     *    cannot join, because z2 is reporting peer epoch 0xb8 while z3 is
    +     *    reporting 0xb9.
    +     *
    +     * By updating the election vote after actually following leader, we can 
    --- End diff --
    
    It aligns with the function name, which is about updating the vote, although we only update the electionEpoch here.
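
To make step 9 of the scenario in the Javadoc concrete, here is a minimal sketch of the mismatch a LOOKING server would see. It is a toy model, not ZooKeeper's code: `EpochSketch`, `ReportedVote`, `consistent` and `afterFollowing` are hypothetical names, and real leader election checks quorum agreement rather than the simple equality used here. The sketch only assumes that a follower which refreshes its reported peer epoch after actually connecting to the leader no longer contradicts the leader.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model; none of these names exist in ZooKeeper.
final class EpochSketch {

    /** The (leader, peerEpoch) pair a server reports to a peer that is LOOKING. */
    static final class ReportedVote {
        final String leader;
        final long peerEpoch;

        ReportedVote(String leader, long peerEpoch) {
            this.leader = leader;
            this.peerEpoch = peerEpoch;
        }
    }

    /** True when all reporting servers name the same leader with the same peer epoch. */
    static boolean consistent(Map<String, ReportedVote> reports) {
        ReportedVote first = null;
        for (ReportedVote v : reports.values()) {
            if (first == null) {
                first = v;
            } else if (!first.leader.equals(v.leader) || first.peerEpoch != v.peerEpoch) {
                return false;
            }
        }
        return true;
    }

    /** Assumed fix: after really following the leader, report the leader's epoch. */
    static ReportedVote afterFollowing(ReportedVote stale, long leaderEpoch) {
        return new ReportedVote(stale.leader, leaderEpoch);
    }

    public static void main(String[] args) {
        Map<String, ReportedVote> reports = new LinkedHashMap<>();
        reports.put("z3", new ReportedVote("z3", 0xb9L)); // leader after its restart
        reports.put("z2", new ReportedVote("z3", 0xb8L)); // stale epoch from step 3

        // Step 9: z1 sees two different peer epochs for the same leader.
        System.out.println(consistent(reports)); // prints "false"

        // With the assumed refresh, z2 agrees with z3 and z1 can join.
        reports.put("z2", afterFollowing(reports.get("z2"), 0xb9L));
        System.out.println(consistent(reports)); // prints "true"
    }
}
```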


---