Posted to dev@zookeeper.apache.org by lvfangmin <gi...@git.apache.org> on 2018/12/02 05:18:04 UTC
[GitHub] zookeeper pull request #703: [ZOOKEEPER-1818] Correctly handle potential inc...
Github user lvfangmin commented on a diff in the pull request:
https://github.com/apache/zookeeper/pull/703#discussion_r238085234
--- Diff: zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java ---
@@ -1989,6 +1989,38 @@ private boolean updateVote(long designatedLeader, long zxid){
/**
* Updates leader election info to avoid inconsistencies when
* a new server tries to join the ensemble.
+ *
+ * Here is the inconsistency scenario we try to solve by updating the peer
+ * epoch after following the leader:
+ *
+ * Let's say we have an ensemble with 3 servers z1, z2 and z3.
+ *
+ * 1. z1 and z2 were following z3 with peerEpoch 0xb8; the new epoch is
+ * 0xb9, i.e. the current accepted epoch on disk.
+ * 2. z2 gets restarted and uses 0xb9 as its peer epoch after loading
+ * the current accepted epoch from disk.
+ * 3. z2 receives notifications from z1 and z3, which both report z3 as
+ * leader with epoch 0xb8, so it starts following z3 again with peer epoch 0xb8.
+ * 4. Before z2 successfully connects to z3, z3 gets restarted with the new
+ * epoch 0xb9.
+ * 5. z2 retries for a few rounds (default 5s) before giving up;
+ * meanwhile it reports z3 as leader.
+ * 6. z1 restarts and starts looking with peer epoch 0xb9.
+ * 7. z1 votes for z3, and z3 is elected leader again with peer epoch 0xb9.
+ * 8. z2 successfully connects to z3 before giving up, but with peer
+ * epoch 0xb8.
+ * 9. z1 gets restarted and looks for a leader with peer epoch 0xba, but cannot
+ * join, because z2 is reporting peer epoch 0xb8 while z3 is reporting
+ * 0xb9.
+ *
+ * By updating the election vote after actually following leader, we can
--- End diff --
It aligns with the function name, which is about updating the vote, although we only update the electionEpoch here.
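
For readers following the scenario in the quoted Javadoc, here is a minimal sketch of the idea of refreshing a vote's peer epoch once a server is actually following the leader. It is not the code from the pull request: SketchVote, SketchPeer, withPeerEpoch and onFollowingLeader are hypothetical names made up for illustration, and the real Vote and QuorumPeer classes in ZooKeeper carry more state than is shown here.

```java
// Simplified stand-in for a leader-election vote.
// Assumption: this is NOT ZooKeeper's actual Vote class.
final class SketchVote {
    final long leaderId;
    final long zxid;
    final long electionEpoch;
    final long peerEpoch;

    SketchVote(long leaderId, long zxid, long electionEpoch, long peerEpoch) {
        this.leaderId = leaderId;
        this.zxid = zxid;
        this.electionEpoch = electionEpoch;
        this.peerEpoch = peerEpoch;
    }

    // Returns a copy of this vote carrying the given peer epoch;
    // every other field is left unchanged.
    SketchVote withPeerEpoch(long newPeerEpoch) {
        return new SketchVote(leaderId, zxid, electionEpoch, newPeerEpoch);
    }
}

// Simplified stand-in for a quorum peer that reports its current vote.
// Assumption: this is NOT ZooKeeper's actual QuorumPeer class.
final class SketchPeer {
    private volatile SketchVote currentVote;

    SketchPeer(SketchVote initialVote) {
        this.currentVote = initialVote;
    }

    SketchVote getCurrentVote() {
        return currentVote;
    }

    // Hypothetical hook: called after the peer has connected to the leader
    // and learned the epoch the leader is actually running with.
    void onFollowingLeader(long leaderEpoch) {
        SketchVote vote = currentVote;
        if (vote != null) {
            // Replace the stale peer epoch so that later notifications
            // sent to other servers agree with the leader's epoch.
            currentVote = vote.withPeerEpoch(leaderEpoch);
        }
    }
}
```

In terms of step 8 of the scenario, z2 would start with a vote for z3 at peer epoch 0xb8; calling onFollowingLeader(0xb9L) after the connection succeeds would leave z2 reporting 0xb9, the same epoch z3 reports.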
---