Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2020/09/04 02:56:52 UTC

[GitHub] [zookeeper] fregatte123 commented on a change in pull request #1445: ZOOKEEPER-3911: Data inconsistency caused by DIFF sync uncommitted log

fregatte123 commented on a change in pull request #1445:
URL: https://github.com/apache/zookeeper/pull/1445#discussion_r483357169



##########
File path: zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java
##########
@@ -741,18 +741,30 @@ protected void syncWithLeader(long newLeaderZxid) throws Exception {
                     }
 
                     self.setCurrentEpoch(newEpoch);
-                    writeToTxnLog = true; //Anything after this needs to go to the transaction log, not applied directly in memory
+                    writeToTxnLog = true;
+                    //Anything after this needs to go to the transaction log, not applied directly in memory
                     isPreZAB1_0 = false;
+
+                    // ZOOKEEPER-3911: make sure sync the uncommitted logs before commit them (ACK NEWLEADER).
+                    sock.setSoTimeout(self.tickTime * self.syncLimit);
+                    self.setSyncMode(QuorumPeer.SyncMode.NONE);
+                    zk.startup();

Review comment:
       @hanm 
   
   According to the code, an inconsistency may occur as follows:
   
   step1: Assume there are 4 followers and 1 leader, and that their current maximum zxids are as follows:
   follower A: zxid = n
   follower B: zxid = n+1
   follower C: zxid = n+2
   follower D: zxid = n+3
   leader E: zxid = n+4
   
   step2: Leader E synchronizes follower D to close the gap between them, for example with a DIFF sync.
   step3: follower D receives the leader’s proposal, puts it in memory, and then calls zk.startup(). From this point on, a client (such as a webapp) can read from follower D and see zxid = n+3.
   
   step4: follower D writes the in-memory proposal to disk.
   
   step5: follower D calls writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), true), which sends the ACK-LD message (ACK-LD as named in the paper [1]) to the leader. Follower D is then stopped.
   
   step6: Leader E is stopped before ACK-LD has reached a quorum. (Had the leader received ACK-LD from a quorum, the proposals with zxid <= n+4 would have been persisted on a quorum of servers. The proposal with zxid = n+4 would then have reached consensus in the cluster, no matter whether the leader goes down, a re-election takes place, or the leader's disk is damaged and it cannot be restarted.)
   
   step7: The cluster elects a new leader from the 3 surviving machines:
    A: zxid = n, B: zxid = n+1, C: zxid = n+2
   Suppose that C becomes the leader (3 of the 5 machines survive, which is enough for a quorum).
   After A and B have synchronized with C:
    A: zxid = n+2, B: zxid = n+2, C: zxid = n+2
   
   Conclusion: zxid = n+3, which clients could already read from follower D, is no longer accessible anywhere in the cluster, so clients observe inconsistent data.
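
   To make steps 1 to 7 easier to follow, here is a minimal, self-contained sketch of the scenario. It is not ZooKeeper code: the class DiffSyncScenario, the Peer type, and the methods electAndSync and readIsLost are hypothetical names introduced only for illustration, and a peer is reduced to the highest zxid it holds.

```java
import java.util.ArrayList;
import java.util.List;

public class DiffSyncScenario {

    /** Hypothetical stand-in for a quorum peer; it only tracks its highest zxid. */
    static final class Peer {
        final String name;
        long lastZxid;
        boolean alive = true;

        Peer(String name, long lastZxid) {
            this.name = name;
            this.lastZxid = lastZxid;
        }
    }

    /**
     * Simplified election plus sync (an assumption of this sketch): the surviving
     * peer with the highest zxid wins and every other survivor catches up to it.
     * Returns the zxid that all survivors hold afterwards.
     */
    static long electAndSync(List<Peer> ensemble) {
        int alive = 0;
        long max = Long.MIN_VALUE;
        for (Peer p : ensemble) {
            if (p.alive) {
                alive++;
                max = Math.max(max, p.lastZxid);
            }
        }
        if (alive <= ensemble.size() / 2) {
            throw new IllegalStateException("no quorum among surviving peers");
        }
        for (Peer p : ensemble) {
            if (p.alive) {
                p.lastZxid = max;
            }
        }
        return max;
    }

    /** Replays steps 1 to 7 and reports whether a zxid served by D has been lost. */
    static boolean readIsLost(long n) {
        // step1: four followers and one leader with increasing zxids.
        Peer a = new Peer("A", n);
        Peer b = new Peer("B", n + 1);
        Peer c = new Peer("C", n + 2);
        Peer d = new Peer("D", n + 3);
        Peer e = new Peer("E", n + 4);
        List<Peer> ensemble = new ArrayList<>(List.of(a, b, c, d, e));

        // step2 and step3: D syncs with E and starts serving clients before
        // a quorum has acknowledged the new leader. As in step3, the client
        // is assumed to see zxid = n+3 on D.
        long seenByClient = d.lastZxid;

        // step4 and step5: D persists the proposal, sends its ACK, then stops.
        d.alive = false;

        // step6: E stops before the ACK has reached a quorum.
        e.alive = false;

        // step7: A, B and C form a quorum; C wins and the others sync to it.
        long surviving = electAndSync(ensemble);

        return seenByClient > surviving;
    }

    public static void main(String[] args) {
        System.out.println("read lost: " + readIsLost(100));
    }
}
```

   With n = 100 the survivors converge on zxid 102 while the client has already seen 103 on follower D, so readIsLost returns true.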
   
   It is relatively safe for the follower to start serving clients only after the leader has received ACK-LD from a quorum. If the follower serves clients before then, it cannot guarantee that its zxid has been acknowledged by a quorum.
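
   As a rough illustration of that ordering, the sketch below only reports that serving is safe once a majority of distinct servers has acknowledged the new leader. The class NewLeaderAckGate and its methods are hypothetical and are not part of ZooKeeper or of this pull request. How the follower would learn that the quorum was reached is left out.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical gate: serving is allowed only after a quorum of ACKs. */
public class NewLeaderAckGate {
    private final int ensembleSize;
    // Ids of servers whose ACK for the new leader has been recorded.
    private final Set<Long> ackedServerIds = new HashSet<>();

    public NewLeaderAckGate(int ensembleSize) {
        if (ensembleSize <= 0) {
            throw new IllegalArgumentException("ensembleSize must be positive");
        }
        this.ensembleSize = ensembleSize;
    }

    /** Records an ACK; repeated ACKs from the same server count once. */
    public synchronized void recordAck(long serverId) {
        ackedServerIds.add(serverId);
    }

    /** True once a strict majority of the ensemble has acknowledged. */
    public synchronized boolean safeToServe() {
        return ackedServerIds.size() > ensembleSize / 2;
    }

    public static void main(String[] args) {
        NewLeaderAckGate gate = new NewLeaderAckGate(5);
        gate.recordAck(1L);
        gate.recordAck(2L);
        System.out.println("after 2 acks: " + gate.safeToServe());
        gate.recordAck(3L);
        System.out.println("after 3 acks: " + gate.safeToServe());
    }
}
```

   In a 5-server ensemble the gate stays closed after two ACKs and opens after the third, which is the point at which the scenario above can no longer lose zxid = n+3.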
   
   
   [1] Zab: High-performance broadcast for primary-backup systems



