Posted to dev@zookeeper.apache.org by Vishal Kathuria <vi...@fb.com> on 2011/08/11 20:31:57 UTC

Question on Leader/Follower sync and a potential issue

Hi Flavio/Ben,
I was reviewing the ZK code and there is one case that doesn't seem to be handled correctly. I might be reading the code wrong or it may be a bug, so I wanted to run it by you folks.

Initial Condition

1.       Let's say there are three nodes in the ensemble, A, B, and C, with A being the leader.

2.       The current epoch is 7.

3.       For simplicity of the example, let's say a zxid is a two-digit number, with the epoch being the first digit.

4.       The zxid is 73

5.       All the nodes have seen the change 73 and have persistently logged it.
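
To make the toy numbering concrete, here is a small sketch of the two-digit zxid used in this example. The helper names are made up for illustration; real ZooKeeper zxids are 64-bit values with the epoch in the high 32 bits and a counter in the low 32 bits.

```python
def make_zxid(epoch: int, counter: int) -> int:
    """Toy two-digit zxid: the first digit is the epoch, the second the counter."""
    if not (0 <= epoch <= 9 and 0 <= counter <= 9):
        raise ValueError("toy zxids use a single digit each for epoch and counter")
    return epoch * 10 + counter


def epoch_of(zxid: int) -> int:
    """Epoch part (first digit) of a toy zxid."""
    return zxid // 10


def counter_of(zxid: int) -> int:
    """Counter part (second digit) of a toy zxid."""
    return zxid % 10


if __name__ == "__main__":
    last_seen = make_zxid(7, 3)
    print(last_seen, epoch_of(last_seen), counter_of(last_seen))
```

With this encoding, 73 and 74 both belong to epoch 7, while 80 and 81 belong to epoch 8.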

Step 1
A request with zxid 74 is issued. The leader A writes it to its log, but the entire ensemble crashes before B and C write change 74 to their logs.

Step 3
B and C restart; A is still down.
B and C form the quorum.
B is the new leader. Let's say B's minCommitLog is 71 and its maxCommitLog is 73.
The epoch is now 8, and the zxid is 80.
A request with zxid 81 succeeds. On B, minCommitLog is still 71 and maxCommitLog is now 81.

Step 4
A starts up. It applies the change from the request with zxid 74 to its in-memory data tree.
A contacts B to registerAsFollower and provides 74 as its zxid.
Since 71 <= 74 <= 81, B decides to send A a diff. B will send proposal 81 to A.
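
The decision in Step 4 can be modeled with a toy function. This is only a sketch of the range check as I read it, not the actual leader code: the function name, the "DIFF"/"TRUNC"/"SNAP" labels, and the assumption that B's committed log holds exactly 71, 72, 73, 81 are all mine.

```python
def sync_by_range(peer_zxid, committed_log):
    """Toy model of a range-only sync decision.

    committed_log is the leader's sorted, non-empty list of committed zxids.
    Returns a tuple (action, payload).
    """
    min_commit_log = committed_log[0]
    max_commit_log = committed_log[-1]
    if min_commit_log <= peer_zxid <= max_commit_log:
        # Only the range is checked, not whether peer_zxid is in the log.
        return ("DIFF", [z for z in committed_log if z > peer_zxid])
    if peer_zxid > max_commit_log:
        return ("TRUNC", max_commit_log)
    return ("SNAP", None)


if __name__ == "__main__":
    b_log = [71, 72, 73, 81]  # assumed contents of B's committed log
    print(sync_by_range(74, b_log))         # A rejoins after 81 was committed
    print(sync_by_range(74, [71, 72, 73]))  # A rejoins before any new commit
```

In this model, A reporting 74 against a log ending at 81 gets only the diff containing 81, while the same 74 against a log ending at 73 gets a truncate to 73.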


Problem:
The problem with the above sequence is that A's data tree still contains the update from request 74, which B and C never logged and which is not part of the new leader's history. Before getting proposal 81, A should have received a trunc to 73, but I don't see that in the code. If maxCommitLog on B had stayed at 73 instead of being bumped to 81, the case seems to be handled fine.
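
One way to picture the missing check is the toy sketch below. It is an assumption about what a fix could look like, not the existing code or a proposed patch: the function name and the step labels are hypothetical, and it assumes the leader can look up whether the follower's zxid actually appears in its committed log.

```python
from bisect import bisect_right


def sync_with_membership_check(peer_zxid, committed_log):
    """Toy sync decision that truncates a peer whose zxid the leader never logged.

    committed_log is the leader's sorted, non-empty list of committed zxids.
    Returns a list of (action, payload) steps to send, in order.
    """
    if peer_zxid < committed_log[0]:
        return [("SNAP", None)]
    # Index of the first logged zxid strictly greater than peer_zxid.
    idx = bisect_right(committed_log, peer_zxid)
    last_common = committed_log[idx - 1]
    steps = []
    if last_common != peer_zxid:
        # The peer holds a zxid the leader does not have: roll it back first.
        steps.append(("TRUNC", last_common))
    if idx < len(committed_log):
        steps.append(("DIFF", committed_log[idx:]))
    return steps


if __name__ == "__main__":
    print(sync_with_membership_check(74, [71, 72, 73, 81]))
```

For the scenario above this yields a truncate to 73 followed by a diff containing 81, which is the sequence I would have expected A to receive.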

Looking forward to hearing from you guys regarding whether I am missing something in the code or if it is a bug that we need to fix.

Thanks!
Vishal