Posted to dev@zookeeper.apache.org by "Alexander Shraer (JIRA)" <ji...@apache.org> on 2011/07/22 06:13:58 UTC

[jira] [Updated] (ZOOKEEPER-1131) Transactions can be dropped because leader election uses last committed zxid instead of last acknowledged/received zxid

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Shraer updated ZOOKEEPER-1131:
----------------------------------------

    Description: 
Suppose we have 3 servers - A, B, C - which have seen the same number of commits.
- A is the leader and it sends out a new proposal.
- B doesn't receive the proposal, but A and C receive and ACK it.
- A commits the proposal, but fails before anyone else sees the commit.
- B and C start leader election.
- Since both B and C saw the same number of commits, they propose the same zxid; if B has a higher server-id than C, leader election will elect B. Then, the last transaction will be truncated from C's log, which is a bug since it was acked by a majority.
  
This happens because servers propose their last committed zxid in leader election, and not their last received / acked zxid (which is not being tracked, AFAIK). See the method FastLeaderElection.getInitLastLoggedZxid(), which calls QuorumPeer.getLastLoggedZxid(). That method is supposed to return the last logged zxid, but instead it calls zkDb.getDataTreeLastProcessedZxid(), which returns the last committed zxid.
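
To make the scenario above concrete, here is a minimal Java sketch of the vote comparison. It is a simplified model, not ZooKeeper's actual code: the classes Vote and ElectionSketch, the methods beats() and elect(), the server ids and the zxid values are all hypothetical, and the ordering rule (higher zxid wins, ties broken by higher server-id) is assumed from the description, ignoring epochs and the rest of the real protocol.

```java
import java.util.List;

// Hypothetical, simplified model of an election vote (not a ZooKeeper class).
final class Vote {
    final long serverId;
    final long zxid;

    Vote(long serverId, long zxid) {
        this.serverId = serverId;
        this.zxid = zxid;
    }
}

final class ElectionSketch {
    // Assumed ordering: a higher zxid wins; on a tie, the higher server-id wins.
    static boolean beats(Vote candidate, Vote current) {
        return candidate.zxid > current.zxid
                || (candidate.zxid == current.zxid && candidate.serverId > current.serverId);
    }

    // Returns the server-id of the vote that beats all others.
    static long elect(List<Vote> votes) {
        Vote best = votes.get(0);
        for (Vote v : votes) {
            if (beats(v, best)) {
                best = v;
            }
        }
        return best.serverId;
    }

    public static void main(String[] args) {
        long serverB = 3L;   // assumption: B has the higher server-id
        long serverC = 2L;
        long committed = 5L; // last zxid committed on both B and C
        long proposed = 6L;  // logged and acked by A and C, never received by B

        // Both servers propose their last committed zxid: a tie, so B wins on server-id.
        long byCommitted = elect(List.of(new Vote(serverB, committed), new Vote(serverC, committed)));

        // Each server proposes its last logged zxid: C is ahead and wins.
        long byLogged = elect(List.of(new Vote(serverB, committed), new Vote(serverC, proposed)));

        System.out.println("proposing last committed zxid, leader = " + byCommitted);
        System.out.println("proposing last logged zxid, leader = " + byLogged);
    }
}
```

In this model, electing B corresponds to the case where C would have to truncate the majority-acked proposal, while proposing the last logged zxid lets C, which still holds that proposal, win the election.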

  was:
Suppose we have 3 servers - A, B, C. 
- A is the leader and it sends 2 proposals. 
- Everyone acks the first proposal; only A and C receive and ack the second proposal.
- A commits the first proposal, and A and B see the commit. 
- A commits the second proposal, but fails before anyone else sees the commit.
- B and C start leader election. 
- Since servers propose their last committed zxid in leader election, and not their last received / acked zxid (this is not being tracked, AFAIK), B will be elected as leader since it has seen more commits than C.
- The last transaction is lost forever even though it has been acked by a majority.

Notice that if C reboots before starting the leader election, this problem doesn't happen, since it "locally commits" the second proposal, updating its last committed zxid.


> Transactions can be dropped because leader election uses last committed zxid instead of last acknowledged/received zxid
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1131
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1131
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection, server
>    Affects Versions: 3.4.0
>            Reporter: Alexander Shraer

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira