Posted to dev@zookeeper.apache.org by "Yang Yang (JIRA)" <ji...@apache.org> on 2011/07/21 22:24:57 UTC

[jira] [Commented] (ZOOKEEPER-1131) Transactions can be dropped because leader election uses last committed zxid instead of last acknowledged/received zxid

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069185#comment-13069185 ] 

Yang Yang commented on ZOOKEEPER-1131:
--------------------------------------

For my education, Alexander: it seems that zk.loadData() is always called by Leader.lead() <=== LearnerHandler.run(),
so is the missing link in FileTxnSnapLog.restore()?


        TxnIterator itr = txnLog.read(dt.lastProcessedZxid+1);

Here dt.lastProcessedZxid points to the last SEEN proposal, right?

> Transactions can be dropped because leader election uses last committed zxid instead of last acknowledged/received zxid
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1131
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1131
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection, server
>    Affects Versions: 3.4.0
>            Reporter: Alexander Shraer
>
> Suppose we have 3 servers - A, B, C. 
> - A is the leader and it sends 2 proposals. 
> - Everyone acks the first proposal, but only A and C receive and ack the second proposal.
> - A commits the first proposal, and A and B see the commit. 
> - A commits the second proposal, but fails before anyone else sees the commit.
> - B and C start leader election. 
> - Since servers propose their last committed zxid in leader election, and not their last received / acked zxid (this is not being tracked, AFAIK), B will be elected as leader since it has seen more commits than C.
> - The last transaction is lost forever even though it has been acked by a majority.
> Notice that if C reboots before starting the leader election, this problem doesn't happen since it "locally commits" the second proposal updating its last committed zxid.
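
The scenario quoted above can be replayed as a small simulation. This is only a sketch under stated assumptions, not ZooKeeper's actual election code: the Server class, its field names, and the two elect* helpers are hypothetical, and epochs and server-id tie-breaking are left out. It compares picking the leader by last committed zxid with picking it by last logged (received/acked) zxid, and models the reboot case as the issue describes it.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ElectionSketch {
    /** Hypothetical per-server state; the names are illustrative, not ZooKeeper's. */
    static final class Server {
        final String name;
        long lastLoggedZxid;     // highest proposal this server received and acked
        long lastCommittedZxid;  // highest commit this server has seen

        Server(String name, long lastLoggedZxid, long lastCommittedZxid) {
            this.name = name;
            this.lastLoggedZxid = lastLoggedZxid;
            this.lastCommittedZxid = lastCommittedZxid;
        }

        /** Models the reboot case from the issue: every logged proposal gets "locally committed". */
        void reboot() {
            lastCommittedZxid = lastLoggedZxid;
        }
    }

    /** Behaviour described in the issue: the highest last committed zxid wins. */
    static String electByCommitted(List<Server> voters) {
        return voters.stream()
                .max(Comparator.comparingLong((Server s) -> s.lastCommittedZxid))
                .get().name;
    }

    /** Alternative implied by the issue title: the highest last logged zxid wins. */
    static String electByLogged(List<Server> voters) {
        return voters.stream()
                .max(Comparator.comparingLong((Server s) -> s.lastLoggedZxid))
                .get().name;
    }

    public static void main(String[] args) {
        // A has failed. B acked proposal 1 and saw its commit.
        // C acked proposals 1 and 2 but saw no commit.
        Server b = new Server("B", 1, 1);
        Server c = new Server("C", 2, 0);
        List<Server> voters = Arrays.asList(b, c);

        System.out.println("by committed zxid: " + electByCommitted(voters));
        System.out.println("by logged zxid:    " + electByLogged(voters));

        c.reboot(); // C restarts before the election
        System.out.println("by committed zxid after C reboots: " + electByCommitted(voters));
    }
}
```

With these values, electing by committed zxid picks B, whose log does not contain the second proposal. Electing by logged zxid, or rebooting C first, picks C, which does have it.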

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira