Posted to dev@zookeeper.apache.org by "Camille Fournier (JIRA)" <ji...@apache.org> on 2011/07/06 15:33:16 UTC
[jira] [Resolved] (ZOOKEEPER-1118) Inconsistent data after server crashes several times

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Camille Fournier resolved ZOOKEEPER-1118.
-----------------------------------------

    Resolution: Duplicate

> Inconsistent data after server crashes several times
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-1118
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1118
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.3.2
>         Environment: Redhat RHEL5
>            Reporter: Kurt Young
>            Priority: Critical
>
> I think there is a bug in how a Follower syncs its data with the Leader.
> Assume some operations were committed while one server was down. When that server restarts, it receives a NEWLEADER packet that includes the leader's last zxid, and the server sets its own lastProcessZxid to the leader's value.
> {code:title=Follower.java|borderStyle=solid}
> void followLeader() throws InterruptedException {
>     fzk.registerJMX(new FollowerBean(this, zk), self.jmxLocalPeerBean);
>     try {
>         InetSocketAddress addr = findLeader();
>         try {
>             connectToLeader(addr);
>             long newLeaderZxid = registerWithLeader(Leader.FOLLOWERINFO);  // get the last zxid from leader
>             //check to see if the leader zxid is lower than ours
>             //this should never happen but is just a safety check
>             long lastLoggedZxid = self.getLastLoggedZxid();
>             if ((newLeaderZxid >> 32L) < (lastLoggedZxid >> 32L)) {
>                 LOG.fatal("Leader epoch " + Long.toHexString(newLeaderZxid >> 32L)
>                         + " is less than our epoch " + Long.toHexString(lastLoggedZxid >> 32L));
>                 throw new IOException("Error: Epoch of leader is lower");
>             }
>             syncWithLeader(newLeaderZxid);   // set its own lastProcessZxid to leader's last zxid
> {code}
> The server then receives some COMMIT packets so that it can sync its data with the leader. After that, the leader sends an UPTODATE packet, which tells the server to take a snapshot.
> {code:title=Follower.java|borderStyle=solid}
> protected void processPacket(QuorumPacket qp) throws IOException{
>     switch (qp.getType()) {
>     case Leader.PING:
>         ping(qp);
>         break;
>     case Leader.PROPOSAL:
>         TxnHeader hdr = new TxnHeader();
>         BinaryInputArchive ia = BinaryInputArchive
>         .getArchive(new ByteArrayInputStream(qp.getData()));
>         Record txn = SerializeUtils.deserializeTxn(ia, hdr);
>         if (hdr.getZxid() != lastQueued + 1) {
>             LOG.warn("Got zxid 0x"
>                     + Long.toHexString(hdr.getZxid())
>                     + " expected 0x"
>                     + Long.toHexString(lastQueued + 1));
>         }
>         lastQueued = hdr.getZxid();
>         fzk.logRequest(hdr, txn);
>         break;
>     case Leader.COMMIT:
>         fzk.commit(qp.getZxid());
>         break;
>     case Leader.UPTODATE:
>         fzk.takeSnapshot();
>         self.cnxnFactory.setZooKeeperServer(fzk);
>         break;
>     case Leader.REVALIDATE:
>         revalidate(qp);
>         break;
>     case Leader.SYNC:
>         fzk.sync();
>         break;
>     }
> }
> {code}
> Notice that the Follower treats COMMIT and UPTODATE packets differently. When it receives a COMMIT packet, the follower hands it to a processor to deal with. But when it receives an UPTODATE packet, the follower takes a snapshot immediately. So it is possible for the server to take the snapshot before it has committed all the operations it missed. If the server then crashes again and recovers, it restores its data from that snapshot, so its data is now inconsistent with the leader's even though its last zxid is the same.
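
The ordering problem described above can be reproduced in a small, deterministic toy model. This is only a sketch and not ZooKeeper code: ToyFollower, onCommit, onUpToDate, drainPendingCommits and the drainFirst flag are all hypothetical names invented for illustration, and the zxids 3, 4, 5 are made-up values. It assumes that COMMIT packets are merely queued for a processor, while UPTODATE snapshots whatever has been applied so far. Draining the queue before the snapshot is shown as one possible ordering that avoids the gap, not as the fix adopted by the project.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model only: none of these names are ZooKeeper classes or methods.
class SnapshotRaceSketch {

    static final class ToyFollower {
        // zxids already applied to the in-memory data
        final List<Long> applied = new ArrayList<>();
        // commits handed to the (hypothetical) processor but not applied yet
        final Queue<Long> pendingCommits = new ArrayDeque<>();
        // set from the leader's last zxid when syncing starts
        final long lastProcessedZxid;
        // what the snapshot captured, and the zxid it was labeled with
        List<Long> snapshotData = new ArrayList<>();
        long snapshotZxid = -1L;

        ToyFollower(long leaderLastZxid) {
            this.lastProcessedZxid = leaderLastZxid;
        }

        // COMMIT: only queues the zxid, nothing is applied yet
        void onCommit(long zxid) {
            pendingCommits.add(zxid);
        }

        // Applies every queued commit in arrival order
        void drainPendingCommits() {
            while (!pendingCommits.isEmpty()) {
                applied.add(pendingCommits.poll());
            }
        }

        // UPTODATE: snapshots immediately unless drainFirst is set
        void onUpToDate(boolean drainFirst) {
            if (drainFirst) {
                drainPendingCommits();
            }
            snapshotData = new ArrayList<>(applied);
            snapshotZxid = lastProcessedZxid;
        }
    }

    // Replays the scenario: leader is at zxid 5, follower missed 3, 4 and 5
    static ToyFollower run(boolean drainFirst) {
        ToyFollower follower = new ToyFollower(5L);
        follower.onCommit(3L);
        follower.onCommit(4L);
        follower.onCommit(5L);
        follower.onUpToDate(drainFirst);
        return follower;
    }

    public static void main(String[] args) {
        for (boolean drainFirst : new boolean[] {false, true}) {
            ToyFollower f = run(drainFirst);
            System.out.println("drainFirst=" + drainFirst
                    + " snapshotZxid=" + f.snapshotZxid
                    + " snapshotData=" + f.snapshotData);
        }
    }
}
```

With drainFirst set to false, the snapshot is labeled with the leader's zxid but contains none of the missed operations, which is the inconsistency the report describes. With drainFirst set to true, the snapshot contains all three.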

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira