You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@zookeeper.apache.org by "Martin Güthle (Jira)" <ji...@apache.org> on 2020/05/29 10:11:00 UTC

[jira] [Commented] (ZOOKEEPER-3526) When a zk Node just become the leader,synchronizing with Follower in the first time, the value of maxCommittedLog may be smaller than the value of minCommittedLog,is that a problem?

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119467#comment-17119467 ] 

Martin Güthle commented on ZOOKEEPER-3526:
------------------------------------------

Hi,

after start using the Read-Only-Mode it seems like we have run into a similar issue. Activating the trace log level, I think I was able to dig to the cause of the problem.

Steps to reproduce it:
 # Start an ensemble with 3 zk instances and active read-only mode
 # Wait for leader election to complete (assuming zkNode 3 is the leader in the following)
 # Insert some data (optional, as the problem can be observerd without any data, too)
 # Stop zkNode 1 and zkNode 2
 # Wait until zkNode 3 detects the lost and provide the Read-Only-Mode
 # Connect via zkCli -r (any other client, which can be killed in an easy manner or is known to run into the session timeout)
 # Send zkCli in background and kill it (to ensure a session timeout occures)
 # Wait for session timeout
 # Start zkNode 2 -> Will Receive a diff due to equal zxid
 # New Epoche starts -> new zxid
 # Start zkNode 1 -> Will receiv a trunc as there is a difference in the zxid
 # Data loss is now observed

 

The related, relevant trace logs are (using ZooKeeper 3.4.13 line numbers, but code does not differ in the relevant parts):

step 8:

 
{code:java}
2020-05-25 16:20:39,168 [myid:3] - DEBUG [ProcessThread(sid:3 cport:-1)::FinalRequestProcessor@89] - Processing request:: sessionid:0x2004792d7c60000 type:closeSession cxid:0x0 zxid:0xc txntype:-11 reqpath:n/a{code}
 

step 9+10:

 
{code:java}
2020-05-25 16:22:01,564 [myid:3] - INFO  [QuorumPeer[myid=3]/0.0.0.0:2181:Leader@380] - LEADING - LEADER ELECTION TOOK - 126318
2020-05-25 16:22:02,482 [myid:3] - INFO  [LearnerHandler-/10.1.1.31:34690:LearnerHandler@346] - Follower sid: 2 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@6d10cc4e
2020-05-25 16:22:02,499 [myid:3] - INFO  [LearnerHandler-/10.1.1.31:34690:LearnerHandler@401] - Synchronizing with Follower sid: 2 maxCommittedLog=0xc minCommittedLog=0x100000530 peerLastZxid=0x100000718
2020-05-25 16:22:02,499 [myid:3] - INFO  [LearnerHandler-/10.1.1.31:34690:LearnerHandler@410] - leader and follower are in sync, zxid=0x100000718
2020-05-25 16:22:02,499 [myid:3] - INFO  [LearnerHandler-/10.1.1.31:34690:LearnerHandler@475] - Sending DIFF
2020-05-25 16:22:02,504 [myid:3] - INFO  [LearnerHandler-/10.1.1.31:34690:LearnerHandler@535] - Received NEWLEADER-ACK message from 2
2020-05-25 16:22:02,505 [myid:3] - INFO  [QuorumPeer[myid=3]/0.0.0.0:2181:Leader@964] - Have quorum of supporters, sids: [ 2,3 ]; starting up and setting last processed zxid: 0x200000000
{code}
Step 11 (log from zkNode3)
{code:java}
2020-05-25 16:23:02,551 [myid:3] - INFO  [LearnerHandler-/10.1.1.30:54868:LearnerHandler@346] - Follower sid: 1 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3d0240a9
2020-05-25 16:23:02,556 [myid:3] - INFO  [LearnerHandler-/10.1.1.30:54868:LearnerHandler@401] - Synchronizing with Follower sid: 1 maxCommittedLog=0x20000000c minCommittedLog=0x10000053c peerLastZxid=0x100000718
2020-05-25 16:23:02,556 [myid:3] - DEBUG [LearnerHandler-/10.1.1.30:54868:LearnerHandler@415] - proposal size is 501
2020-05-25 16:23:02,556 [myid:3] - DEBUG [LearnerHandler-/10.1.1.30:54868:LearnerHandler@418] - Sending proposals to follower
2020-05-25 16:23:02,557 [myid:3] - INFO  [LearnerHandler-/10.1.1.30:54868:LearnerHandler@475] - Sending TRUNC
2020-05-25 16:23:02,570 [myid:3] - INFO  [LearnerHandler-/10.1.1.30:54868:LearnerHandler@535] - Received NEWLEADER-ACK message from 1
{code}
Step 11 (log from zkNode1)
{code:java}
2020-05-25 16:23:02,541 [myid:1] - DEBUG [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection@607] - id: 3, proposed id: 1, zxid: 0x100000718, proposed zxid: 0x100000718
2020-05-25 16:23:02,541 [myid:1] - DEBUG [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection@714] - Updating proposal: 3 (newleader), 0x100000718 (newzxid), 1 (oldleader), 0x100000718 (oldzxid)
// Further leader msgs, which are not relevant
2020-05-25 16:23:02,544 [myid:1] - INFO  [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumPeer@979] - FOLLOWING
2020-05-25 16:23:02,557 [myid:1] - WARN  [QuorumPeer[myid=1]/0.0.0.0:2181:Learner@349] - Truncating log to get in sync with the leader 0xc
{code}
 

So what actual happens is:
 * Regular ZooKeeper server start, load it's database and set the zxid accordingly
 * Expiring session with zkNode3 runs into the timeout and a request is generated accordingly
 * The current zxid from the ReadOnly ZooKeeper server is requested and stored within the database:
 ** Each instance of ZooKeeperServer (ReadOnlyZooKeeper inherits from the ZooKeeperServer) has its own zxid storage
 ** zkDatabase zxid is shared between all ZooKeeperServers

So the fix is (3.4.14 code base):
{code:java}
--- zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java.org 2020-05-29 12:06:14.001336800 +0200
+++ zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java     2020-05-29 12:06:33.108783000 +0200
@@ -109,7 +109,7 @@
     protected SessionTracker sessionTracker;
     private FileTxnSnapLog txnLogFactory = null;
     private ZKDatabase zkDb;
-    private final AtomicLong hzxid = new AtomicLong(0);
+    private static final AtomicLong hzxid = new AtomicLong(0);
     public final static Exception ok = new Exception("No prob");
     protected RequestProcessor firstProcessor;
     protected volatile State state = State.INITIAL;
{code}
Checking the code of 3.6 the bug is still present and the fix accordingly:
{code:java}
--- zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java.org 2020-05-29 12:07:22.523718700 +0200
+++ zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java     2020-05-29 12:07:31.626953200 +0200
@@ -166,7 +166,7 @@
     private ZKDatabase zkDb;
     private ResponseCache readResponseCache;
     private ResponseCache getChildrenResponseCache;
-    private final AtomicLong hzxid = new AtomicLong(0);
+    private static final AtomicLong hzxid = new AtomicLong(0);
     public static final Exception ok = new Exception("No prob");
     protected RequestProcessor firstProcessor;
     protected JvmPauseMonitor jvmPauseMonitor;
{code}
Using these fixes, the tests are broken, so needs some further work, to be complete. But haven't had time yet to do this.

 

Hope this is the only cause of the problem. At least I haven't seen any further parts yet. Maybe someone can help to get the fix into the code

 

> When a zk Node just become the leader,synchronizing with Follower in the first time, the value of maxCommittedLog may be smaller than the value of  minCommittedLog,is that a problem?
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3526
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3526
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.14
>         Environment: 1.version:3.4.14
> 2.num of zk nodes: 3
>            Reporter: wanglei
>            Priority: Major
>
> 1.Node1:1566815238 (myid)、Node2:1566815239 (myid)、Node3:1566815240 (myid)
> 2.After a election, Node3 become the new leader, begin to sync with followers
>  
> *2019-08-27 04:26:09,521 [myid:1566815240] - INFO [NIOServerCxn.Factory:/172.28.8.123:9880:ZooKeeperServer@910][] - Refusing session request for not-read-only client /172.28.0.3:38994*
>  *2019-08-27 04:26:09,609 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:ZooKeeperServer@502][] - shutting down*
>  *2019-08-27 04:26:09,609 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:SessionTrackerImpl@226][] - Shutting down*
>  *2019-08-27 04:26:09,609 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:PrepRequestProcessor@769][] - Shutting down*
>  *2019-08-27 04:26:09,609 [myid:1566815240] - INFO [ReadOnlyRequestProcessor:1566815240:ReadOnlyRequestProcessor@111][] - ReadOnlyRequestProcessor exited loop!*
>  *2019-08-27 04:26:09,610 [myid:1566815240] - INFO [ProcessThread(sid:1566815240 cport:-1)::PrepRequestProcessor@144][] - PrepRequestProcessor exited loop!*
>  *2019-08-27 04:26:09,610 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:FinalRequestProcessor@430][] - shutdown of request processor complete*
>  *2019-08-27 04:26:09,613 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:QuorumPeer@992][] - LEADING*
>  *2019-08-27 04:26:09,615 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:Leader@64][] - TCP NoDelay set to: true*
>  *2019-08-27 04:26:09,616 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:ZooKeeperServer@174][] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 300000 datadir /opt/fusionplatform/data/zookeeper/data/version-2 snapdir /opt/fusionplatform/data/zookeeper/data/version-2*
>  *2019-08-27 04:26:09,616 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:Leader@380][] - {color:#FF0000}LEADING - LEADER ELECTION TOOK - 15297{color}*
>  *2019-08-27 04:26:09,956 [myid:1566815240] - INFO [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxnFactory@222][] - Accepted socket connection from /172.28.0.3:39012*
>  *2019-08-27 04:26:09,956 [myid:1566815240] - WARN [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxn@383][] - Exception causing close of session 0x0: ZooKeeperServer not running*
>  *2019-08-27 04:26:09,974 [myid:1566815240] - INFO [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxnFactory@222][] - Accepted socket connection from /172.28.0.2:50732*
>  *2019-08-27 04:26:09,974 [myid:1566815240] - WARN [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxn@383][] - Exception causing close of session 0x0: ZooKeeperServer not running*
>  *2019-08-27 04:26:10,513 [myid:1566815240] - INFO [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxnFactory@222][] - Accepted socket connection from /172.28.0.5:60010*
>  *2019-08-27 04:26:10,514 [myid:1566815240] - WARN [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxn@383][] - Exception causing close of session 0x0: ZooKeeperServer not running*
>  *2019-08-27 04:26:10,516 [myid:1566815240] - INFO [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxnFactory@222][] - Accepted socket connection from /172.28.0.5:60020*
>  *2019-08-27 04:26:10,517 [myid:1566815240] - WARN [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxn@383][] - Exception causing close of session 0x0: ZooKeeperServer not running*
>  *2019-08-27 04:26:10,530 [myid:1566815240] - INFO [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxnFactory@222][] - Accepted socket connection from /172.28.0.5:60024*
>  *2019-08-27 04:26:10,531 [myid:1566815240] - WARN [NIOServerCxn.Factory:/172.28.8.123:9880:NIOServerCnxn@383][] - Exception causing close of session 0x0: ZooKeeperServer not running*
>  *2019-08-27 04:26:10,619 [myid:1566815240] - INFO [LearnerHandler-/172.28.0.2:59666:LearnerHandler@346][] - Follower sid: 1566815238 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@6f38a687*
>  *2019-08-27 04:26:10,747 [myid:1566815240] - INFO [LearnerHandler-/172.28.0.2:59666:LearnerHandler@401][] -{color:#FF0000} Synchronizing with Follower sid: 1566815238 maxCommittedLog=0x3 minCommittedLog=0x9000002d9 peerLastZxid=0x9000004ca{color}*
>  *2019-08-27 04:26:10,747 [myid:1566815240] - INFO [LearnerHandler-/172.28.0.2:59666:LearnerHandler@410][] - leader and follower are in sync, zxid=0x9000004ca*
>  *2019-08-27 04:26:10,748 [myid:1566815240] - INFO [LearnerHandler-/172.28.0.2:59666:LearnerHandler@475][] - Sending DIFF*
>  *2019-08-27 04:26:10,811 [myid:1566815240] - INFO [SessionTracker:SessionTrackerImpl@163][] - SessionTrackerImpl exited loop!*
>  *2019-08-27 04:26:10,833 [myid:1566815240] - INFO [LearnerHandler-/172.28.0.2:59666:LearnerHandler@535][] - Received NEWLEADER-ACK message from 1566815238*
>  *2019-08-27 04:26:10,833 [myid:1566815240] - INFO [QuorumPeer[myid=1566815240]/172.28.8.123:9880:Leader@964][] - Have quorum of supporters, sids: [ 1566815238,1566815240 ]; starting up and setting last processed zxid: 0xa00000000*
>  *2019-08-27 04:26:11,160 [myid:1566815240] - INFO [SyncThread:1566815240:FileTxnLog@216][] - Creating new log file: log.a00000001*
> {color:#FF0000}maxCommittedLog=0x3 minCommittedLog=0x9000002d9 peerLastZxid=0x9000004ca{color}
> {color:#ff0000}*why maxCommittedLog < minCommittedLog?*{color}
> 2.Node 2(follower) get a Trunc message form leader.The leader zxid of the Trunc message is 0x3. So Node3 truncat the  transaction log(the zxid which is bigger than 0x3 will be deleted). At last, the data in Node2 is inconsistent.
>  
> *2019-08-27 04:26:14,225 [myid:1566815239] - INFO [WorkerReceiver[myid=1566815239]:FastLeaderElection@595][] - Notification: 1 (message format version), 1566815240 (n.leader), 0x9000004ca (n.zxid), 0x1 (n.round), FOLLOWING (n.state), 1566815238 (n.sid), 0xa (n.peerEpoch) LOOKING (my state)*
>  *2019-08-27 04:26:14,226 [myid:1566815239] - INFO [WorkerReceiver[myid=1566815239]:FastLeaderElection@595][] - Notification: 1 (message format version), 1566815240 (n.leader), 0x9000004ca (n.zxid), 0x1 (n.round), FOLLOWING (n.state), 1566815238 (n.sid), 0xa (n.peerEpoch) FOLLOWING (my state)*
>  *2019-08-27 04:26:14,226 [myid:1566815239] - INFO [QuorumPeer[myid=1566815239]/172.28.8.122:9880:QuorumPeer@980][] - FOLLOWING*
>  *2019-08-27 04:26:14,226 [myid:1566815239] - INFO [Thread-1:QuorumPeer$1@936][] - Interrupted while attempting to start ReadOnlyZooKeeperServer, not started*
>  *2019-08-27 04:26:14,229 [myid:1566815239] - INFO [QuorumPeer[myid=1566815239]/172.28.8.122:9880:Learner@86][] - TCP NoDelay set to: true*
>  *2019-08-27 04:26:14,229 [myid:1566815239] - INFO [QuorumPeer[myid=1566815239]/172.28.8.122:9880:ZooKeeperServer@174][] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 300000 datadir /opt/fusionplatform/data/zookeeper/data/version-2 snapdir /opt/fusionplatform/data/zookeeper/data/version-2*
>  *2019-08-27 04:26:14,230 [myid:1566815239] - INFO [QuorumPeer[myid=1566815239]/172.28.8.122:9880:Follower@65][] - {color:#FF0000}FOLLOWING - LEADER ELECTION TOOK - 36{color}*
>  *2019-08-27 04:26:14,232 [myid:1566815239] - INFO [QuorumPeer[myid=1566815239]/172.28.8.122:9880:QuorumPeer$QuorumServer@185][] - Resolved hostname: 172.28.8.123 to address: /172.28.8.123*
>  *2019-08-27 04:26:14,346 [myid:1566815239] - WARN [QuorumPeer[myid=1566815239]/172.28.8.122:9880:Learner@349][] - {color:#FF0000}Truncating log to get in sync with the leader 0x3{color}*
>  *2019-08-27 04:26:14,371 [myid:1566815239] - INFO [QuorumPeer[myid=1566815239]/172.28.8.122:9880:DataTree@715][] - type: create, sessionid:0x10000080a040001 cxid:0x4 zxid:0x3 reqpath:/cps*
>  *2019-08-27 04:26:14,374 [myid:1566815239] - WARN [QuorumPeer[myid=1566815239]/172.28.8.122:9880:Learner@387][] - Got zxid 0xa00000001 expected 0x1*
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)