You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "Haoze Wu (Jira)" <ji...@apache.org> on 2021/02/06 18:56:00 UTC

[jira] [Created] (ZOOKEEPER-4203) Leader swallows the ZooKeeperServer.State.ERROR from Leader.LearnerCnxAcceptor in some concurrency condition.

Haoze Wu created ZOOKEEPER-4203:
-----------------------------------

             Summary: Leader swallows the ZooKeeperServer.State.ERROR from Leader.LearnerCnxAcceptor in some concurrency condition.
                 Key: ZOOKEEPER-4203
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4203
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.6.2
            Reporter: Haoze Wu


We found a bug similar to https://issues.apache.org/jira/browse/ZOOKEEPER-2029. Leader.LearnerCnxAcceptor is a ZooKeeperCriticalThread and its exception is handled by ZooKeeperCriticalThread#handleException, which is supposed to shutdown the process or rejoin the quorum. However, in some concurrency condition, this correct handling does not occur, and thus this leader without LearnerCnxAcceptor will not accept any follower, which is the same symptom as https://issues.apache.org/jira/browse/ZOOKEEPER-2029. Ater reproduction, we confirmed that this bug exists in both the 3.6.2 release version and the master branch.

 

This concurrency condition can be constructed as follows: start a ZooKeeper cluster of 3 nodes, and the ServerSocket#accept invocation throws an exception in the LearnerCnxAcceptorHandler for the second follower that tries to join the quorum. If these 3 nodes get started almost simultaneously, e.g., within 2 seconds, then The aforementioned symptom may occur. In the log, we can observe that one of the followers keeps trying to join the quorum and keeps failing, and the other 2 servers always get the request from this follower but never let it join the quorum.

 

We prepared the reproduction scripts in a gist link, and the throwing exception of ServerSocket#accept is done by the Byteman injection script `serverSocketAccept-exception.btm`.

 

The root cause of this bug is that when the Leader.LearnerCnxAcceptor thread handles an exception in ZooKeeperCriticalThread#handleException, the exception is not really "handled", meaning that it basically does nothing but set the ZooKeeperServerState as ERROR in the following stack trace: ZooKeeperCriticalThread#handleException -> ZooKeeperServerListenerImpl#notifyStopping -> LeaderZooKeeperServer#setState.

 

LeaderZooKeeperServer#setState is implemented in QuorumZooKeeperServer#setState, basically setting the state as ZooKeeperServer.State.ERROR in this scenario.

 

Under normal Circumstances, this ERROR state will be detected by another thread in the following stack trace: QuorumPeer#run -> Leader#lead -> Leader#isRunning -> ZooKeeperServer#isRunning. In the code, it is within the infinite while loop at the end of Leader#lead ([https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L758]).

 

If the ERROR state from Leader.LearnerCnxAcceptor or any other ZooKeeperCriticalThread could be detected in this way, then the leader would be aware of the ERROR state and turn to [https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L779], which is correct.

 

However, in some concurrency condition, the ERROR state can't be detected in this way, because in Leader#lead, [https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L685] can cover this ERROR state with a RUNNING state in the following stack trace: Leader#lead -> Leader#startZkServer -> LeaderZooKeeperServer#startup -> ZooKeeperServer#startup -> ZooKeeperServer#setState. Therefore, if the ERROR state occurs before the invocation of Leader#startZkServer in Leader#lead, then this ERROR state will be covered, because ZooKeeperServer#setState does not record or handle the old state.

 

The reproduction script we provided can construct this concurrency condition, because, usually, after the server startup, it takes some time for the QuorumPeer thread to reach [https://github.com/apache/zookeeper/blob/release-3.6.2/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java#L685]. Thus the exception LearnerCnxAcceptorHandler, if any, usually occurs earlier than that. Then the aforementioned symptom happens.

 

In terms of the fix, we will send a pull request. Basically we just add a sanity check to prevent the RUNNING/INITIAL state set in the ZK startup from covering the possible ERROR state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)