Posted to issues@zookeeper.apache.org by "Kezhu Wang (Jira)" <ji...@apache.org> on 2022/04/02 07:34:00 UTC

[jira] [Commented] (ZOOKEEPER-3023) Flaky test: org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17516241#comment-17516241 ] 

Kezhu Wang commented on ZOOKEEPER-3023:
---------------------------------------

The line numbers in the reported stack trace are out of date. I think it is the [following assertion|https://github.com/apache/zookeeper/blob/2173c92a2b054e2fa55fd69d0d9ea892b7cc7e66/zookeeper-server/src/test/java/org/apache/zookeeper/server/quorum/Zab1_0Test.java#L790] that failed.
{code:java}
assertEquals(createSessionZxid, f.fzk.getLastProcessedZxid());
{code}

Besides the above assertion, this test also [failed|https://github.com/apache/zookeeper/runs/5525199467?check_suite_focus=true] with the following message and assertion.
{code:none}
[ERROR] testNormalFollowerRunWithDiff  Time elapsed: 0.045 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <4294967296> but was: <4294967298>
	at org.apache.zookeeper.server.quorum.Zab1_0Test$5.converseWithFollower(Zab1_0Test.java:778)
	at org.apache.zookeeper.server.quorum.Zab1_0Test.testFollowerConversation(Zab1_0Test.java:445)
	at org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff(Zab1_0Test.java:706)
{code}
{code:java}
                    // Read the uptodate ack
                    readPacketSkippingPing(ia, qp);
                    assertEquals(Leader.ACK, qp.getType());
                    assertEquals(ZxidUtils.makeZxid(1, 0), qp.getZxid());

                    // Get the ack of the new leader
                    readPacketSkippingPing(ia, qp);
                    assertEquals(Leader.ACK, qp.getType());
                    assertEquals(ZxidUtils.makeZxid(1, 0), qp.getZxid());
                    assertEquals(1, f.self.getAcceptedEpoch());
                    assertEquals(1, f.self.getCurrentEpoch());
{code}
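
To read the numbers in the failure message: a zxid carries the epoch in its high 32 bits and a counter in its low 32 bits, so the expected value 4294967296 is {{makeZxid(1, 0)}} and the actual value 4294967298 is two txns further along in the same epoch. The following is a minimal sketch of that arithmetic. {{ZxidDemo}} and its method names are hypothetical stand-ins written for this comment, not the real {{ZxidUtils}} class.

```java
// Standalone illustration of the zxid bit layout (epoch in the high 32 bits,
// counter in the low 32 bits). ZxidDemo is a hypothetical helper, not ZooKeeper code.
final class ZxidDemo {
    // Combine an epoch and a counter into a single 64-bit zxid.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32L) | (counter & 0xffffffffL);
    }

    // Extract the epoch (high 32 bits).
    static long epochOf(long zxid) {
        return zxid >> 32L;
    }

    // Extract the counter (low 32 bits).
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    static String describe(long zxid) {
        return "epoch=" + epochOf(zxid) + ", counter=" + counterOf(zxid);
    }

    public static void main(String[] args) {
        System.out.println(describe(4294967296L)); // expected value: epoch=1, counter=0
        System.out.println(describe(4294967298L)); // actual value:   epoch=1, counter=2
    }
}
```

Read this way, the ack that arrived carried a zxid with counter 2 where the test expected the zxid with counter 0.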

I think this test became flaky after ZOOKEEPER-2678, and ZOOKEEPER-3911 introduced a new flaky path. Before ZOOKEEPER-2678 and ZOOKEEPER-3911, the following conditions held:
1. Txn logs are synced to disk and committed to memory. ZOOKEEPER-2678 broke this, and ZOOKEEPER-3911 tried but failed to fix it because {{FollowerZooKeeperServer.logRequest}} logs txns asynchronously.
2. No ack is sent before the {{Leader.NEWLEADER}} ack. ZOOKEEPER-3911 broke this.

I think we can fix this flaky test by restoring the above conditions: synchronously sync txns to disk and commit them to memory, without going through the asynchronous request processor. This way we establish the invariant that after the ack for {{NEWLEADER}}, followers are in sync with the leader. This is consistent with the pre-ZOOKEEPER-2678 behavior and easy to test.
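
As a rough illustration of why the synchronous path makes the assertion on {{getLastProcessedZxid()}} deterministic, here is a toy model. Every name in it ({{FollowerSyncSketch}}, {{logAsync}}, {{logSync}}, {{drain}}, {{ackNewLeader}}) is hypothetical and does not correspond to ZooKeeper classes or methods. The asynchronous case is modelled deterministically as "not yet drained", whereas in the real server the outcome depends on thread timing.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical toy model of a follower; not ZooKeeper code.
final class FollowerSyncSketch {
    private long lastProcessedZxid = 0L;
    private final Queue<Long> pending = new ArrayDeque<>();

    // Asynchronous path: the txn is only queued; a later step applies it.
    void logAsync(long zxid) {
        pending.add(zxid);
    }

    // Stand-in for the background processor eventually draining the queue.
    void drain() {
        while (!pending.isEmpty()) {
            lastProcessedZxid = pending.remove();
        }
    }

    // Synchronous path: the txn is applied before the method returns.
    void logSync(long zxid) {
        lastProcessedZxid = zxid;
    }

    // Returns what a test would observe immediately after the NEWLEADER ack.
    long ackNewLeader() {
        return lastProcessedZxid;
    }

    // Log one txn on the chosen path, ack, and report the observed zxid.
    static long observedAfterAck(boolean synchronous, long zxid) {
        FollowerSyncSketch follower = new FollowerSyncSketch();
        if (synchronous) {
            follower.logSync(zxid);
        } else {
            follower.logAsync(zxid); // the queue has not been drained yet
        }
        return follower.ackNewLeader();
    }

    public static void main(String[] args) {
        long zxid = (1L << 32) | 2L; // epoch 1, counter 2
        System.out.println("sync:  " + observedAfterAck(true, zxid));
        System.out.println("async: " + observedAfterAck(false, zxid));
    }
}
```

In this model the synchronous path always reports the logged zxid at ack time, while the asynchronous path can still report the old value, which is the kind of mismatch the test trips over.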

> Flaky test: org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff
> ---------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3023
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3023
>             Project: ZooKeeper
>          Issue Type: Sub-task
>    Affects Versions: 3.6.0
>            Reporter: Pravin Dsilva
>            Assignee: Kezhu Wang
>            Priority: Major
>
> Getting the following error on master branch:
> Error Message
> {code:java}
> expected:<4294967298> but was:<0>{code}
> Stacktrace
> {code:java}
> junit.framework.AssertionFailedError: expected:<4294967298> but was:<0>
> 	at org.apache.zookeeper.server.quorum.Zab1_0Test$5.converseWithFollower(Zab1_0Test.java:876)
> 	at org.apache.zookeeper.server.quorum.Zab1_0Test.testFollowerConversation(Zab1_0Test.java:523)
> 	at org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff(Zab1_0Test.java:791)
> 	at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79){code}
> Flaky test:https://builds.apache.org/job/ZooKeeper-trunk-java10/141/testReport/junit/org.apache.zookeeper.server.quorum/Zab1_0Test/testNormalFollowerRunWithDiff/



--
This message was sent by Atlassian Jira
(v8.20.1#820001)