Posted to issues@flink.apache.org by "xmarker (Jira)" <ji...@apache.org> on 2021/10/15 15:34:00 UTC

[jira] [Commented] (FLINK-24538) ZooKeeperLeaderElectionTest.testLeaderShouldBeCorrectedWhenOverwritten fails with NPE

    [ https://issues.apache.org/jira/browse/FLINK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17429345#comment-17429345 ] 

xmarker commented on FLINK-24538:
---------------------------------

I investigated the code related to this issue, and I think it may occur in the following scenario:

1. The call to `retrievalEventHandler.waitForNewLeader(timeout)` at line 434 waits in TestingRetrievalBase.waitForNewLeader until correct leader information arrives.

2. But by the time `leader.getLeaderAddress()` is returned in TestingRetrievalBase, the leaderRetrievalDriver may already have been notified of empty leader information (perhaps because the ZooKeeper connection was suspended or lost, see ZooKeeperLeaderRetrievalDriver.handleStateChange). The returned address would then be null, which would explain the NPE.

3. So we could add a lock in TestingRetrievalBase around changes to its leader information.
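
To illustrate the idea in step 3, here is a minimal sketch. It is not the actual TestingRetrievalBase code: the class `LeaderHolderSketch`, its field, and its method bodies are hypothetical, and a null address stands in for empty leader information. The point is only that the wait and the read of the address happen under the same lock, so an empty notification cannot slip in between them.

```java
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the test helper; names are assumptions for illustration.
class LeaderHolderSketch {
    private final Object lock = new Object();
    // null models "empty leader information" (assumption of this sketch).
    private String leaderAddress;

    // Called from the retrieval driver thread, possibly with null when the
    // connection is suspended or lost.
    public void notifyLeader(String address) {
        synchronized (lock) {
            leaderAddress = address;
            lock.notifyAll();
        }
    }

    // Waits for a non-empty leader and returns the address while still holding
    // the lock, so the value that was checked is the value that is returned.
    public String waitForNewLeader(long timeoutMs)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        synchronized (lock) {
            while (leaderAddress == null) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    throw new TimeoutException(
                            "No leader within " + timeoutMs + " ms");
                }
                lock.wait(remainingMs);
            }
            return leaderAddress;
        }
    }

    public static void main(String[] args) throws Exception {
        LeaderHolderSketch holder = new LeaderHolderSketch();
        holder.notifyLeader("leader-1");
        System.out.println(holder.waitForNewLeader(1000));

        // An empty notification now leads to a timeout instead of a null result.
        holder.notifyLeader(null);
        try {
            holder.waitForNewLeader(50);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

With this shape, a caller either gets a non-null address or a timeout. Whether a timeout is acceptable for the test is a separate question.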

[~wangyang0918], do you have any advice?

> ZooKeeperLeaderElectionTest.testLeaderShouldBeCorrectedWhenOverwritten fails with NPE
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-24538
>                 URL: https://issues.apache.org/jira/browse/FLINK-24538
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.14.0
>            Reporter: Xintong Song
>            Priority: Major
>              Labels: test-stability
>             Fix For: 1.15.0, 1.14.1
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=25020&view=logs&j=f2b08047-82c3-520f-51ee-a30fd6254285&t=3810d23d-4df2-586c-103c-ec14ede6af00&l=7573
> {code}
> Oct 13 22:26:04 [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 12.355 s <<< FAILURE! - in org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest
> Oct 13 22:26:04 [ERROR] testLeaderShouldBeCorrectedWhenOverwritten  Time elapsed: 1.138 s  <<< ERROR!
> Oct 13 22:26:04 java.lang.NullPointerException
> Oct 13 22:26:04 	at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest.testLeaderShouldBeCorrectedWhenOverwritten(ZooKeeperLeaderElectionTest.java:434)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)