You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Chesnay Schepler (Jira)" <ji...@apache.org> on 2022/10/31 11:34:00 UTC

[jira] [Closed] (FLINK-29611) Fix flaky tests in CoBroadcastWithNonKeyedOperatorTest

     [ https://issues.apache.org/jira/browse/FLINK-29611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chesnay Schepler closed FLINK-29611.
------------------------------------
    Resolution: Won't Fix

> Fix flaky tests in CoBroadcastWithNonKeyedOperatorTest
> ------------------------------------------------------
>
>                 Key: FLINK-29611
>                 URL: https://issues.apache.org/jira/browse/FLINK-29611
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Sopan Phaltankar
>            Priority: Minor
>              Labels: pull-request-available
>
> The test _org.apache.flink.streaming.api.operators.co.CoBroadcastWithNonKeyedOperatorTest.testMultiStateSupport_ has the following failure:
> Failures:
> [ERROR]   CoBroadcastWithNonKeyedOperatorTest.testMultiStateSupport:74 
> Wrong Side Output: arrays first differed at element [0]; expected:<Record @ 15 : 9:key.6->6> but was:<Record @ 15 : 9:key.5->5>
> I used the tool [NonDex|https://github.com/TestingResearchIllinois/NonDex] to find this flaky test. 
> Command: mvn edu.illinois:nondex-maven-plugun:1.1.2:nondex -Dtest='Fully Qualified Test Name'
> I analyzed the assertion failure and found that the root cause is because the test method calls ctx.getBroadcastState(STATE_DESCRIPTOR).immutableEntries() which calls the entrySet() method of the underlying HashMap. entrySet() returns the entries in a non-deterministic way, causing the test to be flaky. 
> The fix would be to change _HashMap_ to _LinkedHashMap_ where the Map is getting initialized.
> On further analysis, it was found that the Map is getting initialized on line 53 of org.apache.flink.runtime.state.HeapBroadcastState class.
> After changing from HashMap to LinkedHashMap, the above test is passing.
> Edit: Upon making this change and running the CI, it was found that the tests org.apache.flink.api.datastream.DataStreamBatchExecutionITCase.batchKeyedBroadcastExecution and org.apache.flink.api.datastream.DataStreamBatchExecutionITCase.batchBroadcastExecution were failing. Upon further investigation, I found that these tests were also flaky and depended on the earlier made change.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)