Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/12/11 08:17:27 UTC

[GitHub] [kafka] showuon commented on pull request #9690: KAFKA-10017: fix flaky EOS-beta upgrade test

showuon commented on pull request #9690:
URL: https://github.com/apache/kafka/pull/9690#issuecomment-743046108


   @mjsax, I spent a few days investigating your failed tests and finally found out why the test sometimes fails here:
   ```
   Did not receive all 148 records from topic multiPartitionOutputTopic within 60000 ms
   Expected: is a value equal to or greater than <148>
        but: <138> was less than <148>
   	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   	at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:597)
   	at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:449)
   	at org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:417)
   	at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:593)
   	at org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:566)
   	at org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationThreeTest.readResult(EosBetaUpgradeIntegrationThreeTest.java:1056)
   	at org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationThreeTest.verifyUncommitted(EosBetaUpgradeIntegrationThreeTest.java:1030)
   	at org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationThreeTest.shouldUpgradeFromEosAlphaToEosBeta(EosBetaUpgradeIntegrationThreeTest.java:619)
   ```
   It's because the set of keys read from the stream store is sometimes empty, which makes the subsequent computations based on that variable wrong.
   Here are the logs I got (they map to the code [here](https://github.com/apache/kafka/pull/9690/files#diff-86a5136ae170df067137442b5eae05fa5fd9d1e02aca85bac8a251b7d2557b0eR477), which is in phase 6):
   ```
   keysFirstClientBeta is []
   keysSecondClientAlpha is [1, 3]
   ```
   
   The variable `newlyCommittedKeys` in phase 5 should also be empty (`[]`), since both are obtained via `keysFromInstance(streams1Beta);`. That's why, at the end of phase 5, `verifyCommitted(expectedCommittedResultAfterRestartFirstClient);` is actually doing `verifyCommitted([])`. I also printed out the log there and can confirm this.
   
   So, in summary, there's no logic error in the code; it just makes a bad assumption: that `keysFromInstance(streams1Beta);` will always return something, when it might actually be empty. I just can't figure out why it's empty. Is the store not ready, or is there some other reason? Do you have any thoughts on this?
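   
   One possible way to guard against the empty result would be to poll until the key set is non-empty before using it. This is only a minimal sketch: `NonEmptyKeysWaiter` and `waitForNonEmptyKeys` are hypothetical names, not existing test utilities, and I'm assuming the key lookup can be wrapped in a `Supplier`. It would hide the symptom rather than explain why the store comes back empty, and it would be wrong if an empty store is ever a legitimate state at that point.
   
   ```java
   import java.time.Duration;
   import java.util.Set;
   import java.util.function.Supplier;
   
   // Hypothetical helper, not part of the Kafka test utilities.
   final class NonEmptyKeysWaiter {
   
       private NonEmptyKeysWaiter() { }
   
       // Calls keySupplier repeatedly until it returns a non-empty set,
       // or throws AssertionError once the timeout has elapsed.
       static <K> Set<K> waitForNonEmptyKeys(final Supplier<Set<K>> keySupplier,
                                             final Duration timeout,
                                             final Duration pollInterval) throws InterruptedException {
           final long deadline = System.nanoTime() + timeout.toNanos();
           while (true) {
               final Set<K> keys = keySupplier.get();
               if (keys != null && !keys.isEmpty()) {
                   return keys;
               }
               // Overflow-safe comparison of nanoTime values.
               if (System.nanoTime() - deadline >= 0) {
                   throw new AssertionError(
                       "Key set still empty after " + timeout.toMillis() + " ms");
               }
               Thread.sleep(pollInterval.toMillis());
           }
       }
   }
   ```
   
   In the test this could be used roughly as `waitForNonEmptyKeys(() -> keysFromInstance(streams1Beta), Duration.ofSeconds(60), Duration.ofMillis(100))`, assuming `keysFromInstance` does not throw a checked exception (otherwise the call would need to be wrapped).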
   
   Thanks.
   
   

