Posted to dev@pulsar.apache.org by Nicolò Boschi <bo...@gmail.com> on 2022/10/17 10:27:23 UTC

Pulsar Flaky test report 2022-10-17 for PR builds in CI

Dear community,

Here's a report of the flaky tests in Pulsar CI during the observation
period of 2022-10-07 - 2022-10-14

https://docs.google.com/spreadsheets/d/165FHpHjs5fHccSsmQM4beeg6brn-zfUjcrXf6xAu4yQ/edit?usp=sharing



The report contains a subset of the test failures.
The flaky tests are observed in builds of merged PRs.
The GitHub Actions logs are checked only for builds where the SHA of the
head of the PR matches the SHA that got merged.
This ensures that all failures found are real flakes: the PR was merged
successfully at that same commit, so no changes were made to the PR to make
the tests pass later.
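
As an illustration only, the selection rule described above could be sketched
in Python as follows. This is not the tooling used to produce the report; the
BuildRun record, the count_flakes function and the sample data are all
hypothetical names made up for this sketch.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class BuildRun:
    """One CI run of a PR (hypothetical record, not a GitHub API type)."""
    pr_number: int
    head_sha: str  # commit the run was executed against
    failed_tests: list = field(default_factory=list)


def count_flakes(runs, merged_head_sha):
    """Count failures per test, using only runs of the commit that got merged.

    merged_head_sha maps a PR number to the head SHA the PR had at merge time.
    A failure on that exact commit is treated as a flake, because the same
    commit was later merged without further changes.
    """
    counts = Counter()
    for run in runs:
        if merged_head_sha.get(run.pr_number) == run.head_sha:
            counts.update(run.failed_tests)
    return counts


# Made-up sample data, for illustration only.
runs = [
    BuildRun(1, "aaa", ["FooTest.testA"]),  # earlier commit, PR changed later: ignored
    BuildRun(1, "bbb", ["FooTest.testB"]),  # failed on the merged commit: counted
    BuildRun(1, "bbb", []),                 # rerun of the merged commit passed
    BuildRun(2, "ccc", ["FooTest.testB", "BarTest.testC"]),
]
print(count_flakes(runs, {1: "bbb", 2: "ccc"}).most_common(2))
# prints [('FooTest.testB', 2), ('BarTest.testC', 1)]
```

Sorting the resulting counts in descending order gives a "top N" style ranking
like the one in this report.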



Top 7 flaky issues to fix:
org.apache.pulsar.client.api.SimpleProducerConsumerTest.rest 87
org.apache.pulsar.client.api.SimpleProducerConsumerTest.testConcurrentConsumerReceiveWhileReconnect 47
org.apache.pulsar.client.api.v1.V1_ProducerConsumerTest.testActiveAndInActiveConsumerEntryCacheBehavior 40
org.apache.pulsar.broker.stats.PrometheusMetricsTest.testDuplicateMetricTypeDefinitions 22
org.apache.pulsar.broker.service.persistent.SimpleProducerConsumerTestStreamingDispatcherTest.rest 14
org.apache.pulsar.metadata.ZKSessionTest.testReacquireLeadershipAfterSessionLost 13
org.apache.pulsar.broker.admin.NamespacesTest.testSplitBundleForMultiTimes 11

The main reason for all the SimpleProducerConsumerTest failures is
https://github.com/apache/pulsar/pull/17887 and this is a possible fix:
https://github.com/apache/pulsar/pull/18068

Markdown formatted summary reports for each test class can be accessed at
https://github.com/nicoloboschi/pulsar-flakes/tree/master/2022-10-07-to-2022-10-14
More flaky test issues:
https://github.com/apache/pulsar/issues?q=flaky+sort%3Aupdated-desc+is%3Aopen+is:issue

We need more help in addressing the flaky tests. Please join the efforts
so that we can get CI to a more stable state.

To coordinate the work,
1) Please search for an existing issue, using "flaky" or the test class
name (without package) as the search term:
https://github.com/apache/pulsar/issues?q=is%3Aopen+flaky+sort%3Aupdated-desc
2) If there isn't an issue for a particular flaky test failure that you'd
like to fix, please create an issue using the "Flaky test" template at
https://github.com/apache/pulsar/issues/new/choose
3) Please comment on the issue to let others know that you are working on it.

We have a few active contributors working on the flaky tests, thanks for
the contributions.

I'm looking forward to more contributors joining the efforts. Please join
the #testing channel on Slack if you'd like to ask questions or get tips on
reproducing flaky tests locally and fixing them.
Sharing stories about fixing flaky tests is also helpful for sharing the
knowledge about how flaky tests can be fixed. That's also a valuable way to
contribute.
Some flaky tests might point to real bugs in production code. Fixing
the flaky test might result in fixing a real production code bug.

More contributions are welcome! Please keep up the good work!

(Also thanks to Lari for all the work done to have these clean reports)

Thanks,
Nicolò Boschi