Posted to dev@zookeeper.apache.org by Enrico Olivelli <eo...@gmail.com> on 2021/02/05 15:24:58 UTC

Java 11 tests are very flaky on GitHub Actions

Hi,
I see that the new test workflow with Java 11 is very flaky.

Here is an example:
https://github.com/apache/zookeeper/pull/1592/checks?check_run_id=1830428694

I would prefer not to treat it as a blocker for merging pull requests.

That said, we should investigate further: the tests that are failing were
not flaky on JDK8.

Thoughts?
Enrico

Re: Java 11 tests are very flaky on GitHub Actions

Posted by Christopher <ct...@apache.org>.
FWIW, the maven-surefire-plugin should be able to retry temporarily
failing tests, but that feature won't work with JUnit5 until the plugin
is updated to a newer version. However, the newer version of the plugin
didn't work with the versions of JUnit that ZK is using. When I tried
to update maven-surefire-plugin, things got worse due to JUnit4/JUnit5
issues that I don't understand.

I have created a PR to trigger the tests to run on JDK8 instead of
JDK11, to demonstrate that the tests are flaky there (or to prove me
wrong), but it doesn't need to be merged, as it's just a test:
https://github.com/apache/zookeeper/pull/1595


Re: Java 11 tests are very flaky on GitHub Actions

Posted by Christopher <ct...@apache.org>.
These tests were flaky on JDK8, too, when I tried them. It's my
understanding that's why they were not being run on Travis previously
(and still aren't). Most of the failures that I see are due to the
"Address already in use" bind error. This may be more likely in a
virtualized environment like GitHub Actions (vs. non-virtualized
Jenkins), but I don't think it is unique to that environment; it may
just be more likely there, for whatever reason.

I have spent quite a bit of time looking into this, and I think the
main cause is that the port reservation code isn't working the way it
should. It binds a ServerSocket to find an available port, then closes
that ServerSocket and returns only the port number. What it should
do instead is return the ServerSocket itself for use in the calling
code, after it has successfully bound, rather than returning an
integer. But that might be a big change, and there might be a simpler
fix.
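
As a rough illustration of the difference described above, here is a
minimal sketch. It is not ZooKeeper's actual port-assignment code; the
class name PortReservationSketch and both method names are hypothetical.
The first method probes for a free port and then releases it, so another
process can take the port before the caller binds it again. The second
keeps the socket bound and hands it to the caller.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical sketch; names are illustrative, not taken from ZooKeeper.
final class PortReservationSketch {

    // Racy pattern: bind to an ephemeral port, close the socket, and
    // return only the port number. The port is free again once this
    // method returns, so a later bind by the caller can fail.
    public static int findFreePortRacy() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    // Alternative: return the bound socket itself, so the port stays
    // reserved until the caller closes the socket.
    public static ServerSocket reserveBoundSocket() throws IOException {
        return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePortRacy();
        System.out.println("probed port (already released): " + port);

        try (ServerSocket held = reserveBoundSocket()) {
            System.out.println("held port: " + held.getLocalPort());
            // While 'held' is open, a second listener on the same
            // address and port is expected to be rejected.
            try (ServerSocket second = new ServerSocket(
                    held.getLocalPort(), 50, InetAddress.getLoopbackAddress())) {
                System.out.println("second bind unexpectedly succeeded");
            } catch (IOException e) {
                System.out.println("second bind rejected: " + e.getMessage());
            }
        }
    }
}
```

The cost of the second approach is the one mentioned above: every caller
that currently takes an int would need to accept a ServerSocket instead.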
