Posted to dev@mesos.apache.org by Benno Evers <be...@mesosphere.com> on 2018/04/03 11:34:51 UTC

Re: Adding a `FLAKY` label to flaky unit tests

Hi,

> 1) What would be the criteria for removing `FLAKY` label from a test? Who
> will take care of removing this label?
The process would be exactly the same as for removing the `DISABLED` label
today, i.e. whoever feels confident that they fixed the test can remove the
label.

> 2) Do we expect that most of our tests will eventually get `FLAKY` label?
I wouldn't expect that. If we get to the point where the majority of tests
are flaky, I assume we will be able to identify some systematic causes of
flakiness that we can fix.

> Would the CI run FLAKY tests or will it filter it out?
That's the beauty of this proposal: every CI operator can decide for
themselves which solution best fits their needs. For us, ideally I would
like to run the flaky tests 10 times and report an error only if a test
failed more than once, but I'm not sure how hard this would be to implement
in Jenkins. If it turns out to be too hard, I would suggest disabling them
so we can get back to a stable, green state and have failures be meaningful
again.
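
A rough sketch of the "run 10 times, report only if it failed more than
once" idea, to give a feeling for the amount of scripting involved. It
assumes the label ends up as a `FLAKY_` part of the test name (like
`DISABLED_`), so that a gtest filter pattern such as `*FLAKY_*` selects the
labeled tests. The function names and the binary path are made up for
illustration; only `--gtest_filter` is an actual gtest flag.

```python
import subprocess
from typing import Callable, List


def gtest_command(binary: str, test_filter: str) -> List[str]:
    """Build the command line for one run of a gtest binary.

    `--gtest_filter` is a standard gtest flag. The patterns passed in
    (e.g. "*FLAKY_*") rely on the assumed `FLAKY_` naming convention.
    """
    return [binary, "--gtest_filter=" + test_filter]


def run_gtest_once(binary: str, test_filter: str) -> bool:
    """Run the selected tests once; True means the run exited with 0."""
    return subprocess.run(gtest_command(binary, test_filter)).returncode == 0


def count_failures(run_once: Callable[[], bool], runs: int = 10) -> int:
    """Call `run_once` `runs` times and count how often it returned False."""
    return sum(0 if run_once() else 1 for _ in range(runs))


def should_report_failure(failures: int, max_failures: int = 1) -> bool:
    """Report an error only if there were more than `max_failures` failures."""
    return failures > max_failures
```

A CI job could then call
`count_failures(lambda: run_gtest_once(binary, "SomeSuite.FLAKY_SomeTest"))`
once per flaky test (test name hypothetical) and feed the result into
`should_report_failure`, while the regular run excludes the labeled tests
with the negative pattern `-*FLAKY_*`.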

> What are the other reasons tests are DISABLED today?
I don't have an exhaustive list, but at least some were disabled as a
result of API changes, with the intention of fixing them later, e.g.
MESOS-8711.

Best regards,

On Thu, Mar 29, 2018 at 9:22 PM, Vinod Kone <vi...@apache.org> wrote:

> Would the CI run FLAKY tests or will it filter it out? I'm assuming it
> still does based on your observation above.
>
> What are the other reasons tests are DISABLED today?
>
> On Thu, Mar 29, 2018 at 10:35 AM, Meng Zhu <mz...@mesosphere.com> wrote:
>
> > +1, the advantages are appealing.
> >
> > Though I am afraid that this will probably reduce the incentive to fix
> > flaky tests.
> >
> > -Meng
> >
> > On Thu, Mar 29, 2018 at 9:45 AM, Benno Evers <be...@mesosphere.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > if you're regularly running Mesos unit tests, e.g. because you've set up a
> > > CI system, you probably noticed that there is a lot of noise in the results
> > > due to flaky tests.
> > >
> > > As a measure to ease the pain, what do you think about adding a `FLAKY`
> > > label to known flaky unit tests, similar to how we have `ROOT`, `INTERNET`,
> > > `DISABLED`, etc. right now?
> > >
> > > The advantages, in my opinion, would be:
> > >  - Looking at test results, it would be immediately visible whether a test
> > > failure was known flaky or not without going to JIRA
> > >  - People who want to reduce noise can disable all known flaky tests by a
> > > simple gtest filter
> > >  - People who want to can still run the flaky tests easier than if they get
> > > disabled outright
> > >  - With a little bit of scripting, it would be possible to add logic like
> > > "for flaky tests, run them 10 times and only report a failure if more than
> > > x% of the runs fail."
> > >
> > > What do you think?
> > >
> > > Best regards,
> > > --
> > > Benno Evers
> > > Software Engineer, Mesosphere
> > >
> >
>



-- 
Benno Evers
Software Engineer, Mesosphere