You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cassandra.apache.org by Josh McKenzie <jm...@apache.org> on 2022/10/03 14:23:09 UTC

Re: [DISCUSS] Revising our release criteria, commit guidelines, and the role of circleci vs. ASF CI

Any further revisions or objections to this or are we good to take it to a vote?

On Wed, Sep 28, 2022, at 10:54 AM, Josh McKenzie wrote:
> So revised proposal:
> 
> On Release Lifecycle cwiki page:
>  - Ensure we have parity on jobs run between circle and asf-ci
>  - Allow usage of circleci as gatekeeper for releases. 1 green run -> beta, 3 green runs consecutive -> ga
>  - No new consistent regressions on CI for asf compared to prior branches
>  - Explicitly do not consider ci-cassandra asf flaky tests as release blockers
> 
> Changes to codify into documentation:
>  - On patch before commit, multiplex @500 all new tests, changed tests, or expected to be impacted tests ("expected to be impacted" piece pending multi-class multiplexing support):
>  - Add support for multi-class specification in multiplexer and document
> 
> Add informal project commitment during next major release lifecycle to continue working on bringing asf ci-cassandra up to where it can be formal gatekeeper for release.
> 
> On Wed, Sep 28, 2022, at 10:13 AM, Ekaterina Dimitrova wrote:
>> If we talk blockers nothing more than ensuring we see all tests we want pre-release, IMHO. 
>> The other points sound to me like future important improvements that will help us significantly in the flaky test fight.
>> 
>> On Wed, 28 Sep 2022 at 10:08, Josh McKenzie <jm...@apache.org> wrote:
>>> __
>>> I'm receptive to that but I wouldn't gate our ability to get 4.1 out the door based on circle on that. Honestly probably only need to have the parity of coverage be the blocker for its use in retrospect.
>>> 
>>> On Wed, Sep 28, 2022, at 1:32 AM, Berenguer Blasi wrote:
>>>> I would add an option for generate.sh to detect all changed *Test.java files, that would be handy imo.
>>>> 
>>>> On 28/9/22 4:29, Josh McKenzie wrote:
>>>>> So:
>>>>>  1. 500 iterations on multiplexer
>>>>>  2. Augmenting generate.sh to allow providing multiple class names and generating a single config that'll multiplex all the tests provided
>>>>>  3. Test parity / pre-release config added on circleci (see https://issues.apache.org/jira/browse/CASSANDRA-17930), specifically dtest-large, dtest-offheap, test-large-novnode
>>>>> If we get the above 3, are we at a place where we're good to consider vetting releases on circleci for beta / rc / ga?
>>>>> 
>>>>> On Tue, Sep 27, 2022, at 11:28 AM, Ekaterina Dimitrova wrote:
>>>>>>> “I have plans on modifying the multiplexer to allow specifying a list of classes per test target, so we don't have to needlessly suffer with this”
>>>>>>> 
>>>>>>> 
>>>>>>> That would be great, I was thinking of that the other day too. With that said I’ll be happy to support you in that effort too :-) 
>>>>>> 
>>>>>> On Tue, 27 Sep 2022 at 11:18, Josh McKenzie <jm...@apache.org> wrote:
>>>>>>> 
>>>>>>>> I have plans on modifying the multiplexer to allow specifying a list of classes per test target, so we don't have to needlessly suffer with this
>>>>>>> This sounds integral to us multiplexing tests on large diffs whether we go with circle for releases or not and would be a great addition!
>>>>>>> 
>>>>>>> On Tue, Sep 27, 2022, at 6:19 AM, Andrés de la Peña wrote:
>>>>>>>>> 250 iterations isn't enough; I use 500 as a low water mark.
>>>>>>>> 
>>>>>>>> I agree that 500 iterations would be a reasonable minimum. We have seen flaky unit tests requiring far more iterations, but that's not very common. We could use to 500 iterations as default, and discretionary use a higher limit in tests that are quick and might be prone to concurrency issues. I can change the defaults on CirceCI config file if we agree to a new limit, the current default of 100 iterations is quite arbitrary.
>>>>>>>> 
>>>>>>>> The test multiplexer allows to either run test individual test methods or entire classes. It is quite frequent to see tests methods that pass individually but fail when they are run together with the other tests in the same class. Because of this, I think that we should always run entire classes when repeating new or modified tests. The only exception to this would be Python dtests, which usually are more resource intensive and not so prone to that type of issues.
>>>>>>>> 
>>>>>>>>> For CI on a patch, run the pre-commit suite and also run multiplexer with 250 runs on new, changed, or related tests to ensure not flaky
>>>>>>>> 
>>>>>>>> The multiplexer only allows to run a single test class per push. This is ok for fixing existing flakies (its original purpose), and for most minor changes, but it can be quite inconvenient for testing large patches that add or modify many tests. For example, the patch for CEP-19 directly modifies 31 test classes, which means 31 CircleCI config pushes. This number can be somewhat reduced with some wildcards on the class names, but the process is still quite inconvenient. I guess that other large patches will find the same problem. I have plans on modifying the multiplexer to allow specifying a list of classes per test target, so we don't have to needlessly suffer with this.
>>>>>>>> 
>>>>>>>> On Mon, 26 Sept 2022 at 22:44, Brandon Williams <dr...@gmail.com> wrote:
>>>>>>>>> On Mon, Sep 26, 2022 at 1:31 PM Josh McKenzie <jm...@apache.org> wrote:
>>>>>>>>> >
>>>>>>>>> > 250 iterations isn't enough; I use 500 as a low water mark.
>>>>>>>>> >
>>>>>>>>> > Say more here. I originally had it at 500 but neither Mick nor I knew why and figured we could suss this out on this thread.
>>>>>>>>> 
>>>>>>>>> I've seen flakies that passed with less later exhibit at that point.
>>>>>>>>> 
>>>>>>>>> > This is also assuming that circle and ASF CI run the same tests, which
>>>>>>>>> > is not entirely true.
>>>>>>>>> >
>>>>>>>>> > +1: we need to fix this. My intuition is the path to getting circle-ci in parity on coverage is a shorter path than getting ASF CI to 3 green runs for GA. That consistent w/your perception as well or do you disagree?
>>>>>>>>> 
>>>>>>>>> I agree that bringing parity to the coverage will be the shorter path.
>>>>>>> 
>>>>> 
>>> 
> 

Re: [DISCUSS] Revising our release criteria, commit guidelines, and the role of circleci vs. ASF CI

Posted by Mick Semb Wever <mc...@apache.org>.
> Thanks Andres! Let's sit tight and wait for the release vote for beta-1 to
> settle before we ratify these changes.
>


Reading this, I believe we have a lazy consensus in place and the 4.1-beta1
thread has settled, so we can finalise the release, while this thread will
go to ratification separately.

Re: [DISCUSS] Revising our release criteria, commit guidelines, and the role of circleci vs. ASF CI

Posted by Josh McKenzie <jm...@apache.org>.
Thanks Andres! Let's sit tight and wait for the release vote for beta-1 to settle before we ratify these changes.

On Wed, Oct 5, 2022, at 6:20 AM, Andrés de la Peña wrote:
> The proposal looks good to me. I have created tickets for:
>  - Increasing the default number of repeated test iterations to 500 (CASSANDRA-17937 <https://issues.apache.org/jira/browse/CASSANDRA-17937>, ready to commit)
>  - Automatically detecting and repeating new or modified JUnit tests (CASSANDRA-17939 <https://issues.apache.org/jira/browse/CASSANDRA-17939>, patch available)
>  - Allowing to specify multiple tests in the test multiplexer (CASSANDRA-17938 <https://issues.apache.org/jira/browse/CASSANDRA-17938>, in progress)
> 
> On Mon, 3 Oct 2022 at 15:23, Josh McKenzie <jm...@apache.org> wrote:
>> __
>> Any further revisions or objections to this or are we good to take it to a vote?
>> 
>> On Wed, Sep 28, 2022, at 10:54 AM, Josh McKenzie wrote:
>>> So revised proposal:
>>> 
>>> On Release Lifecycle cwiki page:
>>>  - Ensure we have parity on jobs run between circle and asf-ci
>>>  - Allow usage of circleci as gatekeeper for releases. 1 green run -> beta, 3 green runs consecutive -> ga
>>>  - No new consistent regressions on CI for asf compared to prior branches
>>>  - Explicitly do not consider ci-cassandra asf flaky tests as release blockers
>>> 
>>> Changes to codify into documentation:
>>>  - On patch before commit, multiplex @500 all new tests, changed tests, or expected to be impacted tests ("expected to be impacted" piece pending multi-class multiplexing support):
>>>  - Add support for multi-class specification in multiplexer and document
>>> 
>>> Add informal project commitment during next major release lifecycle to continue working on bringing asf ci-cassandra up to where it can be formal gatekeeper for release.
>>> 
>>> On Wed, Sep 28, 2022, at 10:13 AM, Ekaterina Dimitrova wrote:
>>>> If we talk blockers nothing more than ensuring we see all tests we want pre-release, IMHO. 
>>>> The other points sound to me like future important improvements that will help us significantly in the flaky test fight.
>>>> 
>>>> On Wed, 28 Sep 2022 at 10:08, Josh McKenzie <jm...@apache.org> wrote:
>>>>> __
>>>>> I'm receptive to that but I wouldn't gate our ability to get 4.1 out the door based on circle on that. Honestly probably only need to have the parity of coverage be the blocker for its use in retrospect.
>>>>> 
>>>>> On Wed, Sep 28, 2022, at 1:32 AM, Berenguer Blasi wrote:
>>>>>> I would add an option for generate.sh to detect all changed *Test.java files, that would be handy imo.
>>>>>> 
>>>>>> On 28/9/22 4:29, Josh McKenzie wrote:
>>>>>>> So:
>>>>>>>  1. 500 iterations on multiplexer
>>>>>>>  2. Augmenting generate.sh to allow providing multiple class names and generating a single config that'll multiplex all the tests provided
>>>>>>>  3. Test parity / pre-release config added on circleci (see https://issues.apache.org/jira/browse/CASSANDRA-17930), specifically dtest-large, dtest-offheap, test-large-novnode
>>>>>>> If we get the above 3, are we at a place where we're good to consider vetting releases on circleci for beta / rc / ga?
>>>>>>> 
>>>>>>> On Tue, Sep 27, 2022, at 11:28 AM, Ekaterina Dimitrova wrote:
>>>>>>>>> “I have plans on modifying the multiplexer to allow specifying a list of classes per test target, so we don't have to needlessly suffer with this”
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> That would be great, I was thinking of that the other day too. With that said I’ll be happy to support you in that effort too :-) 
>>>>>>>> 
>>>>>>>> On Tue, 27 Sep 2022 at 11:18, Josh McKenzie <jm...@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>>> I have plans on modifying the multiplexer to allow specifying a list of classes per test target, so we don't have to needlessly suffer with this
>>>>>>>>> This sounds integral to us multiplexing tests on large diffs whether we go with circle for releases or not and would be a great addition!
>>>>>>>>> 
>>>>>>>>> On Tue, Sep 27, 2022, at 6:19 AM, Andrés de la Peña wrote:
>>>>>>>>>>> 250 iterations isn't enough; I use 500 as a low water mark.
>>>>>>>>>> 
>>>>>>>>>> I agree that 500 iterations would be a reasonable minimum. We have seen flaky unit tests requiring far more iterations, but that's not very common. We could use to 500 iterations as default, and discretionary use a higher limit in tests that are quick and might be prone to concurrency issues. I can change the defaults on CirceCI config file if we agree to a new limit, the current default of 100 iterations is quite arbitrary.
>>>>>>>>>> 
>>>>>>>>>> The test multiplexer allows to either run test individual test methods or entire classes. It is quite frequent to see tests methods that pass individually but fail when they are run together with the other tests in the same class. Because of this, I think that we should always run entire classes when repeating new or modified tests. The only exception to this would be Python dtests, which usually are more resource intensive and not so prone to that type of issues.
>>>>>>>>>> 
>>>>>>>>>>> For CI on a patch, run the pre-commit suite and also run multiplexer with 250 runs on new, changed, or related tests to ensure not flaky
>>>>>>>>>> 
>>>>>>>>>> The multiplexer only allows to run a single test class per push. This is ok for fixing existing flakies (its original purpose), and for most minor changes, but it can be quite inconvenient for testing large patches that add or modify many tests. For example, the patch for CEP-19 directly modifies 31 test classes, which means 31 CircleCI config pushes. This number can be somewhat reduced with some wildcards on the class names, but the process is still quite inconvenient. I guess that other large patches will find the same problem. I have plans on modifying the multiplexer to allow specifying a list of classes per test target, so we don't have to needlessly suffer with this.
>>>>>>>>>> 
>>>>>>>>>> On Mon, 26 Sept 2022 at 22:44, Brandon Williams <dr...@gmail.com> wrote:
>>>>>>>>>>> On Mon, Sep 26, 2022 at 1:31 PM Josh McKenzie <jm...@apache.org> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > 250 iterations isn't enough; I use 500 as a low water mark.
>>>>>>>>>>> >
>>>>>>>>>>> > Say more here. I originally had it at 500 but neither Mick nor I knew why and figured we could suss this out on this thread.
>>>>>>>>>>> 
>>>>>>>>>>> I've seen flakies that passed with less later exhibit at that point.
>>>>>>>>>>> 
>>>>>>>>>>> > This is also assuming that circle and ASF CI run the same tests, which
>>>>>>>>>>> > is not entirely true.
>>>>>>>>>>> >
>>>>>>>>>>> > +1: we need to fix this. My intuition is the path to getting circle-ci in parity on coverage is a shorter path than getting ASF CI to 3 green runs for GA. That consistent w/your perception as well or do you disagree?
>>>>>>>>>>> 
>>>>>>>>>>> I agree that bringing parity to the coverage will be the shorter path.
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> 
>> 

Re: [DISCUSS] Revising our release criteria, commit guidelines, and the role of circleci vs. ASF CI

Posted by Andrés de la Peña <a....@gmail.com>.
The proposal looks good to me. I have created tickets for:
 - Increasing the default number of repeated test iterations to 500 (
CASSANDRA-17937 <https://issues.apache.org/jira/browse/CASSANDRA-17937>,
ready to commit)
 - Automatically detecting and repeating new or modified JUnit tests (
CASSANDRA-17939 <https://issues.apache.org/jira/browse/CASSANDRA-17939>,
patch available)
 - Allowing to specify multiple tests in the test multiplexer (
CASSANDRA-17938 <https://issues.apache.org/jira/browse/CASSANDRA-17938>, in
progress)

On Mon, 3 Oct 2022 at 15:23, Josh McKenzie <jm...@apache.org> wrote:

> Any further revisions or objections to this or are we good to take it to a
> vote?
>
> On Wed, Sep 28, 2022, at 10:54 AM, Josh McKenzie wrote:
>
> So revised proposal:
>
> On Release Lifecycle cwiki page:
>  - Ensure we have parity on jobs run between circle and asf-ci
>  - Allow usage of circleci as gatekeeper for releases. 1 green run ->
> beta, 3 green runs consecutive -> ga
>  - No new consistent regressions on CI for asf compared to prior branches
>  - Explicitly do not consider ci-cassandra asf flaky tests as release
> blockers
>
> Changes to codify into documentation:
>  - On patch before commit, multiplex @500 all new tests, changed tests, or
> expected to be impacted tests ("expected to be impacted" piece pending
> multi-class multiplexing support):
>  - Add support for multi-class specification in multiplexer and document
>
> Add informal project commitment during next major release lifecycle to
> continue working on bringing asf ci-cassandra up to where it can be formal
> gatekeeper for release.
>
> On Wed, Sep 28, 2022, at 10:13 AM, Ekaterina Dimitrova wrote:
>
> If we talk blockers nothing more than ensuring we see all tests we want
> pre-release, IMHO.
> The other points sound to me like future important improvements that will
> help us significantly in the flaky test fight.
>
> On Wed, 28 Sep 2022 at 10:08, Josh McKenzie <jm...@apache.org> wrote:
>
>
> I'm receptive to that but I wouldn't gate our ability to get 4.1 out the
> door based on circle on that. Honestly probably only need to have the
> parity of coverage be the blocker for its use in retrospect.
>
> On Wed, Sep 28, 2022, at 1:32 AM, Berenguer Blasi wrote:
>
> I would add an option for generate.sh to detect all changed *Test.java
> files, that would be handy imo.
> On 28/9/22 4:29, Josh McKenzie wrote:
>
> So:
>
>    1. 500 iterations on multiplexer
>    2. Augmenting generate.sh to allow providing multiple class names and
>    generating a single config that'll multiplex all the tests provided
>    3. Test parity / pre-release config added on circleci (see
>    https://issues.apache.org/jira/browse/CASSANDRA-17930),
>    specifically dtest-large, dtest-offheap, test-large-novnode
>
> If we get the above 3, are we at a place where we're good to consider
> vetting releases on circleci for beta / rc / ga?
>
> On Tue, Sep 27, 2022, at 11:28 AM, Ekaterina Dimitrova wrote:
>
> “I have plans on modifying the multiplexer to allow specifying a list of
> classes per test target, so we don't have to needlessly suffer with this”
>
>
> That would be great, I was thinking of that the other day too. With that
> said I’ll be happy to support you in that effort too :-)
>
>
> On Tue, 27 Sep 2022 at 11:18, Josh McKenzie <jm...@apache.org> wrote:
>
>
> I have plans on modifying the multiplexer to allow specifying a list of
> classes per test target, so we don't have to needlessly suffer with this
>
> This sounds integral to us multiplexing tests on large diffs whether we go
> with circle for releases or not and would be a great addition!
>
> On Tue, Sep 27, 2022, at 6:19 AM, Andrés de la Peña wrote:
>
> 250 iterations isn't enough; I use 500 as a low water mark.
>
>
> I agree that 500 iterations would be a reasonable minimum. We have seen
> flaky unit tests requiring far more iterations, but that's not very common.
> We could use to 500 iterations as default, and discretionary use a higher
> limit in tests that are quick and might be prone to concurrency issues. I
> can change the defaults on CirceCI config file if we agree to a new limit,
> the current default of 100 iterations is quite arbitrary.
>
> The test multiplexer allows to either run test individual test methods or
> entire classes. It is quite frequent to see tests methods that pass
> individually but fail when they are run together with the other tests in
> the same class. Because of this, I think that we should always run entire
> classes when repeating new or modified tests. The only exception to this
> would be Python dtests, which usually are more resource intensive and not
> so prone to that type of issues.
>
> For CI on a patch, run the pre-commit suite and also run multiplexer with
> 250 runs on new, changed, or related tests to ensure not flaky
>
>
> The multiplexer only allows to run a single test class per push. This is
> ok for fixing existing flakies (its original purpose), and for most minor
> changes, but it can be quite inconvenient for testing large patches that
> add or modify many tests. For example, the patch for CEP-19 directly
> modifies 31 test classes, which means 31 CircleCI config pushes. This
> number can be somewhat reduced with some wildcards on the class names, but
> the process is still quite inconvenient. I guess that other large patches
> will find the same problem. I have plans on modifying the multiplexer to
> allow specifying a list of classes per test target, so we don't have to
> needlessly suffer with this.
>
> On Mon, 26 Sept 2022 at 22:44, Brandon Williams <dr...@gmail.com> wrote:
>
> On Mon, Sep 26, 2022 at 1:31 PM Josh McKenzie <jm...@apache.org>
> wrote:
> >
> > 250 iterations isn't enough; I use 500 as a low water mark.
> >
> > Say more here. I originally had it at 500 but neither Mick nor I knew
> why and figured we could suss this out on this thread.
>
> I've seen flakies that passed with less later exhibit at that point.
>
> > This is also assuming that circle and ASF CI run the same tests, which
> > is not entirely true.
> >
> > +1: we need to fix this. My intuition is the path to getting circle-ci
> in parity on coverage is a shorter path than getting ASF CI to 3 green runs
> for GA. That consistent w/your perception as well or do you disagree?
>
> I agree that bringing parity to the coverage will be the shorter path.
>
>
>
>
>
>
>