Posted to dev@cassandra.apache.org by Jacek Lewandowski <le...@gmail.com> on 2023/11/30 09:25:17 UTC

Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Hi,

I'm getting a bit lost - what are the exact differences between those
test scenarios? What are the criteria for qualifying a test to be part of a
certain scenario?

I'm working a bit with the tests and build scripts, and the number of
different configurations for which we have a separate build target is
becoming problematic - I cannot imagine how confusing it is for a new
contributor.

It is not urgent, but we should at least have a plan on how to simplify and
unify things.

I'm in favour of reducing the number of test targets to the minimum - for
different configurations I think we should provide a parameter pointing to
a JVM options file and maybe to a cassandra.yaml. I know that we currently
do some super hacky things with cassandra.yaml for different configs - like
concatenating parts of it. I presume that is not necessary - we can have a
default test config yaml and a directory with overriding yamls; while
building, we could have a tool which loads the default configuration,
applies the override, and saves the resulting yaml somewhere like
build/test/configs. That would allow us to easily use those yamls in the
IDE as well - currently that is impossible.

What do you think?

Thank you, and my apologies for bothering you with lower-priority stuff
while we have a 5.0 release headache...

Jacek

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Mick Semb Wever <mc...@apache.org>.
>
> 1. Since long tests are just unit tests that take a long time to run,
>


Yes, they are just "resource intensive" tests, on par with the "large" python
dtests: they require more machine specs to run.
They are great candidates for improvement so that they don't require the
additional resources, but many offer value as they are and cannot be slimmed down.



> 2. I'm still confused about the distinction between burn and fuzz tests -
> it seems to me that fuzz tests are just modern burn tests - should we
> refactor the existing burn tests to use the new framework?
>


Burn tests are not really tests that belong in the CI pipeline. We only run
them in CI to validate that they still compile and run, so we only need to
run them for an absolute minimum amount of time.  Maybe it would be nice if
they were part of the checks stage instead of being their own test type.



> 4. Yeah, running a complete suite for each artificially crafted
> configuration brings little value compared to the maintenance and
> infrastructure costs. It feels like we are running all tests a bit blindly,
> hoping we catch something accidentally. I agree this is not the purpose of
> the unit tests and should be covered instead by fuzz. For features like
> CDC, compression, different sstable formats, trie memtable, commit log
> compression/encryption, system directory keyspace, etc... we should have
> dedicated tests that verify just that functionality
>


I think everyone agrees here, but… these variations are still catching
failures, and until we have an improvement or replacement we do rely on
them.  I'm not in favour of removing them until we have proof/confidence
that any replacement is catching the same failures.  Especially oa, tries,
vnodes. (Note: tries and offheap are being replaced with "latest", which
will be a valuable simplification.)

Dedicated unit tests may also be parameterised tests, with a base
parameterisation that is extended based on analysis of what a patch touches…
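
For example, something roughly like this with the JUnit 4 parameterisation
we already use (a sketch only; the test and config names are illustrative):

    import java.util.Arrays;
    import java.util.Collection;

    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.junit.runners.Parameterized;
    import org.junit.runners.Parameterized.Parameters;

    @RunWith(Parameterized.class)
    public class MemtableConfigTest
    {
        // the base parameterisation; analysis of what a patch touches could
        // widen this set (e.g. add an offheap variant) before a run
        @Parameters(name = "{0}")
        public static Collection<Object[]> configs()
        {
            return Arrays.asList(new Object[][]{ { "default" }, { "trie" }, { "oa" } });
        }

        private final String config;

        public MemtableConfigTest(String config)
        {
            this.config = config;
        }

        @Test
        public void flushAndRead()
        {
            // load the named yaml and run the feature's assertions against it
        }
    }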

Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Francisco Guerrero <fr...@apache.org>.
+1 thanks for this effort! 


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Alex Petrov <al...@coffeenco.de>.
I did mean the md file, which explains all the internal intricacies. Also, there is a blog post [1].

My plan was to:
 1. Introduce easily copyable samples
 2. Add more Javadoc
 3. Talk to other contributors and collect information about missing pieces / how to make it more accessible
I might ask for help when we have a better understanding of what we want to achieve.

Thank you 
—Alex

[1] https://cassandra.apache.org/_/blog/Harry-an-Open-Source-Fuzz-Testing-and-Verification-Tool-for-Apache-Cassandra.html

On Tue, Jan 2, 2024, at 10:54 PM, Lorina Poland wrote:
> Is there any user-facing documentation (for developers) that should be added? I note that you say there is "extensive documentation"; I presume that you are referring to the README.md in the repo?
> 
> If there is a desire to add documentation to the website, as opposed to the MD files in the repo, please reach out to me.
> 
> Thanks,
> Lorina
> 

Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Lorina Poland <po...@apache.org>.
Is there any user-facing documentation (for developers) that should be added? I note that you say there is "extensive documentation"; I presume that you are referring to the README.md in the repo?

If there is a desire to add documentation to the website, as opposed to the MD files in the repo, please reach out to me.

Thanks,
Lorina


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by German Eichberger via dev <de...@cassandra.apache.org>.
+1
________________________________
From: Patrick McFadin <pm...@gmail.com>
Sent: Friday, December 22, 2023 9:12 AM
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: [EXTERNAL] Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")



Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Patrick McFadin <pm...@gmail.com>.
It was great having some more extended discussions about Harry in person
last week. Anything we can do to make it easier for anyone to test
Cassandra thoroughly is an easy +1 from me!

Thanks for all your efforts so far, Alex.

Patrick


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Jacek Lewandowski <le...@gmail.com>.
Obviously +1

Thank you Alex


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Sumanth Pasupuleti <su...@gmail.com>.
+1, thank you for your efforts in bringing Harry in-tree. Anything that
improves the testing ecosystem for Cassandra, particularly around complex
scenarios / edge cases, goes a long way in improving reliability, and with
a powerful tool like Harry in-tree, it is a lot more accessible to
developers than it has been. Also, thank you for keeping in mind the
onboarding experience of developers.

- Sumanth


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Alex Petrov <al...@coffeenco.de>.
Some follow-up tickets to establish the project direction:

https://issues.apache.org/jira/browse/CASSANDRA-19229

Two other things that we will work on in-tree are:
https://issues.apache.org/jira/browse/CASSANDRA-18275 (model and in-JVM test for partition-restricted 2i queries) 
https://issues.apache.org/jira/browse/CASSANDRA-18667 (multi-threaded SAI read and write fuzz test)

If you would like to get your recently added feature tested with Harry model, please let me know!


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Joseph Lynch <jo...@gmail.com>.
+1

Sounds like a great change that will help us unify around a common testing
paradigm, and even pave the path to in-tree load testing plus integrated
correctness checking, which would be extremely valuable!

-Joey


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Caleb Rackliffe <ca...@gmail.com>.
+1

Agree w/ all the justifications mentioned above.

As a reviewer on CASSANDRA-19210
<https://issues.apache.org/jira/browse/CASSANDRA-19210>, my goals were to
a.) look at the directory, naming, and package structure of the ported
code, b.) make sure IDE integration was working, and c.) make sure any
modifications to existing code (rather than direct code movements from
cassandra-harry) were straightforward.


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Ariel Weisberg <ar...@weisberg.ws>.
🥳🎉

Thanks for your work on this. Excited to have an easier way to write tests that leverage generated schema and data, and that also cover more.

Ariel

Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Alex Petrov <al...@coffeenco.de>.
Thanks everyone, Harry is now in-tree! Of course, that's just a small milestone; I hope it'll prove as useful as I expect it to be.

https://github.com/apache/cassandra/commit/439d1b122af334bf68c159b82ef4e4879c210bd5

Happy holidays!
--Alex


Re: Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Mick Semb Wever <mc...@apache.org>.
> I strongly believe that bringing Harry in-tree will help to lower the
> barrier for fuzz testing and simplify co-development of Cassandra and Harry.
> Previously, it has been rather difficult to debug edge cases because I had
> to either re-compile an in-jvm dtest jar and bring it to Harry, or
> re-compile a Harry jar and bring it to Cassandra, which is both tedious and
> time-consuming. Moreover, I believe we have missed at the very least one RT
> regression [2] because Harry was not in-tree, as its tests would've caught
> the issue even with the model that existed.
>
> For other recently found issues, I think having Harry in-tree would have
> substantially lowered the turnaround time, and allowed me to share repros
> with developers of the corresponding features much quicker.
>


Agree, looking forward to getting to know and writing Harry tests.  Thank
you Alex, happy holidays :)

Harry in-tree (Forked from "Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?")

Posted by Alex Petrov <al...@coffeenco.de>.
Hey folks,

I am mostly done with a patch that brings Harry in-tree [1]. I will trigger one more CI run overnight, and my intention was to merge it some time soon, but I wanted to give a fair warning here, since this is a relatively large patch. 

The good news for everyone is that it:
  a) touches no production code whatsoever - only test code (namely in-jvm dtest) that was already using Harry;
  b) the only tests that are changed are the ones that used a duplicate version of the placement simulator we had both for testing TCM and in Harry;
  c) in addition, I have converted 3 existing TCM tests to a new API to have some base for examples/usage.

Since we have effectively been relying on this code for a while now, and the intention now is to converge to:
  a) fewer different generators, with a shareable version of the generators for everyone to use across the codebase;
  b) a testing tool that can be useful for both trivial cases and complex scenarios;
myself and many other Cassandra contributors have expressed the opinion that bringing Harry in-tree will be highly beneficial.

I strongly believe that bringing Harry in-tree will help to lower the barrier for fuzz testing and simplify co-development of Cassandra and Harry. Previously, it has been rather difficult to debug edge cases because I had to either re-compile an in-jvm dtest jar and bring it to Harry, or re-compile a Harry jar and bring it to Cassandra, which is both tedious and time-consuming. Moreover, I believe we have missed at the very least one RT regression [2] because Harry was not in-tree, as its tests would've caught the issue even with the model that existed.

For other recently found issues, I think having Harry in-tree would have substantially lowered the turnaround time, and allowed me to share repros with developers of the corresponding features much quicker.

I do expect a slight learning curve for Harry, but my intention is to build a web of simple tests (I worked on some of them yesterday, after a conversation with David), which can follow the in-jvm-dtest pattern of find-similar-test / copy / modify. There's already copious documentation, so I do not believe a lack of docs for Harry was ever an issue.

You all are aware of my dedication to testing and quality of Apache Cassandra, and I hope you also see the benefits of having a model checker in-tree.

Thank you and happy upcoming holidays,
--Alex

[1] https://issues.apache.org/jira/browse/CASSANDRA-19210
[2] https://issues.apache.org/jira/browse/CASSANDRA-18932

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Josh McKenzie <jm...@apache.org>.
> One thing where this “could” come into play is that we currently run with different configs at the CI level and we might be able to make this happen at the class or method level instead…
It'd be great to be able to declaratively indicate which configurations a test needs to exercise, and just have one CI run that includes them as appropriate.

On Mon, Dec 18, 2023, at 7:22 PM, David Capwell wrote:
>> A brief perusal shows jqwik as integrated with JUnit 5 taking a fairly interesting annotation-based approach to property testing. Curious if you've looked into or used that at all David (Capwell)? (link for the lazy: https://jqwik.net/docs/current/user-guide.html#detailed-table-of-contents).
> 
> I have not, no.  Looking at your link, it moves from lambdas to annotations, and tries to define an API for stateful… I am neutral to that, as it's mostly style… One thing to call out is that the project documents that it tries to “shrink”… we ended up disabling this in QuickTheories, as shrinking doesn’t work well for many of our tests (too high a resource demand, and unable to actually shrink once you move past trivial generators).  Looking at their docs and their code, it's hard for me to see how we would actually create C* generators… there's so much class-gen magic that I really don’t see how to create an AbstractType or TableMetadata… the only example they gave was not random data but hand-crafted data…
> 
>> moving to JUnit 5
> 
> I am a fan of this.  If we add dependencies and don’t keep up to date with them, it becomes painful over time (missing features, lack of support, etc).
> 
>> First of all - when you want to have a parameterized test case you do not have to make the whole test class parameterized - it is per test case. Also, each method can have different parameters.
> 
> I strongly prefer this, but never had it as a blocker from me doing param tests… One thing where this “could” come into play is that we currently run with different configs at the CI level and we might be able to make this happen at the class or method level instead…
> 
> @ServerConfigs(all) // can exclude unsupported configs
> public class InsertTest
> 
> It bothers me deeply that we run tests that don’t touch the configs we use in CI, causing us to waste resources… Can we solve this in junit4 param logic… no clue… 
> 
>> On Dec 15, 2023, at 6:52 PM, Josh McKenzie <jm...@apache.org> wrote:
>> 
>>> First of all - when you want to have a parameterized test case you do not have to make the whole test class parameterized - it is per test case. Also, each method can have different parameters.
>> This is a pretty compelling improvement to me, having just had to use the somewhat painful and blunt instrument of our current framework's parameterization; it's pretty clunky and broad.
>> 
>> It also looks like they moved to a "test engine abstracted away from test identification" approach to their architecture in 5, w/the "vintage" model providing native, unchanged backwards-compatibility w/junit 4. Assuming they didn't bork up their architecture, that *should* lower the risk of the framework change leading to disruption or failure (famous last words...).
>> 
>> A brief perusal shows jqwik as integrated with JUnit 5 taking a fairly interesting annotation-based approach to property testing. Curious if you've looked into or used that at all David (Capwell)? (link for the lazy: https://jqwik.net/docs/current/user-guide.html#detailed-table-of-contents).
>> 
>> On Tue, Dec 12, 2023, at 11:39 AM, Jacek Lewandowski wrote:
>>> First of all - when you want to have a parameterized test case you do not have to make the whole test class parameterized - it is per test case. Also, each method can have different parameters.
>>> 
>>> For the extensions - we can have extensions which provide Cassandra configuration, extensions which provide a running cluster, and others. We could, for example, apply some extensions to all test classes externally, without touching those classes - something like logging the beginning and end of each test case.
>>> 
>>> 
>>> 
>>>> Tue, 12 Dec 2023 at 12:07 Benedict <be...@apache.org> wrote:
>>>> 
>>>> Could you give (or link to) some examples of how this would actually benefit our test suites?
>>>> 
>>>> 
>>>>> On 12 Dec 2023, at 10:51, Jacek Lewandowski <le...@gmail.com> wrote:
>>>>> 
>>>>> I have two major pros for JUnit 5:
>>>>> - much better support for parameterized tests
>>>>> - global test hooks (automatically detectable extensions) + multi-inheritance
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> Mon, 11 Dec 2023 at 13:38 Benedict <be...@apache.org> wrote:
>>>>>> 
>>>>>> Why do we want to move to JUnit 5? 
>>>>>> 
>>>>>> I’m generally opposed to churn unless well justified, which it may be - just not immediately obvious to me.
>>>>>> 
>>>>>> 
>>>>>>> On 11 Dec 2023, at 08:33, Jacek Lewandowski <le...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Nobody has referred so far to the idea of moving to JUnit 5 - what are the opinions?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Sun, 10 Dec 2023 at 11:03 Benedict <be...@apache.org> wrote:
>>>>>>>> 
>>>>>>>> Alex’s suggestion was that we meta-randomise, i.e. we randomise the config parameters to gain better rather than lesser coverage overall. This means we cover these specific configs and more - just not necessarily on any single commit.
>>>>>>>> 
>>>>>>>> I strongly endorse this approach over the status quo.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 8 Dec 2023, at 13:26, Mick Semb Wever <mc...@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>>  
>>>>>>>>>  
>>>>>>>>>  
>>>>>>>>>> 
>>>>>>>>>>> I think everyone agrees here, but…. these variations are still catching failures, and until we have an improvement or replacement we do rely on them.   I'm not in favour of removing them until we have proof /confidence that any replacement is catching the same failures.  Especially oa, tries, vnodes. (Not tries and offheap is being replaced with "latest", which will be valuable simplification.)  
>>>>>>>>>> 
>>>>>>>>>> What kind of proof do you expect? I cannot imagine how we could prove that because the ability of detecting failures results from the randomness of those tests. That's why when such a test fail you usually cannot reproduce that easily.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Unit tests that fail consistently but only on one configuration, should not be removed/replaced until the replacement also catches the failure.
>>>>>>>>>  
>>>>>>>>>> We could extrapolate that to - why we only have those configurations? why don't test trie / oa + compression, or CDC, or system memtable? 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Because, along the way, people have decided a certain configuration deserves additional testing and it has been done this way in lieu of any other more efficient approach.

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by David Capwell <dc...@apple.com>.
> A brief perusal shows jqwik as integrated with JUnit 5 taking a fairly interesting annotation-based approach to property testing. Curious if you've looked into or used that at all David (Capwell)? (link for the lazy: https://jqwik.net/docs/current/user-guide.html#detailed-table-of-contents).

I have not, no.  Looking at your link, it moves from lambdas to annotations, and tries to define an API for stateful testing… I am neutral to that as it’s mostly style…. One thing to call out is that the project documents that it tries to “shrink”… we ended up disabling this in QuickTheories, as shrinking doesn’t work well for many of our tests (too high resource demand, and unable to actually shrink once you move past trivial generators).  Looking at their docs and their code, it’s hard for me to see how we actually create C* generators… there is so much class gen magic that I really don’t see how to create AbstractType or TableMetadata… the only example they gave was not random data but hand crafted data… 

> moving to JUnit 5

I am a fan of this.  If we add dependencies and don’t keep up to date with them, it becomes painful over time (missing features, lack of support, etc).  

> First of all - when you want to have a parameterized test case you do not have to make the whole test class parameterized - it is per test case. Also, each method can have different parameters.

I strongly prefer this, but never had it as a blocker for me doing param tests…. One thing where this “could” come into play is that we currently run with different configs at the CI level and we might be able to make this happen at the class or method level instead.

@ServerConfigs(all) // can exclude unsupported configs
public class InsertTest

It bothers me deeply that we run tests that don’t touch the configs we use in CI, causing us to waste resources… Can we solve this in junit4 param logic… no clue… 
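
For illustration, one way this could look with JUnit 5 extensions: @TestTemplate plus a TestTemplateInvocationContextProvider is the Jupiter mechanism for running one method under several invocations. The ServerConfigProvider and the config names below are hypothetical, not existing Cassandra code - just a sketch of the shape:

import java.util.List;
import java.util.stream.Stream;

import org.junit.jupiter.api.TestTemplate;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.api.extension.ExtensionContext;
import org.junit.jupiter.api.extension.TestTemplateInvocationContext;
import org.junit.jupiter.api.extension.TestTemplateInvocationContextProvider;

// Hypothetical provider: runs every @TestTemplate method once per server config.
class ServerConfigProvider implements TestTemplateInvocationContextProvider
{
    private static final List<String> CONFIGS = List.of("default", "trie", "oa", "vnodes");

    @Override
    public boolean supportsTestTemplate(ExtensionContext context)
    {
        return true;
    }

    @Override
    public Stream<TestTemplateInvocationContext> provideTestTemplateInvocationContexts(ExtensionContext context)
    {
        return CONFIGS.stream().map(config -> new TestTemplateInvocationContext()
        {
            @Override
            public String getDisplayName(int invocationIndex)
            {
                return "config=" + config;
            }
        });
    }
}

@ExtendWith(ServerConfigProvider.class)
class InsertTest
{
    @TestTemplate
    void insert() // executed once per config above, each reported separately
    {
    }
}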

> On Dec 15, 2023, at 6:52 PM, Josh McKenzie <jm...@apache.org> wrote:
> 
>> First of all - when you want to have a parameterized test case you do not have to make the whole test class parameterized - it is per test case. Also, each method can have different parameters.
> This is a pretty compelling improvement to me having just had to use the somewhat painful and blunt instrument of our current framework's parameterization; pretty clunky and broad.
> 
> It also looks like they moved to a "test engine abstracted away from test identification" approach to their architecture in 5 w/the "vintage" model providing native unchanged backwards-compatibility w/junit 4. Assuming they didn't bork up their architecture that should lower risk of the framework change leading to disruption or failure (famous last words...).
> 
> A brief perusal shows jqwik as integrated with JUnit 5 taking a fairly interesting annotation-based approach to property testing. Curious if you've looked into or used that at all David (Capwell)? (link for the lazy: https://jqwik.net/docs/current/user-guide.html#detailed-table-of-contents).
> 
> On Tue, Dec 12, 2023, at 11:39 AM, Jacek Lewandowski wrote:
>> First of all - when you want to have a parameterized test case you do not have to make the whole test class parameterized - it is per test case. Also, each method can have different parameters.
>> 
>> For the extensions - we can have extensions which provide Cassandra configuration, extensions which provide a running cluster and others. We could for example apply some extensions to all test classes externally without touching those classes, something like logging the begin and end of each test case. 
>> 
>> 
>> 
>> wt., 12 gru 2023 o 12:07 Benedict <benedict@apache.org <ma...@apache.org>> napisał(a):
>> 
>> Could you give (or link to) some examples of how this would actually benefit our test suites?
>> 
>> 
>>> On 12 Dec 2023, at 10:51, Jacek Lewandowski <lewandowski.jacek@gmail.com <ma...@gmail.com>> wrote:
>>> 
>>> I have two major pros for JUnit 5:
>>> - much better support for parameterized tests
>>> - global test hooks (automatically detectable extensions) + multi-inheritance
>>> 
>>> 
>>> 
>>> 
>>> pon., 11 gru 2023 o 13:38 Benedict <benedict@apache.org <ma...@apache.org>> napisał(a):
>>> 
>>> Why do we want to move to JUnit 5? 
>>> 
>>> I’m generally opposed to churn unless well justified, which it may be - just not immediately obvious to me.
>>> 
>>> 
>>>> On 11 Dec 2023, at 08:33, Jacek Lewandowski <lewandowski.jacek@gmail.com <ma...@gmail.com>> wrote:
>>>> 
>>>> Nobody referred so far to the idea of moving to JUnit 5, what are the opinions?
>>>> 
>>>> 
>>>> 
>>>> niedz., 10 gru 2023 o 11:03 Benedict <benedict@apache.org <ma...@apache.org>> napisał(a):
>>>> 
>>>> Alex’s suggestion was that we meta randomise, ie we randomise the config parameters to gain better rather than lesser coverage overall. This means we cover these specific configs and more - just not necessarily on any single commit.
>>>> 
>>>> I strongly endorse this approach over the status quo.
>>>> 
>>>> 
>>>>> On 8 Dec 2023, at 13:26, Mick Semb Wever <mck@apache.org <ma...@apache.org>> wrote:
>>>>> 
>>>>>  
>>>>>  
>>>>>  
>>>>> 
>>>>> I think everyone agrees here, but…. these variations are still catching failures, and until we have an improvement or replacement we do rely on them.   I'm not in favour of removing them until we have proof /confidence that any replacement is catching the same failures.  Especially oa, tries, vnodes. (Not tries and offheap is being replaced with "latest", which will be valuable simplification.)  
>>>>> 
>>>>> What kind of proof do you expect? I cannot imagine how we could prove that because the ability of detecting failures results from the randomness of those tests. That's why when such a test fail you usually cannot reproduce that easily.
>>>>> 
>>>>> 
>>>>> Unit tests that fail consistently but only on one configuration, should not be removed/replaced until the replacement also catches the failure.
>>>>>  
>>>>> We could extrapolate that to - why we only have those configurations? why don't test trie / oa + compression, or CDC, or system memtable? 
>>>>> 
>>>>> 
>>>>> Because, along the way, people have decided a certain configuration deserves additional testing and it has been done this way in lieu of any other more efficient approach.


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Josh McKenzie <jm...@apache.org>.
> First of all - when you want to have a parameterized test case you do not have to make the whole test class parameterized - it is per test case. Also, each method can have different parameters.
This is a pretty compelling improvement to me having just had to use the somewhat painful and blunt instrument of our current framework's parameterization; pretty clunky and broad.

It also looks like they moved to a "test engine abstracted away from test identification" approach to their architecture in 5 w/the "vintage" model providing native unchanged backwards-compatibility w/junit 4. Assuming they didn't bork up their architecture that *should* lower risk of the framework change leading to disruption or failure (famous last words...).

A brief perusal shows jqwik as integrated with JUnit 5 taking a fairly interesting annotation-based approach to property testing. Curious if you've looked into or used that at all David (Capwell)? (link for the lazy: https://jqwik.net/docs/current/user-guide.html#detailed-table-of-contents).
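
For flavor, a minimal jqwik property - a generic example, not tied to any Cassandra type:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import net.jqwik.api.ForAll;
import net.jqwik.api.Property;

class ReverseProperties
{
    // jqwik generates many random lists per run (seeded, so failures can be
    // replayed); reversing twice must give back the original list.
    @Property
    boolean reversingTwiceYieldsOriginal(@ForAll List<Integer> original)
    {
        List<Integer> copy = new ArrayList<>(original);
        Collections.reverse(copy);
        Collections.reverse(copy);
        return copy.equals(original);
    }
}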

On Tue, Dec 12, 2023, at 11:39 AM, Jacek Lewandowski wrote:
> First of all - when you want to have a parameterized test case you do not have to make the whole test class parameterized - it is per test case. Also, each method can have different parameters.
> 
> For the extensions - we can have extensions which provide Cassandra configuration, extensions which provide a running cluster and others. We could for example apply some extensions to all test classes externally without touching those classes, something like logging the begin and end of each test case. 
> 
> 
> 
> wt., 12 gru 2023 o 12:07 Benedict <be...@apache.org> napisał(a):
>> 
>> Could you give (or link to) some examples of how this would actually benefit our test suites?
>> 
>> 
>>> On 12 Dec 2023, at 10:51, Jacek Lewandowski <le...@gmail.com> wrote:
>>> 
>>> I have two major pros for JUnit 5:
>>> - much better support for parameterized tests
>>> - global test hooks (automatically detectable extensions) + multi-inheritance
>>> 
>>> 
>>> 
>>> 
>>> pon., 11 gru 2023 o 13:38 Benedict <be...@apache.org> napisał(a):
>>>> 
>>>> Why do we want to move to JUnit 5? 
>>>> 
>>>> I’m generally opposed to churn unless well justified, which it may be - just not immediately obvious to me.
>>>> 
>>>> 
>>>>> On 11 Dec 2023, at 08:33, Jacek Lewandowski <le...@gmail.com> wrote:
>>>>> 
>>>>> Nobody referred so far to the idea of moving to JUnit 5, what are the opinions?
>>>>> 
>>>>> 
>>>>> 
>>>>> niedz., 10 gru 2023 o 11:03 Benedict <be...@apache.org> napisał(a):
>>>>>> 
>>>>>> Alex’s suggestion was that we meta randomise, ie we randomise the config parameters to gain better rather than lesser coverage overall. This means we cover these specific configs and more - just not necessarily on any single commit.
>>>>>> 
>>>>>> I strongly endorse this approach over the status quo.
>>>>>> 
>>>>>> 
>>>>>>> On 8 Dec 2023, at 13:26, Mick Semb Wever <mc...@apache.org> wrote:
>>>>>>> 
>>>>>>>  
>>>>>>>  
>>>>>>>  
>>>>>>>> 
>>>>>>>>> I think everyone agrees here, but…. these variations are still catching failures, and until we have an improvement or replacement we do rely on them.   I'm not in favour of removing them until we have proof /confidence that any replacement is catching the same failures.  Especially oa, tries, vnodes. (Not tries and offheap is being replaced with "latest", which will be valuable simplification.)  
>>>>>>>> 
>>>>>>>> What kind of proof do you expect? I cannot imagine how we could prove that because the ability of detecting failures results from the randomness of those tests. That's why when such a test fail you usually cannot reproduce that easily.
>>>>>>> 
>>>>>>> 
>>>>>>> Unit tests that fail consistently but only on one configuration, should not be removed/replaced until the replacement also catches the failure.
>>>>>>>  
>>>>>>>> We could extrapolate that to - why we only have those configurations? why don't test trie / oa + compression, or CDC, or system memtable? 
>>>>>>> 
>>>>>>> 
>>>>>>> Because, along the way, people have decided a certain configuration deserves additional testing and it has been done this way in lieu of any other more efficient approach.
>>>>>>> 
>>>>>>> 
>>>>>>> 

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Jacek Lewandowski <le...@gmail.com>.
First of all - when you want to have a parameterized test case you do not
have to make the whole test class parameterized - it is per test case.
Also, each method can have different parameters.
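
For example, with JUnit 5 (a minimal sketch; the test content is arbitrary):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;
import org.junit.jupiter.params.provider.ValueSource;

class PerMethodParamsTest
{
    // This method declares its own parameters; the class itself is not parameterized.
    @ParameterizedTest
    @ValueSource(ints = { 1, 16, 256 })
    void acceptsAnyTokenCount(int numTokens)
    {
        assertEquals(numTokens, Math.max(1, numTokens));
    }

    // A sibling method can use a completely different parameter source.
    @ParameterizedTest
    @CsvSource({ "abc, 3", "de, 2" })
    void lengthMatches(String input, int expectedLength)
    {
        assertEquals(expectedLength, input.length());
    }
}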

For the extensions - we can have extensions which provide Cassandra
configuration, extensions which provide a running cluster and others. We
could for example apply some extensions to all test classes externally
without touching those classes, something like logging the begin and end of
each test case.
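
A sketch of such an externally applied extension, assuming JUnit 5's ServiceLoader-based auto-detection (which must be enabled explicitly):

import org.junit.jupiter.api.extension.AfterEachCallback;
import org.junit.jupiter.api.extension.BeforeEachCallback;
import org.junit.jupiter.api.extension.ExtensionContext;

// Registered via META-INF/services/org.junit.jupiter.api.extension.Extension
// and enabled with -Djunit.jupiter.extensions.autodetection.enabled=true,
// so it applies to every test class without modifying any of them.
public class TestCaseLogger implements BeforeEachCallback, AfterEachCallback
{
    @Override
    public void beforeEach(ExtensionContext context)
    {
        System.out.println("BEGIN " + context.getDisplayName());
    }

    @Override
    public void afterEach(ExtensionContext context)
    {
        System.out.println("END " + context.getDisplayName());
    }
}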



wt., 12 gru 2023 o 12:07 Benedict <be...@apache.org> napisał(a):

> Could you give (or link to) some examples of how this would actually
> benefit our test suites?
>
> On 12 Dec 2023, at 10:51, Jacek Lewandowski <le...@gmail.com>
> wrote:
>
> 
> I have two major pros for JUnit 5:
> - much better support for parameterized tests
> - global test hooks (automatically detectable extensions) +
> multi-inheritance
>
>
>
>
> pon., 11 gru 2023 o 13:38 Benedict <be...@apache.org> napisał(a):
>
>> Why do we want to move to JUnit 5?
>>
>> I’m generally opposed to churn unless well justified, which it may be -
>> just not immediately obvious to me.
>>
>> On 11 Dec 2023, at 08:33, Jacek Lewandowski <le...@gmail.com>
>> wrote:
>>
>> 
>> Nobody referred so far to the idea of moving to JUnit 5, what are the
>> opinions?
>>
>>
>>
>> niedz., 10 gru 2023 o 11:03 Benedict <be...@apache.org> napisał(a):
>>
>>> Alex’s suggestion was that we meta randomise, ie we randomise the config
>>> parameters to gain better rather than lesser coverage overall. This means
>>> we cover these specific configs and more - just not necessarily on any
>>> single commit.
>>>
>>> I strongly endorse this approach over the status quo.
>>>
>>> On 8 Dec 2023, at 13:26, Mick Semb Wever <mc...@apache.org> wrote:
>>>
>>> 
>>>
>>>
>>>
>>>>
>>>> I think everyone agrees here, but…. these variations are still
>>>>> catching failures, and until we have an improvement or replacement we
>>>>> do rely on them.   I'm not in favour of removing them until we have
>>>>> proof /confidence that any replacement is catching the same failures.
>>>>> Especially oa, tries, vnodes. (Not tries and offheap is being
>>>>> replaced with "latest", which will be valuable simplification.)
>>>>
>>>>
>>>> What kind of proof do you expect? I cannot imagine how we could prove
>>>> that because the ability of detecting failures results from the randomness
>>>> of those tests. That's why when such a test fail you usually cannot
>>>> reproduce that easily.
>>>>
>>>
>>>
>>> Unit tests that fail consistently but only on one configuration, should
>>> not be removed/replaced until the replacement also catches the failure.
>>>
>>>
>>>
>>>> We could extrapolate that to - why we only have those configurations?
>>>> why don't test trie / oa + compression, or CDC, or system memtable?
>>>>
>>>
>>>
>>> Because, along the way, people have decided a certain configuration
>>> deserves additional testing and it has been done this way in lieu of any
>>> other more efficient approach.
>>>
>>>
>>>

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
Could you give (or link to) some examples of how this would actually benefit
our test suites?

  

> On 12 Dec 2023, at 10:51, Jacek Lewandowski <le...@gmail.com> wrote:
> 
> I have two major pros for JUnit 5:
> - much better support for parameterized tests
> - global test hooks (automatically detectable extensions) + multi-inheritance
> 
> pon., 11 gru 2023 o 13:38 Benedict <be...@apache.org> napisał(a):
>> 
>> Why do we want to move to JUnit 5?
>> 
>> I’m generally opposed to churn unless well justified, which it may be - just not immediately obvious to me.
>> 
>>> On 11 Dec 2023, at 08:33, Jacek Lewandowski <le...@gmail.com> wrote:
>>> 
>>> Nobody referred so far to the idea of moving to JUnit 5, what are the opinions?
>>> 
>>> niedz., 10 gru 2023 o 11:03 Benedict <be...@apache.org> napisał(a):
>>>> 
>>>> Alex’s suggestion was that we meta randomise, ie we randomise the config parameters to gain better rather than lesser coverage overall. This means we cover these specific configs and more - just not necessarily on any single commit.
>>>> 
>>>> I strongly endorse this approach over the status quo.
>>>> 
>>>>> On 8 Dec 2023, at 13:26, Mick Semb Wever <mc...@apache.org> wrote:
>>>>> 
>>>>>>> I think everyone agrees here, but…. these variations are still catching failures, and until we have an improvement or replacement we do rely on them.   I'm not in favour of removing them until we have proof /confidence that any replacement is catching the same failures.  Especially oa, tries, vnodes. (Not tries and offheap is being replaced with "latest", which will be valuable simplification.)
>>>>>> 
>>>>>> What kind of proof do you expect? I cannot imagine how we could prove that because the ability of detecting failures results from the randomness of those tests. That's why when such a test fail you usually cannot reproduce that easily.
>>>>> 
>>>>> Unit tests that fail consistently but only on one configuration, should not be removed/replaced until the replacement also catches the failure.
>>>>> 
>>>>>> We could extrapolate that to - why we only have those configurations? why don't test trie / oa + compression, or CDC, or system memtable?
>>>>> 
>>>>> Because, along the way, people have decided a certain configuration deserves additional testing and it has been done this way in lieu of any other more efficient approach.


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Jacek Lewandowski <le...@gmail.com>.
I have two major pros for JUnit 5:
- much better support for parameterized tests
- global test hooks (automatically detectable extensions) +
multi-inheritance




pon., 11 gru 2023 o 13:38 Benedict <be...@apache.org> napisał(a):

> Why do we want to move to JUnit 5?
>
> I’m generally opposed to churn unless well justified, which it may be -
> just not immediately obvious to me.
>
> On 11 Dec 2023, at 08:33, Jacek Lewandowski <le...@gmail.com>
> wrote:
>
> 
> Nobody referred so far to the idea of moving to JUnit 5, what are the
> opinions?
>
>
>
> niedz., 10 gru 2023 o 11:03 Benedict <be...@apache.org> napisał(a):
>
>> Alex’s suggestion was that we meta randomise, ie we randomise the config
>> parameters to gain better rather than lesser coverage overall. This means
>> we cover these specific configs and more - just not necessarily on any
>> single commit.
>>
>> I strongly endorse this approach over the status quo.
>>
>> On 8 Dec 2023, at 13:26, Mick Semb Wever <mc...@apache.org> wrote:
>>
>> 
>>
>>
>>
>>>
>>> I think everyone agrees here, but…. these variations are still catching
>>>> failures, and until we have an improvement or replacement we do rely
>>>> on them.   I'm not in favour of removing them until we have proof
>>>> /confidence that any replacement is catching the same failures.  Especially
>>>> oa, tries, vnodes. (Not tries and offheap is being replaced with
>>>> "latest", which will be valuable simplification.)
>>>
>>>
>>> What kind of proof do you expect? I cannot imagine how we could prove
>>> that because the ability of detecting failures results from the randomness
>>> of those tests. That's why when such a test fail you usually cannot
>>> reproduce that easily.
>>>
>>
>>
>> Unit tests that fail consistently but only on one configuration, should
>> not be removed/replaced until the replacement also catches the failure.
>>
>>
>>
>>> We could extrapolate that to - why we only have those configurations?
>>> why don't test trie / oa + compression, or CDC, or system memtable?
>>>
>>
>>
>> Because, along the way, people have decided a certain configuration
>> deserves additional testing and it has been done this way in lieu of any
>> other more efficient approach.
>>
>>
>>

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
Why do we want to move to JUnit 5?

  

I’m generally opposed to churn unless well justified, which it may be - just
not immediately obvious to me.

  

> On 11 Dec 2023, at 08:33, Jacek Lewandowski <le...@gmail.com> wrote:
> 
> Nobody referred so far to the idea of moving to JUnit 5, what are the opinions?
> 
> niedz., 10 gru 2023 o 11:03 Benedict <be...@apache.org> napisał(a):
>> 
>> Alex’s suggestion was that we meta randomise, ie we randomise the config parameters to gain better rather than lesser coverage overall. This means we cover these specific configs and more - just not necessarily on any single commit.
>> 
>> I strongly endorse this approach over the status quo.
>> 
>>> On 8 Dec 2023, at 13:26, Mick Semb Wever <mc...@apache.org> wrote:
>>> 
>>>>> I think everyone agrees here, but…. these variations are still catching failures, and until we have an improvement or replacement we do rely on them.   I'm not in favour of removing them until we have proof /confidence that any replacement is catching the same failures.  Especially oa, tries, vnodes. (Not tries and offheap is being replaced with "latest", which will be valuable simplification.)
>>>> 
>>>> What kind of proof do you expect? I cannot imagine how we could prove that because the ability of detecting failures results from the randomness of those tests. That's why when such a test fail you usually cannot reproduce that easily.
>>> 
>>> Unit tests that fail consistently but only on one configuration, should not be removed/replaced until the replacement also catches the failure.
>>> 
>>>> We could extrapolate that to - why we only have those configurations? why don't test trie / oa + compression, or CDC, or system memtable?
>>> 
>>> Because, along the way, people have decided a certain configuration deserves additional testing and it has been done this way in lieu of any other more efficient approach.


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Jacek Lewandowski <le...@gmail.com>.
Nobody referred so far to the idea of moving to JUnit 5, what are the
opinions?



niedz., 10 gru 2023 o 11:03 Benedict <be...@apache.org> napisał(a):

> Alex’s suggestion was that we meta randomise, ie we randomise the config
> parameters to gain better rather than lesser coverage overall. This means
> we cover these specific configs and more - just not necessarily on any
> single commit.
>
> I strongly endorse this approach over the status quo.
>
> On 8 Dec 2023, at 13:26, Mick Semb Wever <mc...@apache.org> wrote:
>
> 
>
>
>
>>
>> I think everyone agrees here, but…. these variations are still catching
>>> failures, and until we have an improvement or replacement we do rely on
>>> them.   I'm not in favour of removing them until we have proof /confidence
>>> that any replacement is catching the same failures.  Especially oa, tries,
>>> vnodes. (Not tries and offheap is being replaced with "latest", which
>>> will be valuable simplification.)
>>
>>
>> What kind of proof do you expect? I cannot imagine how we could prove
>> that because the ability of detecting failures results from the randomness
>> of those tests. That's why when such a test fail you usually cannot
>> reproduce that easily.
>>
>
>
> Unit tests that fail consistently but only on one configuration, should
> not be removed/replaced until the replacement also catches the failure.
>
>
>
>> We could extrapolate that to - why we only have those configurations? why
>> don't test trie / oa + compression, or CDC, or system memtable?
>>
>
>
> Because, along the way, people have decided a certain configuration
> deserves additional testing and it has been done this way in lieu of any
> other more efficient approach.
>
>
>

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
Alex’s suggestion was that we meta randomise, ie we randomise the config parameters to gain better rather than lesser coverage overall. This means we cover these specific configs and more - just not necessarily on any single commit.

I strongly endorse this approach over the status quo.
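
To make the idea concrete, a rough sketch of meta-randomisation (hypothetical harness code; the config names and values are only illustrative):

import java.util.List;
import java.util.Random;

// Hypothetical sketch: each CI run draws one config from a seeded RNG and
// logs the seed, so any failing combination can be reproduced exactly by
// re-running with the same seed.
public class RandomisedConfig
{
    static <T> T pick(Random rnd, List<T> options)
    {
        return options.get(rnd.nextInt(options.size()));
    }

    public static void main(String[] args)
    {
        long seed = args.length > 0 ? Long.parseLong(args[0]) : System.nanoTime();
        Random rnd = new Random(seed);
        System.out.println("config seed = " + seed); // record this for reproduction

        String memtable = pick(rnd, List.of("skiplist", "trie"));
        String sstableFormat = pick(rnd, List.of("big", "bti"));
        int numTokens = pick(rnd, List.of(1, 16, 256));
        String compression = pick(rnd, List.of("none", "lz4", "zstd"));

        System.out.printf("memtable=%s sstable=%s num_tokens=%d compression=%s%n",
                          memtable, sstableFormat, numTokens, compression);
        // ...these values would then be written into the test cassandra.yaml
    }
}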

> On 8 Dec 2023, at 13:26, Mick Semb Wever <mc...@apache.org> wrote:
> 
> 
>  
>  
>  
>> 
>>> I think everyone agrees here, but…. these variations are still catching failures, and until we have an improvement or replacement we do rely on them.   I'm not in favour of removing them until we have proof /confidence that any replacement is catching the same failures.  Especially oa, tries, vnodes. (Not tries and offheap is being replaced with "latest", which will be valuable simplification.)  
>> 
>> What kind of proof do you expect? I cannot imagine how we could prove that because the ability of detecting failures results from the randomness of those tests. That's why when such a test fail you usually cannot reproduce that easily.
> 
> 
> Unit tests that fail consistently but only on one configuration, should not be removed/replaced until the replacement also catches the failure.
> 
>  
>> We could extrapolate that to - why we only have those configurations? why don't test trie / oa + compression, or CDC, or system memtable? 
> 
> 
> Because, along the way, people have decided a certain configuration deserves additional testing and it has been done this way in lieu of any other more efficient approach.
> 

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Josh McKenzie <jm...@apache.org>.
> Unit tests that fail consistently but only on one configuration, should not be removed/replaced until the replacement also catches the failure.

> along the way, people have decided a certain configuration deserves additional testing and it has been done this way in lieu of any other more efficient approach.

Totally agree with these sentiments as well as the framing of our current unit tests as "bad fuzz-tests thanks to non-determinism".

To me, this reinforces my stance on a "pre-commit vs. post-commit" approach to testing *with our current constraints:*
 • Test the default configuration on all supported JDK's pre-commit
 • Post-commit, treat *consistent* failures on non-default configurations as immediate interrupts to the author that introduced them
 • Pre-release, push for no consistent failures on any suite in any configuration, and no regressions in flaky tests from prior release (in ASF CI env).
I think there's value in having the non-default configurations, but I'm not convinced the benefits outweigh the costs *specifically in terms of pre-commit work* due to flakiness in the execution of the software env itself, not to mention hardware env variance on the ASF side today.

All that said - if we got to a world where we could run our jvm-based tests deterministically within the simulator, my intuition is that we'd see a lot of the test-specific, non-defect flakiness reduced drastically. In such a world I'd be in favor of running :allthethings: pre-commit as we'd have *much* higher confidence that those failures were actually attributable to the author of whatever diff the test is run against. 

On Fri, Dec 8, 2023, at 8:25 AM, Mick Semb Wever wrote:
>  
>  
>  
>> 
>>> I think everyone agrees here, but…. these variations are still catching failures, and until we have an improvement or replacement we do rely on them.   I'm not in favour of removing them until we have proof /confidence that any replacement is catching the same failures.  Especially oa, tries, vnodes. (Not tries and offheap is being replaced with "latest", which will be valuable simplification.)  
>> 
>> What kind of proof do you expect? I cannot imagine how we could prove that because the ability of detecting failures results from the randomness of those tests. That's why when such a test fail you usually cannot reproduce that easily.
> 
> 
> Unit tests that fail consistently but only on one configuration, should not be removed/replaced until the replacement also catches the failure.
>  
>> We could extrapolate that to - why we only have those configurations? why don't test trie / oa + compression, or CDC, or system memtable? 
> 
> 
> Because, along the way, people have decided a certain configuration deserves additional testing and it has been done this way in lieu of any other more efficient approach.
> 
> 
> 

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Mick Semb Wever <mc...@apache.org>.
>
> I think everyone agrees here, but…. these variations are still catching
>> failures, and until we have an improvement or replacement we do rely on
>> them.   I'm not in favour of removing them until we have proof /confidence
>> that any replacement is catching the same failures.  Especially oa, tries,
>> vnodes. (Not tries and offheap is being replaced with "latest", which
>> will be valuable simplification.)
>
>
> What kind of proof do you expect? I cannot imagine how we could prove that
> because the ability of detecting failures results from the randomness of
> those tests. That's why when such a test fail you usually cannot reproduce
> that easily.
>


Unit tests that fail consistently but only on one configuration, should not
be removed/replaced until the replacement also catches the failure.



> We could extrapolate that to - why we only have those configurations? why
> don't test trie / oa + compression, or CDC, or system memtable?
>


Because, along the way, people have decided a certain configuration
deserves additional testing and it has been done this way in lieu of any
other more efficient approach.

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Jacek Lewandowski <le...@gmail.com>.
>
> It would be great to setup a JUnitRunner using the simulator and find out
> though.
>

I like this idea - this is what I meant when asking about the current unit
tests - to me, a test is either a simulation or a fuzz test. Due to the pretty
random execution order of unit tests, all of them can be considered really
non-robust fuzz tests, implemented with the intention of being simulation tests
(with exact execution order, testing a very specific behaviour).

> I think everyone agrees here, but…. these variations are still catching
> failures, and until we have an improvement or replacement we do rely on
> them.   I'm not in favour of removing them until we have proof /confidence
> that any replacement is catching the same failures.  Especially oa, tries,
> vnodes. (Not tries and offheap is being replaced with "latest", which
> will be valuable simplification.)


What kind of proof do you expect? I cannot imagine how we could prove that,
because the ability to detect failures results from the randomness of
those tests. That's why when such a test fails you usually cannot reproduce
it easily. We could extrapolate that to - why do we only have those
configurations? Why don't we test trie / oa + compression, or CDC, or system
memtable? Each random run of any test can find a new problem. I'm in
favour of parameterizing the "clients" of a certain feature - like
parameterizing storage engine tests, streaming and tools tests against
different sstable formats; though it makes no sense to parameterize gossip
tests, utility class tests, or dedicated tests for certain storage
implementations.



pt., 8 gru 2023 o 07:51 Alex Petrov <al...@coffeenco.de> napisał(a):

> My logic here was that CQLTester tests would probably be the best
> candidate as they are largely single-threaded and single-node. I'm sure
> there are background processes that might slow things down when serialised
> into a single execution thread, but my expectation would be that it will
> not be as significant as with other tests such as multinode in-jvm dtests.
>
> On Thu, Dec 7, 2023, at 7:44 PM, Benedict wrote:
>
>
> I think the biggest impediment to that is that most tests are probably not
> sufficiently robust for simulation. If things happen in a surprising order
> many tests fail, as they implicitly rely on the normal timing of things.
>
> Another issue is that the simulator does potentially slow things down a
> little at the moment. Not sure what the impact would be overall.
>
> It would be great to setup a JUnitRunner using the simulator and find out
> though.
>
>
> On 7 Dec 2023, at 15:43, Alex Petrov <al...@coffeenco.de> wrote:
>
> 
> We have been extensively using simulator for TCM, and I think we have make
> simulator tests more approachable. I think many of the existing tests
> should be ran under simulator instead of CQLTester, for example. This will
> both strengthen the simulator, and make things better in terms of
> determinism. Of course not to say that CQLTester tests are the biggest
> beneficiary there.
>
> On Thu, Dec 7, 2023, at 4:09 PM, Benedict wrote:
>
> To be fair, the lack of coherent framework doesn’t mean we can’t merge
> them from a naming perspective. I don’t mind losing one of burn or fuzz,
> and merging them.
>
> Today simulator tests are kept under the simulator test tree but that
> primarily exists for the simulator itself and testing it. It’s quite a
> complex source tree, as you might expect, and it exists primarily for
> managing its own complexity. It might make sense to bring the Paxos and
> Accord simulator entry points out into the burn/fuzz trees, though not sure
> it’s all that important.
>
>
> > On 7 Dec 2023, at 15:05, Benedict <be...@apache.org> wrote:
> >
> > Yes, the only system/real-time timeout is a progress one, wherein if
> nothing happens for ten minutes we assume the simulation has locked up.
> Hitting this is indicative of a bug, and the timeout is so long that no
> realistic system variability could trigger it.
> >
> >> On 7 Dec 2023, at 14:56, Brandon Williams <dr...@gmail.com> wrote:
> >>
> >> On Thu, Dec 7, 2023 at 8:50 AM Alex Petrov <al...@coffeenco.de> wrote:
> >>>> I've noticed many "sleeps" in the tests - is it possible with
> simulation tests to artificially move the clock forward by, say, 5 seconds
> instead of sleeping just to test, for example whether TTL works?)
> >>>
> >>> Yes, simulator will skip the sleep and do a simulated sleep with a
> simulated clock instead.
> >>
> >> Since it uses an artificial clock, does this mean that the simulator
> >> is also impervious to timeouts caused by the underlying environment?
> >>
> >> Kind Regards,
> >> Brandon
>
>
>
>
>

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Alex Petrov <al...@coffeenco.de>.
My logic here was that CQLTester tests would probably be the best candidate as they are largely single-threaded and single-node. I'm sure there are background processes that might slow things down when serialised into a single execution thread, but my expectation would be that it will not be as significant as with other tests such as multinode in-jvm dtests. 

On Thu, Dec 7, 2023, at 7:44 PM, Benedict wrote:
> 
> I think the biggest impediment to that is that most tests are probably not sufficiently robust for simulation. If things happen in a surprising order many tests fail, as they implicitly rely on the normal timing of things.
> 
> Another issue is that the simulator does potentially slow things down a little at the moment. Not sure what the impact would be overall.
> 
> It would be great to setup a JUnitRunner using the simulator and find out though.
> 
> 
>> On 7 Dec 2023, at 15:43, Alex Petrov <al...@coffeenco.de> wrote:
>> 
>> We have been extensively using simulator for TCM, and I think we have make simulator tests more approachable. I think many of the existing tests should be ran under simulator instead of CQLTester, for example. This will both strengthen the simulator, and make things better in terms of determinism. Of course not to say that CQLTester tests are the biggest beneficiary there.
>> 
>> On Thu, Dec 7, 2023, at 4:09 PM, Benedict wrote:
>>> To be fair, the lack of coherent framework doesn’t mean we can’t merge them from a naming perspective. I don’t mind losing one of burn or fuzz, and merging them.
>>> 
>>> Today simulator tests are kept under the simulator test tree but that primarily exists for the simulator itself and testing it. It’s quite a complex source tree, as you might expect, and it exists primarily for managing its own complexity. It might make sense to bring the Paxos and Accord simulator entry points out into the burn/fuzz trees, though not sure it’s all that important.
>>> 
>>> 
>>> > On 7 Dec 2023, at 15:05, Benedict <be...@apache.org> wrote:
>>> > 
>>> > Yes, the only system/real-time timeout is a progress one, wherein if nothing happens for ten minutes we assume the simulation has locked up. Hitting this is indicative of a bug, and the timeout is so long that no realistic system variability could trigger it.
>>> > 
>>> >> On 7 Dec 2023, at 14:56, Brandon Williams <dr...@gmail.com> wrote:
>>> >> 
>>> >> On Thu, Dec 7, 2023 at 8:50 AM Alex Petrov <al...@coffeenco.de> wrote:
>>> >>>> I've noticed many "sleeps" in the tests - is it possible with simulation tests to artificially move the clock forward by, say, 5 seconds instead of sleeping just to test, for example whether TTL works?)
>>> >>> 
>>> >>> Yes, simulator will skip the sleep and do a simulated sleep with a simulated clock instead.
>>> >> 
>>> >> Since it uses an artificial clock, does this mean that the simulator
>>> >> is also impervious to timeouts caused by the underlying environment?
>>> >> 
>>> >> Kind Regards,
>>> >> Brandon
>>> 
>>> 
>> 

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
I think the biggest impediment to that is that most tests are probably not
sufficiently robust for simulation. If things happen in a surprising order
many tests fail, as they implicitly rely on the normal timing of things.

  

Another issue is that the simulator does potentially slow things down a little
at the moment. Not sure what the impact would be overall.

  

It would be great to setup a JUnitRunner using the simulator and find out
though.

  

> On 7 Dec 2023, at 15:43, Alex Petrov <al...@coffeenco.de> wrote:
> 
> We have been extensively using simulator for TCM, and I think we have make simulator tests more approachable. I think many of the existing tests should be ran under simulator instead of CQLTester, for example. This will both strengthen the simulator, and make things better in terms of determinism. Of course not to say that CQLTester tests are the biggest beneficiary there.
> 
> On Thu, Dec 7, 2023, at 4:09 PM, Benedict wrote:
>> To be fair, the lack of coherent framework doesn’t mean we can’t merge them from a naming perspective. I don’t mind losing one of burn or fuzz, and merging them.
>> 
>> Today simulator tests are kept under the simulator test tree but that primarily exists for the simulator itself and testing it. It’s quite a complex source tree, as you might expect, and it exists primarily for managing its own complexity. It might make sense to bring the Paxos and Accord simulator entry points out into the burn/fuzz trees, though not sure it’s all that important.
>> 
>> > On 7 Dec 2023, at 15:05, Benedict <be...@apache.org> wrote:
>> > 
>> > Yes, the only system/real-time timeout is a progress one, wherein if nothing happens for ten minutes we assume the simulation has locked up. Hitting this is indicative of a bug, and the timeout is so long that no realistic system variability could trigger it.
>> > 
>> >> On 7 Dec 2023, at 14:56, Brandon Williams <dr...@gmail.com> wrote:
>> >> 
>> >> On Thu, Dec 7, 2023 at 8:50 AM Alex Petrov <al...@coffeenco.de> wrote:
>> >>>> I've noticed many "sleeps" in the tests - is it possible with simulation tests to artificially move the clock forward by, say, 5 seconds instead of sleeping just to test, for example whether TTL works?)
>> >>> 
>> >>> Yes, simulator will skip the sleep and do a simulated sleep with a simulated clock instead.
>> >> 
>> >> Since it uses an artificial clock, does this mean that the simulator
>> >> is also impervious to timeouts caused by the underlying environment?
>> >> 
>> >> Kind Regards,
>> >> Brandon

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Alex Petrov <al...@coffeenco.de>.
We have been extensively using the simulator for TCM, and I think we have to make simulator tests more approachable. I think many of the existing tests should be run under the simulator instead of CQLTester, for example. This will both strengthen the simulator, and make things better in terms of determinism. Of course, that is not to say that CQLTester tests are the biggest beneficiary there.

On Thu, Dec 7, 2023, at 4:09 PM, Benedict wrote:
> To be fair, the lack of coherent framework doesn’t mean we can’t merge them from a naming perspective. I don’t mind losing one of burn or fuzz, and merging them.
> 
> Today simulator tests are kept under the simulator test tree but that primarily exists for the simulator itself and testing it. It’s quite a complex source tree, as you might expect, and it exists primarily for managing its own complexity. It might make sense to bring the Paxos and Accord simulator entry points out into the burn/fuzz trees, though not sure it’s all that important.
> 
> 
> > On 7 Dec 2023, at 15:05, Benedict <be...@apache.org> wrote:
> > 
> > Yes, the only system/real-time timeout is a progress one, wherein if nothing happens for ten minutes we assume the simulation has locked up. Hitting this is indicative of a bug, and the timeout is so long that no realistic system variability could trigger it.
> > 
> >> On 7 Dec 2023, at 14:56, Brandon Williams <dr...@gmail.com> wrote:
> >> 
> >> On Thu, Dec 7, 2023 at 8:50 AM Alex Petrov <al...@coffeenco.de> wrote:
> >>>> I've noticed many "sleeps" in the tests - is it possible with simulation tests to artificially move the clock forward by, say, 5 seconds instead of sleeping just to test, for example whether TTL works?)
> >>> 
> >>> Yes, simulator will skip the sleep and do a simulated sleep with a simulated clock instead.
> >> 
> >> Since it uses an artificial clock, does this mean that the simulator
> >> is also impervious to timeouts caused by the underlying environment?
> >> 
> >> Kind Regards,
> >> Brandon
> 
> 

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
To be fair, the lack of a coherent framework doesn’t mean we can’t merge them from a naming perspective. I don’t mind losing one of burn or fuzz, and merging them.

Today simulator tests are kept under the simulator test tree but that primarily exists for the simulator itself and testing it. It’s quite a complex source tree, as you might expect, and it exists primarily for managing its own complexity. It might make sense to bring the Paxos and Accord simulator entry points out into the burn/fuzz trees, though not sure it’s all that important.


> On 7 Dec 2023, at 15:05, Benedict <be...@apache.org> wrote:
> 
> Yes, the only system/real-time timeout is a progress one, wherein if nothing happens for ten minutes we assume the simulation has locked up. Hitting this is indicative of a bug, and the timeout is so long that no realistic system variability could trigger it.
> 
>> On 7 Dec 2023, at 14:56, Brandon Williams <dr...@gmail.com> wrote:
>> 
>> On Thu, Dec 7, 2023 at 8:50 AM Alex Petrov <al...@coffeenco.de> wrote:
>>>> I've noticed many "sleeps" in the tests - is it possible with simulation tests to artificially move the clock forward by, say, 5 seconds instead of sleeping just to test, for example whether TTL works?)
>>> 
>>> Yes, simulator will skip the sleep and do a simulated sleep with a simulated clock instead.
>> 
>> Since it uses an artificial clock, does this mean that the simulator
>> is also impervious to timeouts caused by the underlying environment?
>> 
>> Kind Regards,
>> Brandon


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
Yes, the only system/real-time timeout is a progress one, wherein if nothing happens for ten minutes we assume the simulation has locked up. Hitting this is indicative of a bug, and the timeout is so long that no realistic system variability could trigger it.
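
For illustration only (not the simulator's actual code), the shape of such a progress watchdog might be:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: the single wall-clock check is "has anything happened
// recently?", not a per-operation timeout, so environment jitter cannot trip it.
class ProgressWatchdog extends Thread
{
    private final AtomicLong lastProgressNanos = new AtomicLong(System.nanoTime());

    void onProgress() // invoked whenever the simulation advances
    {
        lastProgressNanos.set(System.nanoTime());
    }

    @Override
    public void run()
    {
        try
        {
            while (true)
            {
                TimeUnit.SECONDS.sleep(30);
                long idleNanos = System.nanoTime() - lastProgressNanos.get();
                if (idleNanos > TimeUnit.MINUTES.toNanos(10))
                {
                    System.err.println("simulation made no progress for 10 minutes; assuming it locked up");
                    Runtime.getRuntime().halt(1); // abort the wedged run
                }
            }
        }
        catch (InterruptedException e)
        {
            // watchdog stopped along with the simulation
        }
    }
}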

> On 7 Dec 2023, at 14:56, Brandon Williams <dr...@gmail.com> wrote:
> 
> On Thu, Dec 7, 2023 at 8:50 AM Alex Petrov <al...@coffeenco.de> wrote:
>>> I've noticed many "sleeps" in the tests - is it possible with simulation tests to artificially move the clock forward by, say, 5 seconds instead of sleeping just to test, for example whether TTL works?)
>> 
>> Yes, simulator will skip the sleep and do a simulated sleep with a simulated clock instead.
> 
> Since it uses an artificial clock, does this mean that the simulator
> is also impervious to timeouts caused by the underlying environment?
> 
> Kind Regards,
> Brandon


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Brandon Williams <dr...@gmail.com>.
On Thu, Dec 7, 2023 at 8:50 AM Alex Petrov <al...@coffeenco.de> wrote:
> > I've noticed many "sleeps" in the tests - is it possible with simulation tests to artificially move the clock forward by, say, 5 seconds instead of sleeping just to test, for example whether TTL works?)
>
> Yes, simulator will skip the sleep and do a simulated sleep with a simulated clock instead.

Since it uses an artificial clock, does this mean that the simulator
is also impervious to timeouts caused by the underlying environment?

Kind Regards,
Brandon

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Alex Petrov <al...@coffeenco.de>.
> We should get rid of long-running unit tests altogether. They should run faster or be split.

I think we just need to evaluate on a case-by-case basis. Some tests are bad and need to go. And we need other/better ones to replace them. I am deliberately not giving examples here, both to avoid controversy and to highlight that this will be a long process.

> I'm still confused about the distinction between burn and fuzz tests - it seems to me that fuzz tests are just modern burn tests - should we refactor the existing burn tests to use the new framework?

At the moment we do not have a coherent generator framework. We have something like 15 different ways to generate data and run tests. We need to evaluate them and bring them together.

> 3. Simulation tests - since you say they provide a way to execute a test deterministically, it should be a property of unit tests - well, a unit test is either deterministic or a fuzz test.

Unit tests do not come with a guarantee of determinism. The fact that you have determinism from the perspective of the API (i.e. the test is driven by a single thread) has no implications for the behaviour of the system underneath. The simulator guarantees that all executions, including concurrent ones, are fully deterministic - messaging, executors, threads, delays, timeouts, etc.

> I've noticed many "sleeps" in the tests - is it possible with simulation tests to artificially move the clock forward by, say, 5 seconds instead of sleeping just to test, for example whether TTL works?)

Yes, simulator will skip the sleep and do a simulated sleep with a simulated clock instead. 
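
A minimal sketch of the idea (illustrative only, not the simulator's actual API):

import java.util.concurrent.atomic.AtomicLong;

// "Sleeping" merely advances virtual time, so e.g. waiting out a 5-second
// TTL costs nothing in real time.
class SimulatedClock
{
    private final AtomicLong nowMillis = new AtomicLong(0);

    long currentTimeMillis()
    {
        return nowMillis.get();
    }

    void sleep(long millis) // never blocks; just jumps the clock forward
    {
        nowMillis.addAndGet(millis);
    }
}

A test that needs a 5-second TTL to expire would call sleep(5_000) and immediately observe the expired state, without any real waiting.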

> Also, as we start refactoring the tests, it will be an excellent opportunity to move to JUnit 5.

I am working on bringing Harry in-tree. I will need many reviewers and collaborators to make the test suite more powerful and coherent. It would be nice to have a bit more lenience and flexibility, and shorter turnarounds, when we deal with tests, at least in the early phases.

Thank you for the interest in the subject, I think we need to do a lot here.




On Fri, Dec 1, 2023, at 1:31 PM, Jacek Lewandowski wrote:
> Thanks for the exhaustive response, Alex :)
> 
> Let me bring my point of view:
> 
> 1. Since long tests are just unit tests that take a long time to run, it makes sense to separate them for efficient parallelization in CI. Since we are adding new tests, modifying the existing ones, etc., that should be something maintainable; otherwise, the distinction makes no sense to me. For example - adjust timeouts on CI to 1 minute per test class for "short" tests and more for "long" tests. To satisfy CI, the contributor will have to either make the test run faster or move it to the "long" tests. The opposite enforcement could be more difficult, though it is doable as well - failing the "long" test if it takes too little time and should be qualified as a regular unit test. As I'm reading what I've just written, it sounds stupid :/ We should get rid of long-running unit tests altogether. They should run faster or be split.
> 
> 2. I'm still confused about the distinction between burn and fuzz tests - it seems to me that fuzz tests are just modern burn tests - should we refactor the existing burn tests to use the new framework?
> 
> 3. Simulation tests - since you say they provide a way to execute a test deterministically, it should be a property of unit tests - well, a unit test is either deterministic or a fuzz test. Is the simulation framework usable for CQLTester-based tests? (side question here: I've noticed many "sleeps" in the tests - is it possible with simulation tests to artificially move the clock forward by, say, 5 seconds instead of sleeping just to test, for example whether TTL works?)
> 
> 4. Yeah, running a complete suite for each artificially crafted configuration brings little value compared to the maintenance and infrastructure costs. It feels like we are running all tests a bit blindly, hoping we catch something accidentally. I agree this is not the purpose of the unit tests and should be covered instead by fuzz. For features like CDC, compression, different sstable formats, trie memtable, commit log compression/encryption, system directory keyspace, etc... we should have dedicated tests that verify just that functionality
> 
> With more or more functionality offered by Cassandra, they will become a significant pain shortly. Let's start thinking about concrete actions. 
> 
> Also, as we start refactoring the tests, it will be an excellent opportunity to move to JUnit 5.
> 
> thanks,
> Jacek

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Jacek Lewandowski <le...@gmail.com>.
Thanks for the exhaustive response, Alex :)

Let me bring my point of view:

1. Since long tests are just unit tests that take a long time to run, it
makes sense to separate them for efficient parallelization in CI. Since we
are adding new tests, modifying the existing ones, etc., that should be
something maintainable; otherwise, the distinction makes no sense to me.
For example - adjust timeouts on CI to 1 minute per test class for "short"
tests and more for "long" tests. To satisfy CI, the contributor will have
to either make the test run faster or move it to the "long" tests. The
opposite enforcement could be more difficult, though it is doable as well -
failing the "long" test if it takes too little time and should be qualified
as a regular unit test. As I'm reading what I've just written, it sounds
stupid :/ We should get rid of long-running unit tests altogether. They
should run faster or be split.
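
For illustration, one way to approximate such a budget with the JUnit 4 we use today (a @Rule timeout applies per test method, so a per-class budget would still need to be enforced in the CI configuration):

import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;

public class ShortUnitTest
{
    // Applies to every test method in the class: anything slower than 60
    // seconds fails, forcing it to be sped up or moved to the "long" suite.
    @Rule
    public Timeout perTestTimeout = Timeout.seconds(60);

    @Test
    public void finishesQuickly()
    {
        // ...
    }
}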

2. I'm still confused about the distinction between burn and fuzz tests -
it seems to me that fuzz tests are just modern burn tests - should we
refactor the existing burn tests to use the new framework?

3. Simulation tests - since you say they provide a way to execute a test
deterministically, it should be a property of unit tests - well, a unit
test is either deterministic or a fuzz test. Is the simulation framework
usable for CQLTester-based tests? (side question here: I've noticed many
"sleeps" in the tests - is it possible with simulation tests to
artificially move the clock forward by, say, 5 seconds instead of sleeping
just to test, for example whether TTL works?)

4. Yeah, running a complete suite for each artificially crafted
configuration brings little value compared to the maintenance and
infrastructure costs. It feels like we are running all tests a bit blindly,
hoping we catch something accidentally. I agree this is not the purpose of
the unit tests and should be covered instead by fuzz. For features like
CDC, compression, different sstable formats, trie memtable, commit log
compression/encryption, system directory keyspace, etc... we should have
dedicated tests that verify just that functionality

With more and more functionality offered by Cassandra, they will become a
significant pain shortly. Let's start thinking about concrete actions.

Also, as we start refactoring the tests, it will be an excellent
opportunity to move to JUnit 5.

thanks,
Jacek

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Alex Petrov <al...@coffeenco.de>.
I will try to respond, but please keep in mind that all these terms are somewhat contextual. 

I think long and burn tests are somewhat synonymous. But most long/burn tests that we have in-tree aren't actually that long. They are just long compared to the unit tests. I personally would call the test long when it runs for hours at least, but realistically for days. 

Fuzz tests are randomised tests that attempt to find issues in the system under test. Most of the fuzz tests we wrote using Harry are also property-based: they use a model checker that simulates the internal state of the system and checks its responses against a simplified representation.
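
To illustrate the pattern (a deliberately toy example, not Harry itself):

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Random;
import java.util.TreeMap;

// Toy version of the idea: apply random operations both to the system under
// test (a TreeMap standing in for a table) and to a simplified model (a
// HashMap), and require their responses to agree; the logged seed makes any
// divergence replayable.
public class ModelCheckSketch
{
    public static void main(String[] args)
    {
        long seed = new Random().nextLong();
        Random rnd = new Random(seed);
        Map<Integer, Integer> sut = new TreeMap<>();
        Map<Integer, Integer> model = new HashMap<>();

        for (int step = 0; step < 100_000; step++)
        {
            int key = rnd.nextInt(1000);
            if (rnd.nextBoolean())
            {
                sut.put(key, step);
                model.put(key, step);
            }
            else if (!Objects.equals(sut.get(key), model.get(key)))
            {
                throw new AssertionError("divergence at step " + step + ", seed " + seed);
            }
        }
        System.out.println("ok, seed=" + seed);
    }
}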

Simulator tests are just tests that use our simulator framework, that executes a test against a cluster of nodes deterministically by fully serialising all of its operations. We also have a bunch of smaller simulations that simulate different scenarios: bounces, metadata changes, etc, without actually starting the cluster. Those are not simulator tests though. I have also used the word "simulate" in the context of model-checking, but also mostly to illustrate that it's all context-dependent. 

I personally believe that many tests, and test pipelines can (and probably should) be deprecated. But last time I brought this up, there was a bit of pushback, so I think before we can consider deprecation of tests that we think are redundant we will have to substantially improve adoption of the tools that allow better multiplexing.

As regards configurations, I do not think it is necessary to re-run an entire set of u/d/injvm tests with vnode/trie/etc configurations; instead these scenarios should be exercised by config permutation using a fuzzer. As experience (and several recent issues in particular) shows, some important settings are never touched by any of the tests at all, and since tests are static, the chance of finding any issues with some combination of those is slim. 

Apart from what we already have (data and schema generators and failure injection), we now need configuration generators that will find interesting configurations and run randomly generated workflows against those, expecting any configuration of Cassandra to behave the same. 
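
For shape only, a hypothetical sketch of such a configuration generator (none of these names exist in-tree today; runWorkload() is a stand-in for the real harness):

import java.util.List;
import java.util.Map;
import java.util.Random;

// Draw a random permutation of settings, run the same generated workload
// against it and against the baseline config, and require identical results.
public class ConfigPermutationSketch
{
    static <T> T pick(Random rnd, List<T> options)
    {
        return options.get(rnd.nextInt(options.size()));
    }

    public static void main(String[] args)
    {
        long seed = System.nanoTime();
        Random rnd = new Random(seed);

        Map<String, String> candidate = Map.of(
            "memtable", pick(rnd, List.of("skiplist", "trie")),
            "sstable_format", pick(rnd, List.of("big", "bti")),
            "commitlog_compression", pick(rnd, List.of("none", "lz4")));

        System.out.println("seed=" + seed + " candidate=" + candidate);
        // assert runWorkload(candidate).equals(runWorkload(BASELINE));
    }
}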

I do find our test matrix a bit convoluted, and in my experience you spend way more time tweaking tests to work for all configurations after some code changes, and they find legitimate issues rather infrequently. We would probably be better off with a quick "sanity check" for major configurations per commit which, again, would exercise a common set of operations, combined with a comprehensive test suite which will try to cover as much ground as possible.

Hope this helps.
--Alex


On Thu, Nov 30, 2023, at 10:25 AM, Jacek Lewandowski wrote:
> Hi,
> 
> I'm getting a bit lost - what are the exact differences between those test scenarios? What are the criteria for qualifying a test to be part of a certain scenario?
> 
> I'm working a little bit with tests and build scripts and the number of different configurations for which we have a separate target in the build starts to be problematic, I cannot imagine how problematic it is for a new contributor.
> 
> It is not urgent, but we should at least have a plan on how to simplify and unify things.
> 
> I'm in favour of reducing the number of test targets to the minimum - for different configurations I think we should provide a parameter pointing to jvm options file and maybe to cassandra.yaml. I know that we currently do some super hacky things with cassandra yaml for different configs - like concatenating parts of it. I presume it is not necessary - we can have a default test config yaml and a directory with overriding yamls; while building we could have a tool which is able to load the default configuration, apply the override and save the resulting yaml somewhere in the build/test/configs for example. That would allow us to easily use those yamls in IDE as well - currently it is impossible.
> 
> What do you think?
> 
> Thank you and my apologies for bothering about lower priority stuff while we have a 5.0 release headache...
> 
> Jacek
> 
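
For what it's worth, the load-default / apply-override / save tool described
in the quoted proposal could be quite small - a sketch assuming SnakeYAML
(which Cassandra already uses for cassandra.yaml) and hypothetical
input/output paths:

// Load the default test yaml, recursively apply an override yaml on top,
// and write the merged result where tests (and the IDE) can pick it up.
import java.io.FileReader;
import java.io.FileWriter;
import java.util.Map;
import org.yaml.snakeyaml.Yaml;

public class MergeTestConfig
{
    @SuppressWarnings("unchecked")
    static void deepMerge(Map<String, Object> base, Map<String, Object> override)
    {
        for (Map.Entry<String, Object> e : override.entrySet())
        {
            Object existing = base.get(e.getKey());
            if (existing instanceof Map && e.getValue() instanceof Map)
                deepMerge((Map<String, Object>) existing, (Map<String, Object>) e.getValue());
            else
                base.put(e.getKey(), e.getValue()); // scalar or new key: override wins
        }
    }

    public static void main(String[] args) throws Exception
    {
        Yaml yaml = new Yaml();
        // hypothetical paths: default config, override file, output name
        Map<String, Object> base = yaml.load(new FileReader("test/conf/cassandra.yaml"));
        Map<String, Object> override = yaml.load(new FileReader(args[0]));
        deepMerge(base, override);
        try (FileWriter out = new FileWriter("build/test/configs/" + args[1]))
        {
            yaml.dump(base, out);
        }
    }
}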

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
I don’t know - I’m not sure what fuzz test means in this context. It’s a newer
concept that I didn’t introduce.

> On 30 Nov 2023, at 20:06, Jacek Lewandowski <le...@gmail.com> wrote:
>
> How do those burn tests compare to the fuzz tests? (the new ones)
>
> On Thu, 30 Nov 2023 at 20:22, Benedict <be...@apache.org> wrote:
>
>> By “could run indefinitely” I don’t mean by default they run forever. There
>> will be parameters that change how much work is done for a given run, but
>> just running repeatedly (each time with a different generated seed) is the
>> expected usage. Until you run out of compute or patience.
>>
>> I agree they are only of value pre-commit to check they haven’t been broken
>> in some way by changes.
>>
>>> On 30 Nov 2023, at 18:36, Josh McKenzie <jm...@apache.org> wrote:
>>>
>>>> that may be long-running and that could be run indefinitely
>>>
>>> Perfect. That was the distinction I wasn't aware of. Also means having the
>>> burn target as part of regular CI runs is probably a mistake, yes? i.e. if
>>> someone adds a burn test that runs indefinitely, are there any guardrails
>>> or built-in checks or timeouts to keep it from running right up to job
>>> timeout and then failing?
>>>
>>> On Thu, Nov 30, 2023, at 1:11 PM, Benedict wrote:
>>>
>>>> A burn test is a randomised test targeting broad coverage of a single
>>>> system, subsystem or utility, that may be long-running and that could be
>>>> run indefinitely, each run providing incrementally more assurance of
>>>> quality of the system.
>>>>
>>>> A long test is a unit test that sometimes takes a long time to run, no
>>>> more no less. I’m not sure any of these offer all that much value anymore,
>>>> and perhaps we could look to deprecate them.
>>>>
>>>>> On 30 Nov 2023, at 17:20, Josh McKenzie <jm...@apache.org> wrote:
>>>>>
>>>>> Strongly agree. I started working on a declarative refactor out of our
>>>>> CI configuration so circle, ASFCI, and other systems could inherit from
>>>>> it (for instance, see pre-commit pipeline declaration here
>>>>> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR71-R89>);
>>>>> I had to set that down while I finished up implementing an internal CI
>>>>> system since the code in neither the ASF CI structure nor circle
>>>>> structure (.sh embedded in .yml /cry) was re-usable in their current
>>>>> form.
>>>>>
>>>>> Having a jvm.options and cassandra.yaml file per suite and referencing
>>>>> them from a declarative job definition
>>>>> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR237-R267>
>>>>> would make things a lot easier to wrap our heads around and maintain I
>>>>> think.
>>>>>
>>>>> As for what qualifies as burn vs. long... /shrug couldn't tell you.
>>>>> Would have to go down the git blame + dev ML + JIRA rabbit hole. :)
>>>>> Maybe someone else on-list knows.
>>>>>
>>>>> On Thu, Nov 30, 2023, at 4:25 AM, Jacek Lewandowski wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm getting a bit lost - what are the exact differences between those
>>>>>> test scenarios? What are the criteria for qualifying a test to be part
>>>>>> of a certain scenario?
>>>>>>
>>>>>> I'm working a little bit with tests and build scripts and the number of
>>>>>> different configurations for which we have a separate target in the
>>>>>> build starts to be problematic, I cannot imagine how problematic it is
>>>>>> for a new contributor.
>>>>>>
>>>>>> It is not urgent, but we should at least have a plan on how to simplify
>>>>>> and unify things.
>>>>>>
>>>>>> I'm in favour of reducing the number of test targets to the minimum -
>>>>>> for different configurations I think we should provide a parameter
>>>>>> pointing to jvm options file and maybe to cassandra.yaml. I know that
>>>>>> we currently do some super hacky things with cassandra yaml for
>>>>>> different configs - like concatenating parts of it. I presume it is not
>>>>>> necessary - we can have a default test config yaml and a directory with
>>>>>> overriding yamls; while building we could have a tool which is able to
>>>>>> load the default configuration, apply the override and save the
>>>>>> resulting yaml somewhere in the build/test/configs for example. That
>>>>>> would allow us to easily use those yamls in IDE as well - currently it
>>>>>> is impossible.
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> Thank you and my apologies for bothering about lower priority stuff
>>>>>> while we have a 5.0 release headache...
>>>>>>
>>>>>> Jacek


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Jacek Lewandowski <le...@gmail.com>.
How do those burn tests compare to the fuzz tests? (the new ones)

On Thu, 30 Nov 2023 at 20:22, Benedict <be...@apache.org> wrote:

> By “could run indefinitely” I don’t mean by default they run forever.
> There will be parameters that change how much work is done for a given run,
> but just running repeatedly (each time with a different generated seed) is
> the expected usage. Until you run out of compute or patience.
>
> I agree they are only of value pre-commit to check they haven’t been
> broken in some way by changes.
>
>
>
> On 30 Nov 2023, at 18:36, Josh McKenzie <jm...@apache.org> wrote:
>
> 
>
> that may be long-running and that could be run indefinitely
>
> Perfect. That was the distinction I wasn't aware of. Also means having the
> burn target as part of regular CI runs is probably a mistake, yes? i.e. if
> someone adds a burn test that runs indefinitely, are there any guardrails
> or built-in checks or timeouts to keep it from running right up to job
> timeout and then failing?
>
> On Thu, Nov 30, 2023, at 1:11 PM, Benedict wrote:
>
>
> A burn test is a randomised test targeting broad coverage of a single
> system, subsystem or utility, that may be long-running and that could be
> run indefinitely, each run providing incrementally more assurance of
> quality of the system.
>
> A long test is a unit test that sometimes takes a long time to run, no
> more no less. I’m not sure any of these offer all that much value anymore,
> and perhaps we could look to deprecate them.
>
> On 30 Nov 2023, at 17:20, Josh McKenzie <jm...@apache.org> wrote:
>
> 
> Strongly agree. I started working on a declarative refactor out of our CI
> configuration so circle, ASFCI, and other systems could inherit from it
> (for instance, see pre-commit pipeline declaration here
> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR71-R89>);
> I had to set that down while I finished up implementing an internal CI
> system since the code in neither the ASF CI structure nor circle structure
> (.sh embedded in .yml /cry) was re-usable in their current form.
>
> Having a jvm.options and cassandra.yaml file per suite and referencing
> them from a declarative job definition
> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR237-R267>
> would make things a lot easier to wrap our heads around and maintain I
> think.
>
> As for what qualifies as burn vs. long... /shrug couldn't tell you. Would
> have to go down the git blame + dev ML + JIRA rabbit hole. :) Maybe someone
> else on-list knows.
>
> On Thu, Nov 30, 2023, at 4:25 AM, Jacek Lewandowski wrote:
>
> Hi,
>
> I'm getting a bit lost - what are the exact differences between those
> test scenarios? What are the criteria for qualifying a test to be part of a
> certain scenario?
>
> I'm working a little bit with tests and build scripts and the number of
> different configurations for which we have a separate target in the build
> starts to be problematic, I cannot imagine how problematic it is for a new
> contributor.
>
> It is not urgent, but we should at least have a plan on how to
> simplify and unify things.
>
> I'm in favour of reducing the number of test targets to the minimum - for
> different configurations I think we should provide a parameter pointing to
> jvm options file and maybe to cassandra.yaml. I know that we currently do
> some super hacky things with cassandra yaml for different configs - like
> concatenating parts of it. I presume it is not necessary - we can have a
> default test config yaml and a directory with overriding yamls; while
> building we could have a tool which is able to load the default
> configuration, apply the override and save the resulting yaml somewhere in
> the build/test/configs for example. That would allow us to easily use
> those yamls in IDE as well - currently it is impossible.
>
> What do you think?
>
> Thank you and my apologies for bothering about lower priority stuff while
> we have a 5.0 release headache...
>
> Jacek
>

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
By “could run indefinitely” I don’t mean by default they run forever. There
will be parameters that change how much work is done for a given run, but just
running repeatedly (each time with a different generated seed) is the
expected usage. Until you run out of compute or patience.

I agree they are only of value pre-commit to check they haven’t been broken in
some way by changes.

> On 30 Nov 2023, at 18:36, Josh McKenzie <jm...@apache.org> wrote:
>
>> that may be long-running and that could be run indefinitely
>
> Perfect. That was the distinction I wasn't aware of. Also means having the
> burn target as part of regular CI runs is probably a mistake, yes? i.e. if
> someone adds a burn test that runs indefinitely, are there any guardrails
> or built-in checks or timeouts to keep it from running right up to job
> timeout and then failing?
>
> On Thu, Nov 30, 2023, at 1:11 PM, Benedict wrote:
>
>> A burn test is a randomised test targeting broad coverage of a single
>> system, subsystem or utility, that may be long-running and that could be
>> run indefinitely, each run providing incrementally more assurance of
>> quality of the system.
>>
>> A long test is a unit test that sometimes takes a long time to run, no
>> more no less. I’m not sure any of these offer all that much value anymore,
>> and perhaps we could look to deprecate them.
>>
>>> On 30 Nov 2023, at 17:20, Josh McKenzie <jm...@apache.org> wrote:
>>>
>>> Strongly agree. I started working on a declarative refactor out of our CI
>>> configuration so circle, ASFCI, and other systems could inherit from it
>>> (for instance, see pre-commit pipeline declaration here
>>> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR71-R89>);
>>> I had to set that down while I finished up implementing an internal CI
>>> system since the code in neither the ASF CI structure nor circle structure
>>> (.sh embedded in .yml /cry) was re-usable in their current form.
>>>
>>> Having a jvm.options and cassandra.yaml file per suite and referencing
>>> them from a declarative job definition
>>> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR237-R267>
>>> would make things a lot easier to wrap our heads around and maintain I
>>> think.
>>>
>>> As for what qualifies as burn vs. long... /shrug couldn't tell you. Would
>>> have to go down the git blame + dev ML + JIRA rabbit hole. :) Maybe
>>> someone else on-list knows.
>>>
>>> On Thu, Nov 30, 2023, at 4:25 AM, Jacek Lewandowski wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm getting a bit lost - what are the exact differences between those
>>>> test scenarios? What are the criteria for qualifying a test to be part
>>>> of a certain scenario?
>>>>
>>>> I'm working a little bit with tests and build scripts and the number of
>>>> different configurations for which we have a separate target in the
>>>> build starts to be problematic, I cannot imagine how problematic it is
>>>> for a new contributor.
>>>>
>>>> It is not urgent, but we should at least have a plan on how to simplify
>>>> and unify things.
>>>>
>>>> I'm in favour of reducing the number of test targets to the minimum -
>>>> for different configurations I think we should provide a parameter
>>>> pointing to jvm options file and maybe to cassandra.yaml. I know that we
>>>> currently do some super hacky things with cassandra yaml for different
>>>> configs - like concatenating parts of it. I presume it is not necessary
>>>> - we can have a default test config yaml and a directory with overriding
>>>> yamls; while building we could have a tool which is able to load the
>>>> default configuration, apply the override and save the resulting yaml
>>>> somewhere in the build/test/configs for example. That would allow us to
>>>> easily use those yamls in IDE as well - currently it is impossible.
>>>>
>>>> What do you think?
>>>>
>>>> Thank you and my apologies for bothering about lower priority stuff
>>>> while we have a 5.0 release headache...
>>>>
>>>> Jacek


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Josh McKenzie <jm...@apache.org>.
> that may be long-running and that could be run indefinitely
Perfect. That was the distinction I wasn't aware of. Also means having the burn target as part of regular CI runs is probably a mistake, yes? i.e. if someone adds a burn test that runs indefinitely, are there any guardrails or built-in checks or timeouts to keep it from running right up to job timeout and then failing?

On Thu, Nov 30, 2023, at 1:11 PM, Benedict wrote:
> 
> A burn test is a randomised test targeting broad coverage of a single system, subsystem or utility, that may be long-running and that could be run indefinitely, each run providing incrementally more assurance of quality of the system.
> 
> A long test is a unit test that sometimes takes a long time to run, no more no less. I’m not sure any of these offer all that much value anymore, and perhaps we could look to deprecate them.
> 
>> On 30 Nov 2023, at 17:20, Josh McKenzie <jm...@apache.org> wrote:
>> 
>> Strongly agree. I started working on a declarative refactor out of our CI configuration so circle, ASFCI, and other systems could inherit from it (for instance, see pre-commit pipeline declaration here <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR71-R89>); I had to set that down while I finished up implementing an internal CI system since the code in neither the ASF CI structure nor circle structure (.sh embedded in .yml /cry) was re-usable in their current form.
>> 
>> Having a jvm.options and cassandra.yaml file per suite and referencing them from a declarative job definition <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR237-R267> would make things a lot easier to wrap our heads around and maintain I think.
>> 
>> As for what qualifies as burn vs. long... /shrug couldn't tell you. Would have to go down the git blame + dev ML + JIRA rabbit hole. :) Maybe someone else on-list knows.
>> 
>> On Thu, Nov 30, 2023, at 4:25 AM, Jacek Lewandowski wrote:
>>> Hi,
>>> 
>>> I'm getting a bit lost - what are the exact differences between those test scenarios? What are the criteria for qualifying a test to be part of a certain scenario?
>>> 
>>> I'm working a little bit with tests and build scripts and the number of different configurations for which we have a separate target in the build starts to be problematic, I cannot imagine how problematic it is for a new contributor.
>>> 
>>> It is not urgent, but we should at least have a plan on how to simplify and unify things.
>>> 
>>> I'm in favour of reducing the number of test targets to the minimum - for different configurations I think we should provide a parameter pointing to jvm options file and maybe to cassandra.yaml. I know that we currently do some super hacky things with cassandra yaml for different configs - like concatenating parts of it. I presume it is not necessary - we can have a default test config yaml and a directory with overriding yamls; while building we could have a tool which is able to load the default configuration, apply the override and save the resulting yaml somewhere in the build/test/configs for example. That would allow us to easily use those yamls in IDE as well - currently it is impossible.
>>> 
>>> What do you think?
>>> 
>>> Thank you and my apologies for bothering about lower priority stuff while we have a 5.0 release headache...
>>> 
>>> Jacek
>>> 
>> 
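
To make the burn-test shape described above concrete, a hypothetical harness
sketch - a work budget parameter per run, a fresh seed each invocation, and
the seed reported on failure so the run can be reproduced (the invariant
checked here is just a stand-in):

// Each invocation of this runner is one "run"; re-invoking it repeatedly
// with new seeds is how assurance accumulates, until compute or patience
// runs out.
import java.util.Random;

public class BurnTestSketch
{
    public static void main(String[] args)
    {
        long iterations = args.length > 0 ? Long.parseLong(args[0]) : 1_000_000; // work budget per run
        long seed = args.length > 1 ? Long.parseLong(args[1]) : System.nanoTime();
        System.out.println("burn run, seed=" + seed + ", iterations=" + iterations);
        Random rnd = new Random(seed);
        for (long i = 0; i < iterations; i++)
        {
            // exercise the subsystem under test with randomized input;
            // reversing the bits twice must yield the original value
            int input = rnd.nextInt();
            if (Integer.reverse(Integer.reverse(input)) != input)
                throw new AssertionError("failed at iteration " + i + ", seed " + seed);
        }
    }
}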

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Benedict <be...@apache.org>.
A burn test is a randomised test targeting broad coverage of a single system,
subsystem or utility, that may be long-running and that could be run
indefinitely, each run providing incrementally more assurance of quality of
the system.

A long test is a unit test that sometimes takes a long time to run, no more no
less. I’m not sure any of these offer all that much value anymore, and perhaps
we could look to deprecate them.

> On 30 Nov 2023, at 17:20, Josh McKenzie <jm...@apache.org> wrote:
>
> Strongly agree. I started working on a declarative refactor out of our CI
> configuration so circle, ASFCI, and other systems could inherit from it (for
> instance, see pre-commit pipeline declaration here
> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR71-R89>);
> I had to set that down while I finished up implementing an internal CI
> system since the code in neither the ASF CI structure nor circle structure
> (.sh embedded in .yml /cry) was re-usable in their current form.
>
> Having a jvm.options and cassandra.yaml file per suite and referencing them
> from a declarative job definition
> <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR237-R267>
> would make things a lot easier to wrap our heads around and maintain I
> think.
>
> As for what qualifies as burn vs. long... /shrug couldn't tell you. Would
> have to go down the git blame + dev ML + JIRA rabbit hole. :) Maybe someone
> else on-list knows.
>
> On Thu, Nov 30, 2023, at 4:25 AM, Jacek Lewandowski wrote:
>
>> Hi,
>>
>> I'm getting a bit lost - what are the exact differences between those test
>> scenarios? What are the criteria for qualifying a test to be part of a
>> certain scenario?
>>
>> I'm working a little bit with tests and build scripts and the number of
>> different configurations for which we have a separate target in the build
>> starts to be problematic, I cannot imagine how problematic it is for a new
>> contributor.
>>
>> It is not urgent, but we should at least have a plan on how to simplify
>> and unify things.
>>
>> I'm in favour of reducing the number of test targets to the minimum - for
>> different configurations I think we should provide a parameter pointing to
>> jvm options file and maybe to cassandra.yaml. I know that we currently do
>> some super hacky things with cassandra yaml for different configs - like
>> concatenating parts of it. I presume it is not necessary - we can have a
>> default test config yaml and a directory with overriding yamls; while
>> building we could have a tool which is able to load the default
>> configuration, apply the override and save the resulting yaml somewhere in
>> the build/test/configs for example. That would allow us to easily use
>> those yamls in IDE as well - currently it is impossible.
>>
>> What do you think?
>>
>> Thank you and my apologies for bothering about lower priority stuff while
>> we have a 5.0 release headache...
>>
>> Jacek


Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

Posted by Josh McKenzie <jm...@apache.org>.
Strongly agree. I started working on a declarative refactor out of our CI configuration so circle, ASFCI, and other systems could inherit from it (for instance, see pre-commit pipeline declaration here <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR71-R89>); I had to set that down while I finished up implementing an internal CI system since the code in neither the ASF CI structure nor circle structure (.sh embedded in .yml /cry) was re-usable in their current form.

Having a jvm.options and cassandra.yaml file per suite and referencing them from a declarative job definition <https://github.com/apache/cassandra/pull/2554/files#diff-a4c4d1d91048841f76d124386858bda9944644cfef8ccb4ab84319cedaf5b3feR237-R267> would make things a lot easier to wrap our heads around and maintain I think.

As for what qualifies as burn vs. long... /shrug couldn't tell you. Would have to go down the git blame + dev ML + JIRA rabbit hole. :) Maybe someone else on-list knows.

On Thu, Nov 30, 2023, at 4:25 AM, Jacek Lewandowski wrote:
> Hi,
> 
> I'm getting a bit lost - what are the exact differences between those test scenarios? What are the criteria for qualifying a test to be part of a certain scenario?
> 
> I'm working a little bit with tests and build scripts and the number of different configurations for which we have a separate target in the build starts to be problematic, I cannot imagine how problematic it is for a new contributor.
> 
> It is not urgent, but we should at least have a plan on how to simplify and unify things.
> 
> I'm in favour of reducing the number of test targets to the minimum - for different configurations I think we should provide a parameter pointing to jvm options file and maybe to cassandra.yaml. I know that we currently do some super hacky things with cassandra yaml for different configs - like concatenating parts of it. I presume it is not necessary - we can have a default test config yaml and a directory with overriding yamls; while building we could have a tool which is able to load the default configuration, apply the override and save the resulting yaml somewhere in the build/test/configs for example. That would allow us to easily use those yamls in IDE as well - currently it is impossible.
> 
> What do you think?
> 
> Thank you and my apologies for bothering about lower priority stuff while we have a 5.0 release headache...
> 
> Jacek
>