You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cassandra.apache.org by "Miklosovic, Stefan" <St...@netapp.com> on 2022/10/24 09:01:28 UTC

Some tests are never executed in CI due to their name

Hi list,

it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.

Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.

We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.

The first approach would make this happen (implemented here (4))

class name - passes / fails the checkstyle

TestIt - fails
TestsThatFeature - fails
ThisTestsMyFeature - fails
SomeTests - fails

SomeHelperClassForTesting - passes
SomeTest - passes
DistributedTestBase - passes

MyTestSplit1 - passes

We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.

The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these

MyTestSplit1
TestHelper
BurnTestUtil
FuzzTestBase
DistributedTestSnitch
CASCommonTestCases
ServerTestUtils
AuthTestUtils
TestMemtable

While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.

I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.

Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only. 

(1) https://issues.apache.org/jira/browse/CASSANDRA-17964
(2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
(3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
(4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
I meant MyFeatureTest -> DistributedTestBaseImpl  -> DistributedTestBase

TestBaseImpl would be renamed to DistributedTestBaseImpl

DistributedTestBase is in dtest package (separate project / dependency)

________________________________________
From: Miklosovic, Stefan <St...@netapp.com>
Sent: Monday, October 24, 2022 15:26
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

Hi Derek,

yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$

What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.

To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.

Regards

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>
Sent: Monday, October 24, 2022 15:09
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.

When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.

Cheers,

Derek

On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
Hi list,

it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.

Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.

We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.

The first approach would make this happen (implemented here (4))

class name - passes / fails the checkstyle

TestIt - fails
TestsThatFeature - fails
ThisTestsMyFeature - fails
SomeTests - fails

SomeHelperClassForTesting - passes
SomeTest - passes
DistributedTestBase - passes

MyTestSplit1 - passes

We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.

The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these

MyTestSplit1
TestHelper
BurnTestUtil
FuzzTestBase
DistributedTestSnitch
CASCommonTestCases
ServerTestUtils
AuthTestUtils
TestMemtable

While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.

I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.

Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.

(1) https://issues.apache.org/jira/browse/CASSANDRA-17964
(2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
(3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
(4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2


--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+


Re: Some tests are never executed in CI due to their name

Posted by Andrés de la Peña <ad...@apache.org>.
+1 to waiver

On Tue, 15 Nov 2022 at 05:54, Berenguer Blasi <be...@gmail.com>
wrote:

> +1 to waiver
> On 15/11/22 2:07, Josh McKenzie wrote:
>
> +1 to waiver.
>
> We still don't have some kind of @flaky annotation that sequesters tests
> do we? :)
>
> On Mon, Nov 14, 2022, at 5:58 PM, Ekaterina Dimitrova wrote:
>
> +1
>
> On Mon, 14 Nov 2022 at 17:55, Brandon Williams <dr...@gmail.com> wrote:
>
> +1 to waiving these.
>
> On Mon, Nov 14, 2022, 4:49 PM Miklosovic, Stefan <
> Stefan.Miklosovic@netapp.com> wrote:
>
> Tickets for the flaky tests are here
>
> https://issues.apache.org/jira/browse/CASSANDRA-18047
> https://issues.apache.org/jira/browse/CASSANDRA-18048
>
> ________________________________________
> From: Mick Semb Wever <mc...@apache.org>
> Sent: Monday, November 14, 2022 23:28
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
>
> in CASSANDRA-18029, two flaky tests were committed by mistake due to my
> misunderstanding. We agreed on this thread that we should not commit flaky
> tests right before rc. So now the rc is technically blocked by them. To
> unblock it, what is needed is to have a waiver on them. If there is not a
> waiver, I need to go back to that test and remove the two test methods
> which are flaky. (In practice they will be probably just @Ignore-ed with
> comment about flakiness so we can fix them later).
>
> Flaky tests are
>
>
> org.apache.cassandra.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair
>
> org.apache.cassandra.distributed.test.PaxosRepair2Test.legacyPurgeRepairLoop
>
>
> +1 to a waiver on these two 4.1 flaky regressions to the RC and GA
> releases.
>
> Thanks for bringing it back to dev@ Stefan. Waivers should be done on dev@
> (build/release managers can't be keeping up with every ticket), and dev
> threads and tickets should be kept (reasonably) in-sync, for the sake of
> inclusiveness.
>
> I believe there will be follow up tickets to address these flakies in
> 4.1.x ?
>
>
>

Re: Some tests are never executed in CI due to their name

Posted by Berenguer Blasi <be...@gmail.com>.
+1 to waiver

On 15/11/22 2:07, Josh McKenzie wrote:
> +1 to waiver.
>
> We still don't have some kind of @flaky annotation that sequesters 
> tests do we? :)
>
> On Mon, Nov 14, 2022, at 5:58 PM, Ekaterina Dimitrova wrote:
>> +1
>>
>> On Mon, 14 Nov 2022 at 17:55, Brandon Williams <dr...@gmail.com> wrote:
>>
>>     +1 to waiving these.
>>
>>     On Mon, Nov 14, 2022, 4:49 PM Miklosovic, Stefan
>>     <St...@netapp.com> wrote:
>>
>>         Tickets for the flaky tests are here
>>
>>         https://issues.apache.org/jira/browse/CASSANDRA-18047
>>         https://issues.apache.org/jira/browse/CASSANDRA-18048
>>
>>         ________________________________________
>>         From: Mick Semb Wever <mc...@apache.org>
>>         Sent: Monday, November 14, 2022 23:28
>>         To: dev@cassandra.apache.org
>>         Subject: Re: Some tests are never executed in CI due to their
>>         name
>>
>>         NetApp Security WARNING: This is an external email. Do not
>>         click links or open attachments unless you recognize the
>>         sender and know the content is safe.
>>
>>
>>
>>
>>         in CASSANDRA-18029, two flaky tests were committed by mistake
>>         due to my misunderstanding. We agreed on this thread that we
>>         should not commit flaky tests right before rc. So now the rc
>>         is technically blocked by them. To unblock it, what is needed
>>         is to have a waiver on them. If there is not a waiver, I need
>>         to go back to that test and remove the two test methods which
>>         are flaky. (In practice they will be probably just @Ignore-ed
>>         with comment about flakiness so we can fix them later).
>>
>>         Flaky tests are
>>
>>         org.apache.cassandra.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair
>>         org.apache.cassandra.distributed.test.PaxosRepair2Test.legacyPurgeRepairLoop
>>
>>
>>         +1 to a waiver on these two 4.1 flaky regressions to the RC
>>         and GA releases.
>>
>>         Thanks for bringing it back to dev@ Stefan. Waivers should be
>>         done on dev@ (build/release managers can't be keeping up with
>>         every ticket), and dev threads and tickets should be kept
>>         (reasonably) in-sync, for the sake of inclusiveness.
>>
>>         I believe there will be follow up tickets to address these
>>         flakies in 4.1.x ?
>>
>

Re: Some tests are never executed in CI due to their name

Posted by Josh McKenzie <jm...@apache.org>.
+1 to waiver.

We still don't have some kind of @flaky annotation that sequesters tests do we? :)

On Mon, Nov 14, 2022, at 5:58 PM, Ekaterina Dimitrova wrote:
> +1
> 
> On Mon, 14 Nov 2022 at 17:55, Brandon Williams <dr...@gmail.com> wrote:
>> +1 to waiving these.
>> 
>> On Mon, Nov 14, 2022, 4:49 PM Miklosovic, Stefan <St...@netapp.com> wrote:
>>> Tickets for the flaky tests are here
>>> 
>>> https://issues.apache.org/jira/browse/CASSANDRA-18047
>>> https://issues.apache.org/jira/browse/CASSANDRA-18048
>>> 
>>> ________________________________________
>>> From: Mick Semb Wever <mc...@apache.org>
>>> Sent: Monday, November 14, 2022 23:28
>>> To: dev@cassandra.apache.org
>>> Subject: Re: Some tests are never executed in CI due to their name
>>> 
>>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>> 
>>> 
>>> 
>>> 
>>> in CASSANDRA-18029, two flaky tests were committed by mistake due to my misunderstanding. We agreed on this thread that we should not commit flaky tests right before rc. So now the rc is technically blocked by them. To unblock it, what is needed is to have a waiver on them. If there is not a waiver, I need to go back to that test and remove the two test methods which are flaky. (In practice they will be probably just @Ignore-ed with comment about flakiness so we can fix them later).
>>> 
>>> Flaky tests are
>>> 
>>> org.apache.cassandra.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair
>>> org.apache.cassandra.distributed.test.PaxosRepair2Test.legacyPurgeRepairLoop
>>> 
>>> 
>>> +1 to a waiver on these two 4.1 flaky regressions to the RC and GA releases.
>>> 
>>> Thanks for bringing it back to dev@ Stefan. Waivers should be done on dev@ (build/release managers can't be keeping up with every ticket), and dev threads and tickets should be kept (reasonably) in-sync, for the sake of inclusiveness.
>>> 
>>> I believe there will be follow up tickets to address these flakies in 4.1.x ?

Re: Some tests are never executed in CI due to their name

Posted by Ekaterina Dimitrova <e....@gmail.com>.
+1

On Mon, 14 Nov 2022 at 17:55, Brandon Williams <dr...@gmail.com> wrote:

> +1 to waiving these.
>
> On Mon, Nov 14, 2022, 4:49 PM Miklosovic, Stefan <
> Stefan.Miklosovic@netapp.com> wrote:
>
>> Tickets for the flaky tests are here
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-18047
>> https://issues.apache.org/jira/browse/CASSANDRA-18048
>>
>> ________________________________________
>> From: Mick Semb Wever <mc...@apache.org>
>> Sent: Monday, November 14, 2022 23:28
>> To: dev@cassandra.apache.org
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or
>> open attachments unless you recognize the sender and know the content is
>> safe.
>>
>>
>>
>>
>> in CASSANDRA-18029, two flaky tests were committed by mistake due to my
>> misunderstanding. We agreed on this thread that we should not commit flaky
>> tests right before rc. So now the rc is technically blocked by them. To
>> unblock it, what is needed is to have a waiver on them. If there is not a
>> waiver, I need to go back to that test and remove the two test methods
>> which are flaky. (In practice they will be probably just @Ignore-ed with
>> comment about flakiness so we can fix them later).
>>
>> Flaky tests are
>>
>>
>> org.apache.cassandra.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair
>>
>> org.apache.cassandra.distributed.test.PaxosRepair2Test.legacyPurgeRepairLoop
>>
>>
>> +1 to a waiver on these two 4.1 flaky regressions to the RC and GA
>> releases.
>>
>> Thanks for bringing it back to dev@ Stefan. Waivers should be done on
>> dev@ (build/release managers can't be keeping up with every ticket), and
>> dev threads and tickets should be kept (reasonably) in-sync, for the sake
>> of inclusiveness.
>>
>> I believe there will be follow up tickets to address these flakies in
>> 4.1.x ?
>>
>

Re: Some tests are never executed in CI due to their name

Posted by Brandon Williams <dr...@gmail.com>.
+1 to waiving these.

On Mon, Nov 14, 2022, 4:49 PM Miklosovic, Stefan <
Stefan.Miklosovic@netapp.com> wrote:

> Tickets for the flaky tests are here
>
> https://issues.apache.org/jira/browse/CASSANDRA-18047
> https://issues.apache.org/jira/browse/CASSANDRA-18048
>
> ________________________________________
> From: Mick Semb Wever <mc...@apache.org>
> Sent: Monday, November 14, 2022 23:28
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
>
> in CASSANDRA-18029, two flaky tests were committed by mistake due to my
> misunderstanding. We agreed on this thread that we should not commit flaky
> tests right before rc. So now the rc is technically blocked by them. To
> unblock it, what is needed is to have a waiver on them. If there is not a
> waiver, I need to go back to that test and remove the two test methods
> which are flaky. (In practice they will be probably just @Ignore-ed with
> comment about flakiness so we can fix them later).
>
> Flaky tests are
>
>
> org.apache.cassandra.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair
>
> org.apache.cassandra.distributed.test.PaxosRepair2Test.legacyPurgeRepairLoop
>
>
> +1 to a waiver on these two 4.1 flaky regressions to the RC and GA
> releases.
>
> Thanks for bringing it back to dev@ Stefan. Waivers should be done on dev@
> (build/release managers can't be keeping up with every ticket), and dev
> threads and tickets should be kept (reasonably) in-sync, for the sake of
> inclusiveness.
>
> I believe there will be follow up tickets to address these flakies in
> 4.1.x ?
>

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Tickets for the flaky tests are here

https://issues.apache.org/jira/browse/CASSANDRA-18047
https://issues.apache.org/jira/browse/CASSANDRA-18048

________________________________________
From: Mick Semb Wever <mc...@apache.org>
Sent: Monday, November 14, 2022 23:28
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




in CASSANDRA-18029, two flaky tests were committed by mistake due to my misunderstanding. We agreed on this thread that we should not commit flaky tests right before rc. So now the rc is technically blocked by them. To unblock it, what is needed is to have a waiver on them. If there is not a waiver, I need to go back to that test and remove the two test methods which are flaky. (In practice they will be probably just @Ignore-ed with comment about flakiness so we can fix them later).

Flaky tests are

org.apache.cassandra.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair
org.apache.cassandra.distributed.test.PaxosRepair2Test.legacyPurgeRepairLoop


+1 to a waiver on these two 4.1 flaky regressions to the RC and GA releases.

Thanks for bringing it back to dev@ Stefan. Waivers should be done on dev@ (build/release managers can't be keeping up with every ticket), and dev threads and tickets should be kept (reasonably) in-sync, for the sake of inclusiveness.

I believe there will be follow up tickets to address these flakies in 4.1.x ?

Re: Some tests are never executed in CI due to their name

Posted by Mick Semb Wever <mc...@apache.org>.
> in CASSANDRA-18029, two flaky tests were committed by mistake due to my
> misunderstanding. We agreed on this thread that we should not commit flaky
> tests right before rc. So now the rc is technically blocked by them. To
> unblock it, what is needed is to have a waiver on them. If there is not a
> waiver, I need to go back to that test and remove the two test methods
> which are flaky. (In practice they will be probably just @Ignore-ed with
> comment about flakiness so we can fix them later).
>
> Flaky tests are
>
>
> org.apache.cassandra.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair
>
> org.apache.cassandra.distributed.test.PaxosRepair2Test.legacyPurgeRepairLoop
>


+1 to a waiver on these two 4.1 flaky regressions to the RC and GA releases.

Thanks for bringing it back to dev@ Stefan. Waivers should be done on dev@
(build/release managers can't be keeping up with every ticket), and dev
threads and tickets should be kept (reasonably) in-sync, for the sake of
inclusiveness.

I believe there will be follow up tickets to address these flakies in
4.1.x ?

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Hi list,

in CASSANDRA-18029, two flaky tests were committed by mistake due to my misunderstanding. We agreed on this thread that we should not commit flaky tests right before rc. So now the rc is technically blocked by them. To unblock it, what is needed is to have a waiver on them. If there is not a waiver, I need to go back to that test and remove the two test methods which are flaky. (In practice they will be probably just @Ignore-ed with comment about flakiness so we can fix them later).

Flaky tests are

org.apache.cassandra.distributed.test.PaxosRepair2Test.paxosRepairHistoryIsntUpdatedInForcedRepair
org.apache.cassandra.distributed.test.PaxosRepair2Test.legacyPurgeRepairLoop

I sincerely apologize for messing things up and being drifted away by the ticket and forgetting about the ML thread which has the ultimate precedence in this matter. This should have been ideally requested before the commit and it was just an oversight about keeping both ticket and ML communication in sync.

Regards

________________________________________
From: Brandon Williams <dr...@gmail.com>
Sent: Thursday, November 10, 2022 13:20
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




+1 to treating as new unvetted tests.

Kind Regards,
Brandon

On Thu, Nov 10, 2022 at 5:41 AM Mick Semb Wever <mc...@apache.org> wrote:
>
>
>
> On Thu, 10 Nov 2022 at 10:18, Miklosovic, Stefan <St...@netapp.com> wrote:
>>
>> I want to ask if people are comfortable with merging CASSANDRA-17964 before 4.1 is out. We clean a lot of things and rename tests which we should take care of. (like  PaxosRepairTest2).
>>
>> We might still fix it without merging 17964 as we know what should be fixed and we can wait until 4.1 is out but I personally do not see any reason for this.
>
>
>
> Unless these are runtime failures that will hurt users running 4.1, let's treat them as new incomplete tests. So please let's not add new broken tests right before the rc.

Re: Some tests are never executed in CI due to their name

Posted by Brandon Williams <dr...@gmail.com>.
+1 to treating as new unvetted tests.

Kind Regards,
Brandon

On Thu, Nov 10, 2022 at 5:41 AM Mick Semb Wever <mc...@apache.org> wrote:
>
>
>
> On Thu, 10 Nov 2022 at 10:18, Miklosovic, Stefan <St...@netapp.com> wrote:
>>
>> I want to ask if people are comfortable with merging CASSANDRA-17964 before 4.1 is out. We clean a lot of things and rename tests which we should take care of. (like  PaxosRepairTest2).
>>
>> We might still fix it without merging 17964 as we know what should be fixed and we can wait until 4.1 is out but I personally do not see any reason for this.
>
>
>
> Unless these are runtime failures that will hurt users running 4.1, let's treat them as new incomplete tests. So please let's not add new broken tests right before the rc.

Re: Some tests are never executed in CI due to their name

Posted by Mick Semb Wever <mc...@apache.org>.
On Thu, 10 Nov 2022 at 10:18, Miklosovic, Stefan <
Stefan.Miklosovic@netapp.com> wrote:

> I want to ask if people are comfortable with merging CASSANDRA-17964
> before 4.1 is out. We clean a lot of things and rename tests which we
> should take care of. (like  PaxosRepairTest2).
>
> We might still fix it without merging 17964 as we know what should be
> fixed and we can wait until 4.1 is out but I personally do not see any
> reason for this.
>


Unless these are runtime failures that will hurt users running 4.1, let's
treat them as new incomplete tests. So please let's not add new broken
tests right before the rc.

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
I want to ask if people are comfortable with merging CASSANDRA-17964 before 4.1 is out. We clean a lot of things and rename tests which we should take care of. (like  PaxosRepairTest2).

We might still fix it without merging 17964 as we know what should be fixed and we can wait until 4.1 is out but I personally do not see any reason for this.

Thoughts?

________________________________________
From: Miklosovic, Stefan <St...@netapp.com>
Sent: Monday, November 7, 2022 23:23
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

Hi,

A while ago we came up with a very nice solution (thank you David Capwell for suggesting this). Sorry for no updating this thread sooner, a lot of discussion was done in the ticket. (CASSANDRA-17964).

The solution we think is the best is to have an Ant task which would scan all classes via reflection on build-test target (separate / dedicated target will be available too) and it will scan files on containing Test annotation. If that annotation is present, it means that class is a test and it has to have so and so naming convention (currently it has to end on Test). There are details covered as well (for example, cases when there is an abstract class containing all @Test methods and a concrete test class is just extending that, all is covered too).

The reason for this approach is that:

1) it is way more robust that trying to come up with some super regexp
2) We can put arbitrary logic into it, for example, we may ignore the invalid name of a test class if we "know what we are doing" but putting @IgnoreInvalidTestName annotation on top of that class and Ant task would just skip that.
3) We can introduce this task into all branches we support. Checkstyle is present only in 4.1 and trunk so we would have kind of half-baked solution here and older branches would not be covered.

As we enabled more tests because they were ignored / effectively skipped previously, some enabled tests started to fail. There are two (currently know):

(1) This one started to fail on 4.0, not sure whats the deal for older branches, it probably needs deeper look
(2) PaxorRepairTest2 - this one fails to start because of configuration in that test but even it is fixed it fails anyway on something else. I will create a ticket for that soon.

I will reference this email to Cassandra update status to raise awareness that some tests we enabled started to fail in 4.0 -> trunk.

1) https://issues.apache.org/jira/browse/CASSANDRA-18021
2) ticket to be created

Thanks

________________________________________
From: Berenguer Blasi <be...@gmail.com>
Sent: Thursday, November 3, 2022 9:47
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Re '@Test' without 'abstract': There's a corner case of a base abstract class with all test cases. Then 2 or 3 inheriting classes with no '@Test' annotations bc they just run the base class with different configs, setups, envs, parametrized runs, etc I've seen some of those.

On 1/11/22 22:54, Mick Semb Wever wrote:


Aha wait a minute I got it :D Yeah. This is not what you invented, it is already in, sorry.

So, you are saying we should just modify this to cover @Test / abstract.

I dont know ... interesting.

Lets wait for others to answer.


This thread is rather painful to digest because of the top-posting :'(

I agree that all non-abstract classes that contain a @test annotation must have a filename ending in "Test.java"

That leaves CI scripts untouched.

How we enforce that I have no opinion about. If it's not easy in checkstyle, just do it with some bash in the cassandra-test.sh script. It'll get caught pre-commit either way, and once the rename of existing files is done the precedence is clearly in place.

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Hi,

A while ago we came up with a very nice solution (thank you David Capwell for suggesting this). Sorry for no updating this thread sooner, a lot of discussion was done in the ticket. (CASSANDRA-17964).

The solution we think is the best is to have an Ant task which would scan all classes via reflection on build-test target (separate / dedicated target will be available too) and it will scan files on containing Test annotation. If that annotation is present, it means that class is a test and it has to have so and so naming convention (currently it has to end on Test). There are details covered as well (for example, cases when there is an abstract class containing all @Test methods and a concrete test class is just extending that, all is covered too).

The reason for this approach is that:

1) it is way more robust that trying to come up with some super regexp
2) We can put arbitrary logic into it, for example, we may ignore the invalid name of a test class if we "know what we are doing" but putting @IgnoreInvalidTestName annotation on top of that class and Ant task would just skip that.
3) We can introduce this task into all branches we support. Checkstyle is present only in 4.1 and trunk so we would have kind of half-baked solution here and older branches would not be covered.

As we enabled more tests because they were ignored / effectively skipped previously, some enabled tests started to fail. There are two (currently know):

(1) This one started to fail on 4.0, not sure whats the deal for older branches, it probably needs deeper look
(2) PaxorRepairTest2 - this one fails to start because of configuration in that test but even it is fixed it fails anyway on something else. I will create a ticket for that soon.

I will reference this email to Cassandra update status to raise awareness that some tests we enabled started to fail in 4.0 -> trunk.

1) https://issues.apache.org/jira/browse/CASSANDRA-18021
2) ticket to be created

Thanks

________________________________________
From: Berenguer Blasi <be...@gmail.com>
Sent: Thursday, November 3, 2022 9:47
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Re '@Test' without 'abstract': There's a corner case of a base abstract class with all test cases. Then 2 or 3 inheriting classes with no '@Test' annotations bc they just run the base class with different configs, setups, envs, parametrized runs, etc I've seen some of those.

On 1/11/22 22:54, Mick Semb Wever wrote:


Aha wait a minute I got it :D Yeah. This is not what you invented, it is already in, sorry.

So, you are saying we should just modify this to cover @Test / abstract.

I dont know ... interesting.

Lets wait for others to answer.


This thread is rather painful to digest because of the top-posting :'(

I agree that all non-abstract classes that contain a @test annotation must have a filename ending in "Test.java"

That leaves CI scripts untouched.

How we enforce that I have no opinion about. If it's not easy in checkstyle, just do it with some bash in the cassandra-test.sh script. It'll get caught pre-commit either way, and once the rename of existing files is done the precedence is clearly in place.

Re: Some tests are never executed in CI due to their name

Posted by Berenguer Blasi <be...@gmail.com>.
Re '@Test' without 'abstract': There's a corner case of a base abstract 
class with all test cases. Then 2 or 3 inheriting classes with no 
'@Test' annotations bc they just run the base class with different 
configs, setups, envs, parametrized runs, etc I've seen some of those.

On 1/11/22 22:54, Mick Semb Wever wrote:
>
>
>     Aha wait a minute I got it :D Yeah. This is not what you invented,
>     it is already in, sorry.
>
>     So, you are saying we should just modify this to cover @Test /
>     abstract.
>
>     I dont know ... interesting.
>
>     Lets wait for others to answer.
>
>
>
> This thread is rather painful to digest because of the top-posting :'(
>
> I agree that all non-abstract classes that contain a @test annotation 
> must have a filename ending in "Test.java"
>
> That leaves CI scripts untouched.
>
> How we enforce that I have no opinion about. If it's not easy in 
> checkstyle, just do it with some bash in the cassandra-test.sh script. 
> It'll get caught pre-commit either way, and once the rename of 
> existing files is done the precedence is clearly in place.

Re: Some tests are never executed in CI due to their name

Posted by Mick Semb Wever <mc...@apache.org>.
Aha wait a minute I got it :D Yeah. This is not what you invented, it is
> already in, sorry.
>
> So, you are saying we should just modify this to cover @Test / abstract.
>
> I dont know ... interesting.
>
> Lets wait for others to answer.
>


This thread is rather painful to digest because of the top-posting :'(

I agree that all non-abstract classes that contain a @test annotation must
have a filename ending in "Test.java"

That leaves CI scripts untouched.

How we enforce that I have no opinion about. If it's not easy in
checkstyle, just do it with some bash in the cassandra-test.sh script.
It'll get caught pre-commit either way, and once the rename of existing
files is done the precedence is clearly in place.

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Aha wait a minute I got it :D Yeah. This is not what you invented, it is already in, sorry.

So, you are saying we should just modify this to cover @Test / abstract.

I dont know ... interesting.

Lets wait for others to answer.

________________________________________
From: Josh McKenzie <jm...@apache.org>
Sent: Monday, October 31, 2022 20:36
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



I just feel we need to plumb this on quite a low level to achieve this. Why not to put a 5 line checkstyle in place, rename few classes and move on.
https://github.com/apache/cassandra-builds/blob/trunk/build-scripts/cassandra-test.sh#L17-L21

# lists all tests for the specific test type
_list_tests() {
    local -r classlistprefix="$1"
    find "test/$classlistprefix" -name '*Test.java' | sed "s;^test/$classlistprefix/;;g" | sort
}

Shouldn't be that painful should it?

On Mon, Oct 31, 2022, at 3:21 PM, Miklosovic, Stefan wrote:
>> It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.

I just don't see where you want to "hook" this step exactly either in the CI run or in build.xml or what have you. My wild guess is that this translates to some kind of a bash script, right (checking if it has @Test and not abstract) ? And then what? So we have test.name property, so your idea is to somehow get the list of classes and then you want to feed it to test.name property? That property would be very long if we are going to enumerate whole test suite with package names as well, not super user friendly if you ask me. I just feel we need to plumb this on quite a low level to achieve this. Why not to put a 5 line checkstyle in place, rename few classes and move on.

I would still love to have file names somehow consistent so they are also visually easily recognizable (what is test and what not) in IDE or similar.

Also, I would like to hear opinions of others when it comes to this second idea.

________________________________________
From: Josh McKenzie <jm...@apache.org>>
Sent: Monday, October 31, 2022 19:25
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



So to be clear, I don't really feel super strongly on this, I just thought Derek's suggestion seemed clean and elegant. Checkstyle to force a certain naming convention on things is fine enough; I certainly wouldn't stand in the way. Tooling like that self-educates users (see: Guardrails) so it's pretty low friction to the community IMO.

Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.
It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.

how to actually be sure that a developer knows that his test not only passed but it was selected to be run
Andres' changes to .circleci/generate.sh mean you're going to see a _repeat run of whatever tests you change or add. I think both approaches (heuristic to determine if a file contains tests to run vs. checkstyle enforced naming convention) are more or less comparable here.

On Mon, Oct 31, 2022, at 12:26 PM, Miklosovic, Stefan wrote:
I think we are already adhering to them without any effort. Overwhelming majority of devs are naming their tests "just right" and I do not think they are doing that with any significant effort, probably none. We are only re-enforcing the unspoken everybody implicitly agreed on. If somebody does not name their tests like the rest, I think it would raise the red flags already during a review.

Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.

I think the crucial question we should ask is how to actually be sure that a developer knows that his test not only passed but it was selected to be run (now based on *Test). How did these tests which skipped as of now end up there in the first place? Somebody just committed them, run the CI and "it was green" so they proceed with the commit. But they did not check it was actually run because "why not?".

So, we are putting a kind of detector in place to signal that a test is not going to be actually even run when the checkstyle fails.

________________________________________
From: Josh McKenzie <jm...@apache.org>>>
Sent: Monday, October 31, 2022 17:16
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class
I mean, I wouldn't recommend it certainly, but I'm not concerned about whether a file is "TestThing" or "ThingTest" or "WeTestThing". The ordering of names on those files adds very little IMO.

I do not think we are actually winning anything with that approach.
Well, the name of this thread is "Some tests are never executed in CI due to their name" and it fixes that in a robust way without asking of any behavior changes of the project community to adhere to an arbitrary rigid naming guideline for something. So there's that. :)

On Mon, Oct 31, 2022, at 8:02 AM, Miklosovic, Stefan wrote:
I always hit "send" before realizing I have other points to add :)

I was thinking we want to go with checkstyle and check the name. Not manually parsing the file names. Do not we also want these test files to look consistent? I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class. Do not we also want to enforce some regularity into this?

Another point is that even we parsed we so it will be picked up while tests are executed, that means that it will not be checked on the commit?

Lastly, how do you actually want to do this if we currently have test.name property which is a simple regexp (*Test)? -Dtest.name is pretty much how we do this currently and I am not sure we want to break this in all CI scripts.

I do not think we are actually winning anything with that approach.

Regards

________________________________________
From: Josh McKenzie <jm...@apache.org>>>>
Sent: Monday, October 31, 2022 12:49
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



To Derek's point, why not go with:

Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
+ exclude files that contain "abstract class" in them that matches the file name?

Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.

It's also very simple which is a feature.

On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
Hi Berenguer,

I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.

FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.

Regards

Stefan

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>>>
Sent: Friday, October 28, 2022 9:43
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Hi Stefan,

seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.

wdyt?

On 25/10/22 14:35, Berenguer Blasi wrote:

Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.

What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.

So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).

My 2cts! But hey, I've been known to be wrong before :shrug: lol

On 25/10/22 12:04, Andrés de la Peña wrote:
Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.

On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
Hi Berenguer,

I am glad you asked. I was expecting this question.

I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.

Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.

You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.

I believe we can just navigate him when that happens. You want to cover that guy too.

I just find my approach easier.

You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>>>>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>>>>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ??? for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ? ?
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>>>>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>>>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>









Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Finish it all to the completion be if you don't mind. To see it all in action. Again, I don't know what to do with this function, sorry. Why is it in builds? So standalone build.xml and ant target would be without this or we would need to duplicate?

I am by no mean against this exploration, I would like to see it all playing together, that is all.

________________________________________
From: Josh McKenzie <jm...@apache.org>
Sent: Monday, October 31, 2022 20:36
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



I just feel we need to plumb this on quite a low level to achieve this. Why not to put a 5 line checkstyle in place, rename few classes and move on.
https://github.com/apache/cassandra-builds/blob/trunk/build-scripts/cassandra-test.sh#L17-L21

# lists all tests for the specific test type
_list_tests() {
    local -r classlistprefix="$1"
    find "test/$classlistprefix" -name '*Test.java' | sed "s;^test/$classlistprefix/;;g" | sort
}

Shouldn't be that painful should it?

On Mon, Oct 31, 2022, at 3:21 PM, Miklosovic, Stefan wrote:
>> It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.

I just don't see where you want to "hook" this step exactly either in the CI run or in build.xml or what have you. My wild guess is that this translates to some kind of a bash script, right (checking if it has @Test and not abstract) ? And then what? So we have test.name property, so your idea is to somehow get the list of classes and then you want to feed it to test.name property? That property would be very long if we are going to enumerate whole test suite with package names as well, not super user friendly if you ask me. I just feel we need to plumb this on quite a low level to achieve this. Why not to put a 5 line checkstyle in place, rename few classes and move on.

I would still love to have file names somehow consistent so they are also visually easily recognizable (what is test and what not) in IDE or similar.

Also, I would like to hear opinions of others when it comes to this second idea.

________________________________________
From: Josh McKenzie <jm...@apache.org>>
Sent: Monday, October 31, 2022 19:25
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



So to be clear, I don't really feel super strongly on this, I just thought Derek's suggestion seemed clean and elegant. Checkstyle to force a certain naming convention on things is fine enough; I certainly wouldn't stand in the way. Tooling like that self-educates users (see: Guardrails) so it's pretty low friction to the community IMO.

Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.
It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.

how to actually be sure that a developer knows that his test not only passed but it was selected to be run
Andres' changes to .circleci/generate.sh mean you're going to see a _repeat run of whatever tests you change or add. I think both approaches (heuristic to determine if a file contains tests to run vs. checkstyle enforced naming convention) are more or less comparable here.

On Mon, Oct 31, 2022, at 12:26 PM, Miklosovic, Stefan wrote:
I think we are already adhering to them without any effort. Overwhelming majority of devs are naming their tests "just right" and I do not think they are doing that with any significant effort, probably none. We are only re-enforcing the unspoken everybody implicitly agreed on. If somebody does not name their tests like the rest, I think it would raise the red flags already during a review.

Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.

I think the crucial question we should ask is how to actually be sure that a developer knows that his test not only passed but it was selected to be run (now based on *Test). How did these tests which skipped as of now end up there in the first place? Somebody just committed them, run the CI and "it was green" so they proceed with the commit. But they did not check it was actually run because "why not?".

So, we are putting a kind of detector in place to signal that a test is not going to be actually even run when the checkstyle fails.

________________________________________
From: Josh McKenzie <jm...@apache.org>>>
Sent: Monday, October 31, 2022 17:16
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class
I mean, I wouldn't recommend it certainly, but I'm not concerned about whether a file is "TestThing" or "ThingTest" or "WeTestThing". The ordering of names on those files adds very little IMO.

I do not think we are actually winning anything with that approach.
Well, the name of this thread is "Some tests are never executed in CI due to their name" and it fixes that in a robust way without asking of any behavior changes of the project community to adhere to an arbitrary rigid naming guideline for something. So there's that. :)

On Mon, Oct 31, 2022, at 8:02 AM, Miklosovic, Stefan wrote:
I always hit "send" before realizing I have other points to add :)

I was thinking we want to go with checkstyle and check the name. Not manually parsing the file names. Do not we also want these test files to look consistent? I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class. Do not we also want to enforce some regularity into this?

Another point is that even we parsed we so it will be picked up while tests are executed, that means that it will not be checked on the commit?

Lastly, how do you actually want to do this if we currently have test.name property which is a simple regexp (*Test)? -Dtest.name is pretty much how we do this currently and I am not sure we want to break this in all CI scripts.

I do not think we are actually winning anything with that approach.

Regards

________________________________________
From: Josh McKenzie <jm...@apache.org>>>>
Sent: Monday, October 31, 2022 12:49
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



To Derek's point, why not go with:

Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
+ exclude files that contain "abstract class" in them that matches the file name?

Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.

It's also very simple which is a feature.

On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
Hi Berenguer,

I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.

FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.

Regards

Stefan

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>>>
Sent: Friday, October 28, 2022 9:43
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Hi Stefan,

seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.

wdyt?

On 25/10/22 14:35, Berenguer Blasi wrote:

Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.

What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.

So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).

My 2cts! But hey, I've been known to be wrong before :shrug: lol

On 25/10/22 12:04, Andrés de la Peña wrote:
Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.

On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
Hi Berenguer,

I am glad you asked. I was expecting this question.

I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.

Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.

You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.

I believe we can just navigate him when that happens. You want to cover that guy too.

I just find my approach easier.

You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>>>>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>>>>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ??? for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ? ?
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>>>>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>>>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>









Re: Some tests are never executed in CI due to their name

Posted by Josh McKenzie <jm...@apache.org>.
> I just feel we need to plumb this on quite a low level to achieve this. Why not to put a 5 line checkstyle in place, rename few classes and move on.
https://github.com/apache/cassandra-builds/blob/trunk/build-scripts/cassandra-test.sh#L17-L21

# lists all tests for the specific test type
_list_tests() {
    local -r classlistprefix="$1"
    find "test/$classlistprefix" -name '*Test.java' | sed "s;^test/$classlistprefix/;;g" | sort
}

Shouldn't be *that* painful should it?

On Mon, Oct 31, 2022, at 3:21 PM, Miklosovic, Stefan wrote:
> >> It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.
> 
> I just don't see where you want to "hook" this step exactly either in the CI run or in build.xml or what have you. My wild guess is that this translates to some kind of a bash script, right (checking if it has @Test and not abstract) ? And then what? So we have test.name property, so your idea is to somehow get the list of classes and then you want to feed it to test.name property? That property would be very long if we are going to enumerate whole test suite with package names as well, not super user friendly if you ask me. I just feel we need to plumb this on quite a low level to achieve this. Why not to put a 5 line checkstyle in place, rename few classes and move on.
> 
> I would still love to have file names somehow consistent so they are also visually easily recognizable (what is test and what not) in IDE or similar.
> 
> Also, I would like to hear opinions of others when it comes to this second idea.
> 
> ________________________________________
> From: Josh McKenzie <jm...@apache.org>
> Sent: Monday, October 31, 2022 19:25
> To: dev
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> So to be clear, I don't really feel super strongly on this, I just thought Derek's suggestion seemed clean and elegant. Checkstyle to force a certain naming convention on things is fine enough; I certainly wouldn't stand in the way. Tooling like that self-educates users (see: Guardrails) so it's pretty low friction to the community IMO.
> 
> Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.
> It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.
> 
> how to actually be sure that a developer knows that his test not only passed but it was selected to be run
> Andres' changes to .circleci/generate.sh mean you're going to see a _repeat run of whatever tests you change or add. I think both approaches (heuristic to determine if a file contains tests to run vs. checkstyle enforced naming convention) are more or less comparable here.
> 
> On Mon, Oct 31, 2022, at 12:26 PM, Miklosovic, Stefan wrote:
> I think we are already adhering to them without any effort. Overwhelming majority of devs are naming their tests "just right" and I do not think they are doing that with any significant effort, probably none. We are only re-enforcing the unspoken everybody implicitly agreed on. If somebody does not name their tests like the rest, I think it would raise the red flags already during a review.
> 
> Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.
> 
> I think the crucial question we should ask is how to actually be sure that a developer knows that his test not only passed but it was selected to be run (now based on *Test). How did these tests which skipped as of now end up there in the first place? Somebody just committed them, run the CI and "it was green" so they proceed with the commit. But they did not check it was actually run because "why not?".
> 
> So, we are putting a kind of detector in place to signal that a test is not going to be actually even run when the checkstyle fails.
> 
> ________________________________________
> From: Josh McKenzie <jm...@apache.org>>
> Sent: Monday, October 31, 2022 17:16
> To: dev
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class
> I mean, I wouldn't recommend it certainly, but I'm not concerned about whether a file is "TestThing" or "ThingTest" or "WeTestThing". The ordering of names on those files adds very little IMO.
> 
> I do not think we are actually winning anything with that approach.
> Well, the name of this thread is "Some tests are never executed in CI due to their name" and it fixes that in a robust way without asking of any behavior changes of the project community to adhere to an arbitrary rigid naming guideline for something. So there's that. :)
> 
> On Mon, Oct 31, 2022, at 8:02 AM, Miklosovic, Stefan wrote:
> I always hit "send" before realizing I have other points to add :)
> 
> I was thinking we want to go with checkstyle and check the name. Not manually parsing the file names. Do not we also want these test files to look consistent? I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class. Do not we also want to enforce some regularity into this?
> 
> Another point is that even we parsed we so it will be picked up while tests are executed, that means that it will not be checked on the commit?
> 
> Lastly, how do you actually want to do this if we currently have test.name property which is a simple regexp (*Test)? -Dtest.name is pretty much how we do this currently and I am not sure we want to break this in all CI scripts.
> 
> I do not think we are actually winning anything with that approach.
> 
> Regards
> 
> ________________________________________
> From: Josh McKenzie <jm...@apache.org>>>
> Sent: Monday, October 31, 2022 12:49
> To: dev
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> To Derek's point, why not go with:
> 
> Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
> + exclude files that contain "abstract class" in them that matches the file name?
> 
> Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.
> 
> It's also very simple which is a feature.
> 
> On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
> Hi Berenguer,
> 
> I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.
> 
> FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.
> 
> Regards
> 
> Stefan
> 
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>>
> Sent: Friday, October 28, 2022 9:43
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> 
> Hi Stefan,
> 
> seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.
> 
> wdyt?
> 
> On 25/10/22 14:35, Berenguer Blasi wrote:
> 
> Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.
> 
> What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.
> 
> So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).
> 
> My 2cts! But hey, I've been known to be wrong before :shrug: lol
> 
> On 25/10/22 12:04, Andrés de la Peña wrote:
> Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.
> 
> On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
> Hi Berenguer,
> 
> I am glad you asked. I was expecting this question.
> 
> I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.
> 
> Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.
> 
> You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.
> 
> I believe we can just navigate him when that happens. You want to cover that guy too.
> 
> I just find my approach easier.
> 
> You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.
> 
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>>>
> Sent: Tuesday, October 25, 2022 11:08
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> 
> IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)
> 
> On 25/10/22 10:39, Miklosovic, Stefan wrote:
> > I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
> >
> > It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
> >
> > ________________________________________
> > From: Berenguer Blasi <be...@gmail.com>>>>>
> > Sent: Tuesday, October 25, 2022 7:26
> > To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
> > Subject: Re: Some tests are never executed in CI due to their name
> >
> > NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> >
> >
> > The problem with using the first approach is that it fixes the current
> > situation but it doesn't prevent it from happening again. The second
> > proposal prevents it from happening again but at the cost of a bigger
> > rename I'd volunteer to if needed.
> >
> > Regards
> >
> > On 24/10/22 20:38, Miklosovic, Stefan wrote:
> >> Yeah, that is what the branch in my original email actually already solved. I mean ...
> >>
> >> CassandraAuthorizerTruncatingTests
> >> BatchTests
> >> UUIDTests
> >>
> >> these are ending on Tests which is illegal and they are fixed there.
> >>
> >> Other cases are either "TestBase" or "Tester" which should be legal.
> >>
> >> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
> >>
> >> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
> >>
> >> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>>
> >> Sent: Monday, October 24, 2022 19:18
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
> >>
> >> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
> >> * Remove any files that already end in "Test.java" => down to 396 files
> >> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
> >>
> >> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
> >>
> >> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
> >> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
> >> test/unit/org/apache/cassandra/cql3/BatchTests.java
> >> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
> >> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
> >> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
> >> test/unit/org/apache/cassandra/utils/UUIDTests.java
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
> >> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
> >>
> >> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>
> >> Sent: Monday, October 24, 2022 15:53
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
> >>
> >>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> >> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
> >>
> >> Thanks,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>>>> wrote:
> >> Hi Derek,
> >>
> >> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
> >>
> >> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> >>
> >> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
> >>
> >> Regards
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>>
> >> Sent: Monday, October 24, 2022 15:09
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
> >>
> >> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>>>> wrote:
> >> Hi list,
> >>
> >> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
> >>
> >> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
> >>
> >> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
> >>
> >> The first approach would make this happen (implemented here (4))
> >>
> >> class name - passes / fails the checkstyle
> >>
> >> TestIt - fails
> >> TestsThatFeature - fails
> >> ThisTestsMyFeature - fails
> >> SomeTests - fails
> >>
> >> SomeHelperClassForTesting - passes
> >> SomeTest - passes
> >> DistributedTestBase - passes
> >>
> >> MyTestSplit1 - passes
> >>
> >> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
> >>
> >> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
> >>
> >> MyTestSplit1
> >> TestHelper
> >> BurnTestUtil
> >> FuzzTestBase
> >> DistributedTestSnitch
> >> CASCommonTestCases
> >> ServerTestUtils
> >> AuthTestUtils
> >> TestMemtable
> >>
> >> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
> >>
> >> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
> >>
> >> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
> >>
> >> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> >> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> >> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> >> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> 
> 
> 
> 
> 
> 
> 

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
>> It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.

I just don't see where you want to "hook" this step exactly either in the CI run or in build.xml or what have you. My wild guess is that this translates to some kind of a bash script, right (checking if it has @Test and not abstract) ? And then what? So we have test.name property, so your idea is to somehow get the list of classes and then you want to feed it to test.name property? That property would be very long if we are going to enumerate whole test suite with package names as well, not super user friendly if you ask me. I just feel we need to plumb this on quite a low level to achieve this. Why not to put a 5 line checkstyle in place, rename few classes and move on.

I would still love to have file names somehow consistent so they are also visually easily recognizable (what is test and what not) in IDE or similar.

Also, I would like to hear opinions of others when it comes to this second idea.

________________________________________
From: Josh McKenzie <jm...@apache.org>
Sent: Monday, October 31, 2022 19:25
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



So to be clear, I don't really feel super strongly on this, I just thought Derek's suggestion seemed clean and elegant. Checkstyle to force a certain naming convention on things is fine enough; I certainly wouldn't stand in the way. Tooling like that self-educates users (see: Guardrails) so it's pretty low friction to the community IMO.

Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.
It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.

how to actually be sure that a developer knows that his test not only passed but it was selected to be run
Andres' changes to .circleci/generate.sh mean you're going to see a _repeat run of whatever tests you change or add. I think both approaches (heuristic to determine if a file contains tests to run vs. checkstyle enforced naming convention) are more or less comparable here.

On Mon, Oct 31, 2022, at 12:26 PM, Miklosovic, Stefan wrote:
I think we are already adhering to them without any effort. Overwhelming majority of devs are naming their tests "just right" and I do not think they are doing that with any significant effort, probably none. We are only re-enforcing the unspoken everybody implicitly agreed on. If somebody does not name their tests like the rest, I think it would raise the red flags already during a review.

Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.

I think the crucial question we should ask is how to actually be sure that a developer knows that his test not only passed but it was selected to be run (now based on *Test). How did these tests which skipped as of now end up there in the first place? Somebody just committed them, run the CI and "it was green" so they proceed with the commit. But they did not check it was actually run because "why not?".

So, we are putting a kind of detector in place to signal that a test is not going to be actually even run when the checkstyle fails.

________________________________________
From: Josh McKenzie <jm...@apache.org>>
Sent: Monday, October 31, 2022 17:16
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class
I mean, I wouldn't recommend it certainly, but I'm not concerned about whether a file is "TestThing" or "ThingTest" or "WeTestThing". The ordering of names on those files adds very little IMO.

I do not think we are actually winning anything with that approach.
Well, the name of this thread is "Some tests are never executed in CI due to their name" and it fixes that in a robust way without asking of any behavior changes of the project community to adhere to an arbitrary rigid naming guideline for something. So there's that. :)

On Mon, Oct 31, 2022, at 8:02 AM, Miklosovic, Stefan wrote:
I always hit "send" before realizing I have other points to add :)

I was thinking we want to go with checkstyle and check the name. Not manually parsing the file names. Do not we also want these test files to look consistent? I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class. Do not we also want to enforce some regularity into this?

Another point is that even we parsed we so it will be picked up while tests are executed, that means that it will not be checked on the commit?

Lastly, how do you actually want to do this if we currently have test.name property which is a simple regexp (*Test)? -Dtest.name is pretty much how we do this currently and I am not sure we want to break this in all CI scripts.

I do not think we are actually winning anything with that approach.

Regards

________________________________________
From: Josh McKenzie <jm...@apache.org>>>
Sent: Monday, October 31, 2022 12:49
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



To Derek's point, why not go with:

Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
+ exclude files that contain "abstract class" in them that matches the file name?

Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.

It's also very simple which is a feature.

On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
Hi Berenguer,

I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.

FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.

Regards

Stefan

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>>
Sent: Friday, October 28, 2022 9:43
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Hi Stefan,

seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.

wdyt?

On 25/10/22 14:35, Berenguer Blasi wrote:

Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.

What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.

So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).

My 2cts! But hey, I've been known to be wrong before :shrug: lol

On 25/10/22 12:04, Andrés de la Peña wrote:
Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.

On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
Hi Berenguer,

I am glad you asked. I was expecting this question.

I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.

Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.

You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.

I believe we can just navigate him when that happens. You want to cover that guy too.

I just find my approach easier.

You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>>>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>>>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>







Re: Some tests are never executed in CI due to their name

Posted by Josh McKenzie <jm...@apache.org>.
So to be clear, I don't really feel super strongly on this, I just thought Derek's suggestion seemed clean and elegant. Checkstyle to force a certain naming convention on things is fine enough; I certainly wouldn't stand in the way. Tooling like that self-educates users (see: Guardrails) so it's pretty low friction to the community IMO.

> Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.
It's unclear to me how the combination of "Has @Test, is not an abstract class top level matching file, and is inside the test folder" is going to significantly break anything. Maybe I'm missing something; would love to learn.

> how to actually be sure that a developer knows that his test not only passed but it was selected to be run
Andres' changes to .circleci/generate.sh mean you're going to see a _repeat run of whatever tests you change or add. I think both approaches (heuristic to determine if a file contains tests to run vs. checkstyle enforced naming convention) are more or less comparable here.

On Mon, Oct 31, 2022, at 12:26 PM, Miklosovic, Stefan wrote:
> I think we are already adhering to them without any effort. Overwhelming majority of devs are naming their tests "just right" and I do not think they are doing that with any significant effort, probably none. We are only re-enforcing the unspoken everybody implicitly agreed on. If somebody does not name their tests like the rest, I think it would raise the red flags already during a review.
> 
> Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.
> 
> I think the crucial question we should ask is how to actually be sure that a developer knows that his test not only passed but it was selected to be run (now based on *Test). How did these tests which skipped as of now end up there in the first place? Somebody just committed them, run the CI and "it was green" so they proceed with the commit. But they did not check it was actually run because "why not?".
> 
> So, we are putting a kind of detector in place to signal that a test is not going to be actually even run when the checkstyle fails.
> 
> ________________________________________
> From: Josh McKenzie <jm...@apache.org>
> Sent: Monday, October 31, 2022 17:16
> To: dev
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class
> I mean, I wouldn't recommend it certainly, but I'm not concerned about whether a file is "TestThing" or "ThingTest" or "WeTestThing". The ordering of names on those files adds very little IMO.
> 
> I do not think we are actually winning anything with that approach.
> Well, the name of this thread is "Some tests are never executed in CI due to their name" and it fixes that in a robust way without asking of any behavior changes of the project community to adhere to an arbitrary rigid naming guideline for something. So there's that. :)
> 
> On Mon, Oct 31, 2022, at 8:02 AM, Miklosovic, Stefan wrote:
> I always hit "send" before realizing I have other points to add :)
> 
> I was thinking we want to go with checkstyle and check the name. Not manually parsing the file names. Do not we also want these test files to look consistent? I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class. Do not we also want to enforce some regularity into this?
> 
> Another point is that even we parsed we so it will be picked up while tests are executed, that means that it will not be checked on the commit?
> 
> Lastly, how do you actually want to do this if we currently have test.name property which is a simple regexp (*Test)? -Dtest.name is pretty much how we do this currently and I am not sure we want to break this in all CI scripts.
> 
> I do not think we are actually winning anything with that approach.
> 
> Regards
> 
> ________________________________________
> From: Josh McKenzie <jm...@apache.org>>
> Sent: Monday, October 31, 2022 12:49
> To: dev
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> To Derek's point, why not go with:
> 
> Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
> + exclude files that contain "abstract class" in them that matches the file name?
> 
> Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.
> 
> It's also very simple which is a feature.
> 
> On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
> Hi Berenguer,
> 
> I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.
> 
> FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.
> 
> Regards
> 
> Stefan
> 
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>
> Sent: Friday, October 28, 2022 9:43
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> 
> Hi Stefan,
> 
> seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.
> 
> wdyt?
> 
> On 25/10/22 14:35, Berenguer Blasi wrote:
> 
> Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.
> 
> What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.
> 
> So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).
> 
> My 2cts! But hey, I've been known to be wrong before :shrug: lol
> 
> On 25/10/22 12:04, Andrés de la Peña wrote:
> Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.
> 
> On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>>> wrote:
> Hi Berenguer,
> 
> I am glad you asked. I was expecting this question.
> 
> I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.
> 
> Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.
> 
> You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.
> 
> I believe we can just navigate him when that happens. You want to cover that guy too.
> 
> I just find my approach easier.
> 
> You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.
> 
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>>
> Sent: Tuesday, October 25, 2022 11:08
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> 
> IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)
> 
> On 25/10/22 10:39, Miklosovic, Stefan wrote:
> > I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
> >
> > It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
> >
> > ________________________________________
> > From: Berenguer Blasi <be...@gmail.com>>>>
> > Sent: Tuesday, October 25, 2022 7:26
> > To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
> > Subject: Re: Some tests are never executed in CI due to their name
> >
> > NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> >
> >
> > The problem with using the first approach is that it fixes the current
> > situation but it doesn't prevent it from happening again. The second
> > proposal prevents it from happening again but at the cost of a bigger
> > rename I'd volunteer to if needed.
> >
> > Regards
> >
> > On 24/10/22 20:38, Miklosovic, Stefan wrote:
> >> Yeah, that is what the branch in my original email actually already solved. I mean ...
> >>
> >> CassandraAuthorizerTruncatingTests
> >> BatchTests
> >> UUIDTests
> >>
> >> these are ending on Tests which is illegal and they are fixed there.
> >>
> >> Other cases are either "TestBase" or "Tester" which should be legal.
> >>
> >> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
> >>
> >> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
> >>
> >> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>
> >> Sent: Monday, October 24, 2022 19:18
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
> >>
> >> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
> >> * Remove any files that already end in "Test.java" => down to 396 files
> >> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
> >>
> >> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
> >>
> >> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
> >> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
> >> test/unit/org/apache/cassandra/cql3/BatchTests.java
> >> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
> >> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
> >> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
> >> test/unit/org/apache/cassandra/utils/UUIDTests.java
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
> >> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
> >>
> >> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>>
> >> Sent: Monday, October 24, 2022 15:53
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
> >>
> >>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> >> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
> >>
> >> Thanks,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
> >> Hi Derek,
> >>
> >> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
> >>
> >> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> >>
> >> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
> >>
> >> Regards
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>
> >> Sent: Monday, October 24, 2022 15:09
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
> >>
> >> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>>> wrote:
> >> Hi list,
> >>
> >> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
> >>
> >> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
> >>
> >> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
> >>
> >> The first approach would make this happen (implemented here (4))
> >>
> >> class name - passes / fails the checkstyle
> >>
> >> TestIt - fails
> >> TestsThatFeature - fails
> >> ThisTestsMyFeature - fails
> >> SomeTests - fails
> >>
> >> SomeHelperClassForTesting - passes
> >> SomeTest - passes
> >> DistributedTestBase - passes
> >>
> >> MyTestSplit1 - passes
> >>
> >> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
> >>
> >> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
> >>
> >> MyTestSplit1
> >> TestHelper
> >> BurnTestUtil
> >> FuzzTestBase
> >> DistributedTestSnitch
> >> CASCommonTestCases
> >> ServerTestUtils
> >> AuthTestUtils
> >> TestMemtable
> >>
> >> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
> >>
> >> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
> >>
> >> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
> >>
> >> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> >> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> >> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> >> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> 
> 
> 
> 
> 

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
I think we are already adhering to them without any effort. Overwhelming majority of devs are naming their tests "just right" and I do not think they are doing that with any significant effort, probably none. We are only re-enforcing the unspoken everybody implicitly agreed on. If somebody does not name their tests like the rest, I think it would raise the red flags already during a review.

Secondly, I still do not know how you want to practically implement what you suggest without significantly breaking anything.

I think the crucial question we should ask is how to actually be sure that a developer knows that his test not only passed but it was selected to be run (now based on *Test). How did these tests which skipped as of now end up there in the first place? Somebody just committed them, run the CI and "it was green" so they proceed with the commit. But they did not check it was actually run because "why not?".

So, we are putting a kind of detector in place to signal that a test is not going to be actually even run when the checkstyle fails.

________________________________________
From: Josh McKenzie <jm...@apache.org>
Sent: Monday, October 31, 2022 17:16
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class
I mean, I wouldn't recommend it certainly, but I'm not concerned about whether a file is "TestThing" or "ThingTest" or "WeTestThing". The ordering of names on those files adds very little IMO.

I do not think we are actually winning anything with that approach.
Well, the name of this thread is "Some tests are never executed in CI due to their name" and it fixes that in a robust way without asking of any behavior changes of the project community to adhere to an arbitrary rigid naming guideline for something. So there's that. :)

On Mon, Oct 31, 2022, at 8:02 AM, Miklosovic, Stefan wrote:
I always hit "send" before realizing I have other points to add :)

I was thinking we want to go with checkstyle and check the name. Not manually parsing the file names. Do not we also want these test files to look consistent? I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class. Do not we also want to enforce some regularity into this?

Another point is that even we parsed we so it will be picked up while tests are executed, that means that it will not be checked on the commit?

Lastly, how do you actually want to do this if we currently have test.name property which is a simple regexp (*Test)? -Dtest.name is pretty much how we do this currently and I am not sure we want to break this in all CI scripts.

I do not think we are actually winning anything with that approach.

Regards

________________________________________
From: Josh McKenzie <jm...@apache.org>>
Sent: Monday, October 31, 2022 12:49
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



To Derek's point, why not go with:

Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
+ exclude files that contain "abstract class" in them that matches the file name?

Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.

It's also very simple which is a feature.

On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
Hi Berenguer,

I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.

FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.

Regards

Stefan

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>
Sent: Friday, October 28, 2022 9:43
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Hi Stefan,

seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.

wdyt?

On 25/10/22 14:35, Berenguer Blasi wrote:

Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.

What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.

So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).

My 2cts! But hey, I've been known to be wrong before :shrug: lol

On 25/10/22 12:04, Andrés de la Peña wrote:
Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.

On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>>> wrote:
Hi Berenguer,

I am glad you asked. I was expecting this question.

I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.

Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.

You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.

I believe we can just navigate him when that happens. You want to cover that guy too.

I just find my approach easier.

You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>





Re: Some tests are never executed in CI due to their name

Posted by Josh McKenzie <jm...@apache.org>.
> I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class
I mean, I wouldn't recommend it certainly, but I'm not concerned about whether a file is "TestThing" or "ThingTest" or "WeTestThing". The ordering of names on those files adds very little IMO.

> I do not think we are actually winning anything with that approach.
Well, the name of this thread is "Some tests are never executed in CI due to their name" and it fixes that in a robust way without asking of any behavior changes of the project community to adhere to an arbitrary rigid naming guideline for something. So there's that. :)

On Mon, Oct 31, 2022, at 8:02 AM, Miklosovic, Stefan wrote:
> I always hit "send" before realizing I have other points to add :)
> 
> I was thinking we want to go with checkstyle and check the name. Not manually parsing the file names. Do not we also want these test files to look consistent? I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class. Do not we also want to enforce some regularity into this?
> 
> Another point is that even we parsed we so it will be picked up while tests are executed, that means that it will not be checked on the commit?
> 
> Lastly, how do you actually want to do this if we currently have test.name property which is a simple regexp (*Test)? -Dtest.name is pretty much how we do this currently and I am not sure we want to break this in all CI scripts.
> 
> I do not think we are actually winning anything with that approach.
> 
> Regards
> 
> ________________________________________
> From: Josh McKenzie <jm...@apache.org>
> Sent: Monday, October 31, 2022 12:49
> To: dev
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> To Derek's point, why not go with:
> 
> Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
> + exclude files that contain "abstract class" in them that matches the file name?
> 
> Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.
> 
> It's also very simple which is a feature.
> 
> On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
> Hi Berenguer,
> 
> I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.
> 
> FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.
> 
> Regards
> 
> Stefan
> 
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>
> Sent: Friday, October 28, 2022 9:43
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> 
> Hi Stefan,
> 
> seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.
> 
> wdyt?
> 
> On 25/10/22 14:35, Berenguer Blasi wrote:
> 
> Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.
> 
> What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.
> 
> So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).
> 
> My 2cts! But hey, I've been known to be wrong before :shrug: lol
> 
> On 25/10/22 12:04, Andrés de la Peña wrote:
> Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.
> 
> On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>> wrote:
> Hi Berenguer,
> 
> I am glad you asked. I was expecting this question.
> 
> I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.
> 
> Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.
> 
> You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.
> 
> I believe we can just navigate him when that happens. You want to cover that guy too.
> 
> I just find my approach easier.
> 
> You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.
> 
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>
> Sent: Tuesday, October 25, 2022 11:08
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> 
> IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)
> 
> On 25/10/22 10:39, Miklosovic, Stefan wrote:
> > I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
> >
> > It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
> >
> > ________________________________________
> > From: Berenguer Blasi <be...@gmail.com>>>
> > Sent: Tuesday, October 25, 2022 7:26
> > To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> > Subject: Re: Some tests are never executed in CI due to their name
> >
> > NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> >
> >
> > The problem with using the first approach is that it fixes the current
> > situation but it doesn't prevent it from happening again. The second
> > proposal prevents it from happening again but at the cost of a bigger
> > rename I'd volunteer to if needed.
> >
> > Regards
> >
> > On 24/10/22 20:38, Miklosovic, Stefan wrote:
> >> Yeah, that is what the branch in my original email actually already solved. I mean ...
> >>
> >> CassandraAuthorizerTruncatingTests
> >> BatchTests
> >> UUIDTests
> >>
> >> these are ending on Tests which is illegal and they are fixed there.
> >>
> >> Other cases are either "TestBase" or "Tester" which should be legal.
> >>
> >> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
> >>
> >> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
> >>
> >> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>
> >> Sent: Monday, October 24, 2022 19:18
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
> >>
> >> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
> >> * Remove any files that already end in "Test.java" => down to 396 files
> >> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
> >>
> >> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
> >>
> >> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
> >> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
> >> test/unit/org/apache/cassandra/cql3/BatchTests.java
> >> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
> >> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
> >> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
> >> test/unit/org/apache/cassandra/utils/UUIDTests.java
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
> >> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
> >>
> >> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>
> >> Sent: Monday, October 24, 2022 15:53
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
> >>
> >>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> >> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
> >>
> >> Thanks,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
> >> Hi Derek,
> >>
> >> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
> >>
> >> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> >>
> >> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
> >>
> >> Regards
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>>
> >> Sent: Monday, October 24, 2022 15:09
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
> >>
> >> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
> >> Hi list,
> >>
> >> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
> >>
> >> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
> >>
> >> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
> >>
> >> The first approach would make this happen (implemented here (4))
> >>
> >> class name - passes / fails the checkstyle
> >>
> >> TestIt - fails
> >> TestsThatFeature - fails
> >> ThisTestsMyFeature - fails
> >> SomeTests - fails
> >>
> >> SomeHelperClassForTesting - passes
> >> SomeTest - passes
> >> DistributedTestBase - passes
> >>
> >> MyTestSplit1 - passes
> >>
> >> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
> >>
> >> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
> >>
> >> MyTestSplit1
> >> TestHelper
> >> BurnTestUtil
> >> FuzzTestBase
> >> DistributedTestSnitch
> >> CASCommonTestCases
> >> ServerTestUtils
> >> AuthTestUtils
> >> TestMemtable
> >>
> >> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
> >>
> >> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
> >>
> >> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
> >>
> >> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> >> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> >> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> >> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> 
> 
> 

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
I always hit "send" before realizing I have other points to add :)

I was thinking we want to go with checkstyle and check the name. Not manually parsing the file names. Do not we also want these test files to look consistent? I do not think that having a test class called "SdsdakASsdadsada" which happen to contain @Test and is not abstract should be considered to be a valid file name / test class. Do not we also want to enforce some regularity into this?

Another point is that even we parsed we so it will be picked up while tests are executed, that means that it will not be checked on the commit?

Lastly, how do you actually want to do this if we currently have test.name property which is a simple regexp (*Test)? -Dtest.name is pretty much how we do this currently and I am not sure we want to break this in all CI scripts.

I do not think we are actually winning anything with that approach.

Regards

________________________________________
From: Josh McKenzie <jm...@apache.org>
Sent: Monday, October 31, 2022 12:49
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



To Derek's point, why not go with:

Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
+ exclude files that contain "abstract class" in them that matches the file name?

Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.

It's also very simple which is a feature.

On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
Hi Berenguer,

I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.

FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.

Regards

Stefan

________________________________________
From: Berenguer Blasi <be...@gmail.com>>
Sent: Friday, October 28, 2022 9:43
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Hi Stefan,

seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.

wdyt?

On 25/10/22 14:35, Berenguer Blasi wrote:

Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.

What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.

So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).

My 2cts! But hey, I've been known to be wrong before :shrug: lol

On 25/10/22 12:04, Andrés de la Peña wrote:
Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.

On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>> wrote:
Hi Berenguer,

I am glad you asked. I was expecting this question.

I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.

Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.

You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.

I believe we can just navigate him when that happens. You want to cover that guy too.

I just find my approach easier.

You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>



Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
I do not think there is any module in checkstyle which would do what you want.

________________________________________
From: Josh McKenzie <jm...@apache.org>
Sent: Monday, October 31, 2022 12:49
To: dev
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



To Derek's point, why not go with:

Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
+ exclude files that contain "abstract class" in them that matches the file name?

Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.

It's also very simple which is a feature.

On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
Hi Berenguer,

I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.

FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.

Regards

Stefan

________________________________________
From: Berenguer Blasi <be...@gmail.com>>
Sent: Friday, October 28, 2022 9:43
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Hi Stefan,

seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.

wdyt?

On 25/10/22 14:35, Berenguer Blasi wrote:

Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.

What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.

So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).

My 2cts! But hey, I've been known to be wrong before :shrug: lol

On 25/10/22 12:04, Andrés de la Peña wrote:
Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.

On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>>> wrote:
Hi Berenguer,

I am glad you asked. I was expecting this question.

I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.

Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.

You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.

I believe we can just navigate him when that happens. You want to cover that guy too.

I just find my approach easier.

You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.

________________________________________
From: Berenguer Blasi <be...@gmail.com>>>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>



Re: Some tests are never executed in CI due to their name

Posted by Josh McKenzie <jm...@apache.org>.
To Derek's point, why not go with:

> Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations?
+ exclude files that contain "abstract class" in them that matches the file name?

Combine that w/checking specific subdirectories and we relax the required naming convention quite a bit; makes it easy to do the right thing and hard to do the wrong thing.

It's also very simple which is a feature.

On Fri, Oct 28, 2022, at 4:17 AM, Miklosovic, Stefan wrote:
> Hi Berenguer,
> 
> I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.
> 
> FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.
> 
> Regards
> 
> Stefan
> 
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>
> Sent: Friday, October 28, 2022 9:43
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> 
> Hi Stefan,
> 
> seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.
> 
> wdyt?
> 
> On 25/10/22 14:35, Berenguer Blasi wrote:
> 
> Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.
> 
> What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.
> 
> So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).
> 
> My 2cts! But hey, I've been known to be wrong before :shrug: lol
> 
> On 25/10/22 12:04, Andrés de la Peña wrote:
> Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.
> 
> On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>> wrote:
> Hi Berenguer,
> 
> I am glad you asked. I was expecting this question.
> 
> I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.
> 
> Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.
> 
> You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.
> 
> I believe we can just navigate him when that happens. You want to cover that guy too.
> 
> I just find my approach easier.
> 
> You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.
> 
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>
> Sent: Tuesday, October 25, 2022 11:08
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> Subject: Re: Some tests are never executed in CI due to their name
> 
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> 
> 
> 
> 
> IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)
> 
> On 25/10/22 10:39, Miklosovic, Stefan wrote:
> > I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
> >
> > It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
> >
> > ________________________________________
> > From: Berenguer Blasi <be...@gmail.com>>
> > Sent: Tuesday, October 25, 2022 7:26
> > To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> > Subject: Re: Some tests are never executed in CI due to their name
> >
> > NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >
> >
> >
> >
> > The problem with using the first approach is that it fixes the current
> > situation but it doesn't prevent it from happening again. The second
> > proposal prevents it from happening again but at the cost of a bigger
> > rename I'd volunteer to if needed.
> >
> > Regards
> >
> > On 24/10/22 20:38, Miklosovic, Stefan wrote:
> >> Yeah, that is what the branch in my original email actually already solved. I mean ...
> >>
> >> CassandraAuthorizerTruncatingTests
> >> BatchTests
> >> UUIDTests
> >>
> >> these are ending on Tests which is illegal and they are fixed there.
> >>
> >> Other cases are either "TestBase" or "Tester" which should be legal.
> >>
> >> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
> >>
> >> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
> >>
> >> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>
> >> Sent: Monday, October 24, 2022 19:18
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
> >>
> >> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
> >> * Remove any files that already end in "Test.java" => down to 396 files
> >> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
> >>
> >> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
> >>
> >> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
> >> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
> >> test/unit/org/apache/cassandra/cql3/BatchTests.java
> >> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
> >> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
> >> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
> >> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
> >> test/unit/org/apache/cassandra/utils/UUIDTests.java
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
> >> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
> >>
> >> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>
> >> Sent: Monday, October 24, 2022 15:53
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
> >>
> >>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> >> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
> >>
> >> Thanks,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
> >> Hi Derek,
> >>
> >> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
> >>
> >> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> >>
> >> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
> >>
> >> Regards
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>>>>
> >> Sent: Monday, October 24, 2022 15:09
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
> >>
> >>
> >>
> >> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
> >>
> >> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
> >> Hi list,
> >>
> >> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
> >>
> >> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
> >>
> >> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
> >>
> >> The first approach would make this happen (implemented here (4))
> >>
> >> class name - passes / fails the checkstyle
> >>
> >> TestIt - fails
> >> TestsThatFeature - fails
> >> ThisTestsMyFeature - fails
> >> SomeTests - fails
> >>
> >> SomeHelperClassForTesting - passes
> >> SomeTest - passes
> >> DistributedTestBase - passes
> >>
> >> MyTestSplit1 - passes
> >>
> >> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
> >>
> >> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
> >>
> >> MyTestSplit1
> >> TestHelper
> >> BurnTestUtil
> >> FuzzTestBase
> >> DistributedTestSnitch
> >> CASCommonTestCases
> >> ServerTestUtils
> >> AuthTestUtils
> >> TestMemtable
> >>
> >> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
> >>
> >> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
> >>
> >> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
> >>
> >> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> >> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> >> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> >> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> 

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Hi Berenguer,

I agree. We can always go heavy-weight anyway. I will dedicate more time to that the next week. It would be nice if you tagged along and help with the review.

FYI the ticket is touching all branches we support (there are some skipped tests in 3.0 too), we would just add checkstyle to 4.1 and trunk as discussed, so I expect we fix all 5 branches at once.

Regards

Stefan

________________________________________
From: Berenguer Blasi <be...@gmail.com>
Sent: Friday, October 28, 2022 9:43
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




Hi Stefan,

seems none of the 2 proposals are strongly opposed to. If you feel more comfortable with option 1 I say we do that, which will always be a step in the right direction. If we see it bites us again in the future then we do 2. But at least we bring this benefit/ticket in. I would also be happy to drive option 2 to completion if needed.

wdyt?

On 25/10/22 14:35, Berenguer Blasi wrote:

Well examples lend themselves to many things. I am sure I could come up with examples in the opposite direction. I'd rather talk in terms of the actual files and tests we have.

What I was trying to clarify is what I was misunderstanding as I had the impression now we were saying option 1 would not leak again.

So option 2, given ant and CI seem to rely on the *Test.java notation and also prevents future leaks doesn't seem to me a bad idea. Renaming the offending files seems a reasonable cost to me (Eclipse does it for me).

My 2cts! But hey, I've been known to be wrong before :shrug: lol

On 25/10/22 12:04, Andrés de la Peña wrote:
Note that the test multiplexer also searches for tests ending with "Test", so it will also miss new or modified tests with nonstandard names. The automatically detected tests are listed when running generate.sh, and they are added to the repeated run jobs. That gives us another opportunity to detect missed tests when we introduce something like "MyTestSplit1.java" and see that it's not been included in the mandatory repeated runs.

On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <St...@netapp.com>> wrote:
Hi Berenguer,

I am glad you asked. I was expecting this question.

I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.

Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.

You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.

I believe we can just navigate him when that happens. You want to cover that guy too.

I just find my approach easier.

You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.

________________________________________
From: Berenguer Blasi <be...@gmail.com>>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name<http://test.name> property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name<http://test.name> regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name><http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>

Re: Some tests are never executed in CI due to their name

Posted by Berenguer Blasi <be...@gmail.com>.
Hi Stefan,

seems none of the 2 proposals are strongly opposed to. If you feel more 
comfortable with option 1 I say we do that, which will always be a step 
in the right direction. If we see it bites us again in the future then 
we do 2. But at least we bring this benefit/ticket in. I would also be 
happy to drive option 2 to completion if needed.

wdyt?

On 25/10/22 14:35, Berenguer Blasi wrote:
>
> Well examples lend themselves to many things. I am sure I could come 
> up with examples in the opposite direction. I'd rather talk in terms 
> of the actual files and tests we have.
>
> What I was trying to clarify is what I was misunderstanding as I had 
> the impression now we were saying option 1 would not leak again.
>
> So option 2, given ant and CI seem to rely on the *Test.java notation 
> and also prevents future leaks doesn't seem to me a bad idea. Renaming 
> the offending files seems a reasonable cost to me (Eclipse does it for 
> me).
>
> My 2cts! But hey, I've been known to be wrong before :shrug: lol
>
> On 25/10/22 12:04, Andrés de la Peña wrote:
>> Note that the test multiplexer also searches for tests ending with 
>> "Test", so it will also miss new or modified tests with nonstandard 
>> names. The automatically detected tests are listed when running 
>> generate.sh, and they are added to the repeated run jobs. That gives 
>> us another opportunity to detect missed tests when we introduce 
>> something like "MyTestSplit1.java" and see that it's not been 
>> included in the mandatory repeated runs.
>>
>> On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan 
>> <St...@netapp.com> wrote:
>>
>>     Hi Berenguer,
>>
>>     I am glad you asked. I was expecting this question.
>>
>>     I think there is a fundamental difference in how we approach this
>>     problem. I do not say mine is better, I just find it important to
>>     describe it clearly.
>>
>>     Let's say we are on a ship which leaks. When I detect that, my
>>     course of action is to fix the leakage with the biggest patches I
>>     can find around with minimal effort necessary so we are not
>>     taking water anymore. There might still be occasional leakages
>>     which are very rare but I have a crew who is checking the ship
>>     constantly anyway (review) and the risk that big leakages like we
>>     just fixed happen again are relatively very low.
>>
>>     You spot a leakage and instead of fixing it with big patches and
>>     calling it a day, you are trying to remodel the cabins so they
>>     are completely waterproof but while doing so, you renamed them so
>>     everyone needs to get used to how these cabins are called and
>>     where they are located, because there is this minimal chance that
>>     some cadet comes around and starts to live in a cabin he is not
>>     supposed to and we need to explain it to him. Thousands of cadets
>>     found their cabins just fine. Occasionally, there are few people
>>     who just miss the right board completely (Test at start, Tests at
>>     the end) and they are automatically navigated around (checkstyle)
>>     but once in five years there is this guy who just completely
>>     missed it.
>>
>>     I believe we can just navigate him when that happens. You want to
>>     cover that guy too.
>>
>>     I just find my approach easier.
>>
>>     You can remodel it all, for sure but I am afraid I will not be a
>>     part of that. I just do not find it necessary to do that.
>>
>>     ________________________________________
>>     From: Berenguer Blasi <be...@gmail.com>
>>     Sent: Tuesday, October 25, 2022 11:08
>>     To: dev@cassandra.apache.org
>>     Subject: Re: Some tests are never executed in CI due to their name
>>
>>     NetApp Security WARNING: This is an external email. Do not click
>>     links or open attachments unless you recognize the sender and
>>     know the content is safe.
>>
>>
>>
>>
>>     IIUC we're relying on catching the word 'Split' in the file name
>>     for option 1. If somebody named his test i.e. 'MyTestGroup1',
>>     'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think
>>     we would leak tests again? or any other word that is not
>>     specifically accounted for. Unless I am missing sthg ofc! :-)
>>
>>     On 25/10/22 10:39, Miklosovic, Stefan wrote:
>>     > I think that what you wrote is not entirely correct. It will
>>     prevent it from happening again when there are tests ending on
>>     "Tests" or starting on "Test". The only case it will not cover is
>>     "SplitN" issue we plan to cover with relaxed test.name
>>     <http://test.name> property.
>>     >
>>     > It seems like what you wrote means that we will fix it and
>>     tests will leak again. That is not true.
>>     >
>>     > ________________________________________
>>     > From: Berenguer Blasi <be...@gmail.com>
>>     > Sent: Tuesday, October 25, 2022 7:26
>>     > To: dev@cassandra.apache.org
>>     > Subject: Re: Some tests are never executed in CI due to their name
>>     >
>>     > NetApp Security WARNING: This is an external email. Do not
>>     click links or open attachments unless you recognize the sender
>>     and know the content is safe.
>>     >
>>     >
>>     >
>>     >
>>     > The problem with using the first approach is that it fixes the
>>     current
>>     > situation but it doesn't prevent it from happening again. The
>>     second
>>     > proposal prevents it from happening again but at the cost of a
>>     bigger
>>     > rename I'd volunteer to if needed.
>>     >
>>     > Regards
>>     >
>>     > On 24/10/22 20:38, Miklosovic, Stefan wrote:
>>     >> Yeah, that is what the branch in my original email actually
>>     already solved. I mean ...
>>     >>
>>     >> CassandraAuthorizerTruncatingTests
>>     >> BatchTests
>>     >> UUIDTests
>>     >>
>>     >> these are ending on Tests which is illegal and they are fixed
>>     there.
>>     >>
>>     >> Other cases are either "TestBase" or "Tester" which should be
>>     legal.
>>     >>
>>     >> I think using the first approach and fixing all "SplitN" PLUS
>>     adding Split* to test.name <http://test.name> regexp should do it.
>>     >>
>>     >> I think we can "automate" at most like you suggest and scan it
>>     manually, fix the stuff and from then incorporate checkstyle to
>>     take care of that.
>>     >>
>>     >> There are also some classes which do include @Test methods but
>>     I think they are abstract as they are meant to be extended as the
>>     real test is just wrapping that. This might happen when there are
>>     slight variations across test classes. This is fine as well.
>>     >>
>>     >> ________________________________________
>>     >> From: Derek Chen-Becker <de...@chen-becker.org>
>>     >> Sent: Monday, October 24, 2022 19:18
>>     >> To: dev@cassandra.apache.org
>>     >> Subject: Re: Some tests are never executed in CI due to their name
>>     >>
>>     >> NetApp Security WARNING: This is an external email. Do not
>>     click links or open attachments unless you recognize the sender
>>     and know the content is safe.
>>     >>
>>     >>
>>     >>
>>     >> OK, I think the approach you're proposing is fine. As an
>>     orthogonal idea, is there any way to do a one-time audit to
>>     narrow down files that we need to inspect, or any way to automate
>>     confirmation of a file containing tests? For example:
>>     >>
>>     >> * Start with all of the files under the `test` directory (is
>>     it safe to assume that this is the only location for test
>>     classes?) => 1457 files
>>     >> * Remove any files that already end in "Test.java" => down to
>>     396 files
>>     >> * Is it safe to assume that only files having "Test" somewhere
>>     in their name are test classes? That reduces the list down to 60
>>     files
>>     >>
>>     >> Alternatively, are we only talking about unit tests? Would it
>>     be sufficient to look for files that don't end in "Test.java" but
>>     contain "@Test" annotations? That's a significantly smaller set,
>>     so maybe we could fix unit tests first:
>>     >>
>>     >> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java
>>     ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done
>>     circleci-parity ✱ ◼
>>     >>
>>     test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>>     >> test/unit/org/apache/cassandra/cql3/BatchTests.java
>>     >> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>>     >>
>>     test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>>     >>
>>     test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>>     >>
>>     test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>>     >>
>>     test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>>     >> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>>     >> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>>     >> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>     >>
>>     >> Cheers,
>>     >>
>>     >> Derek
>>     >>
>>     >>
>>     >> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan
>>     <St...@netapp.com>>
>>     wrote:
>>     >> One more point as I was reading your email in a hurry. The
>>     ultimate goal is to run all tests. So even if we apply the
>>     proposed regexp in checkstyle and there is an unfortunate case
>>     that MyTestSplit1 would slip through on a review and it would not
>>     be executed in CI, if we do not figure out some better regexp, we
>>     might relax "test.name <http://test.name><http://test.name>"
>>     property to include Split*.java as well. Yes, this might be also
>>     done.
>>     >>
>>     >> Enforcing "all tests to end on "Test" " means that we would
>>     need to know which .java files contain tests and which not
>>     without looking into them. I do not think that is possible under
>>     circumstances we have where (checkstyle module).
>>     >>
>>     >> ________________________________________
>>     >> From: Derek Chen-Becker
>>     <de...@chen-becker.org>>
>>     >> Sent: Monday, October 24, 2022 15:53
>>     >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
>>     >> Subject: Re: Some tests are never executed in CI due to their name
>>     >>
>>     >> NetApp Security WARNING: This is an external email. Do not
>>     click links or open attachments unless you recognize the sender
>>     and know the content is safe.
>>     >>
>>     >>
>>     >>
>>     >> OK, that makes sense, and I missed the rename of
>>     "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>     >>
>>     >>> What we are trying to achieve by that is to have a test which
>>     ends on "Test" and nothing else and it may contain "Test" only in
>>     case it does not start with it.
>>     >> Should the regex enforce the "which ends on Test" part?
>>     Otherwise, if we want to allow tests like MyTestSplit1, do we
>>     need to update the "test.name
>>     <http://test.name><http://test.name><http://test.name>" parameter
>>     to allow that in the test runner?
>>     >>
>>     >> Thanks,
>>     >>
>>     >> Derek
>>     >>
>>     >>
>>     >> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan
>>     <St...@netapp.com>>>
>>     wrote:
>>     >> Hi Derek,
>>     >>
>>     >> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>     >>
>>     >> What we are trying to achieve by that is to have a test which
>>     ends on "Test" and nothing else and it may contain "Test" only in
>>     case it does not start with it.
>>     >>
>>     >> To your second paragraph, I do not think we have any *Base
>>     files which are tests (that would be a bug) nor they are
>>     standalone (not inherited). By first approach, TestBaseImpl
>>     (which we do have) would start to be invalid, we would have to
>>     rename it to "DistributedTestBaseImpl" because it was starting on
>>     "Test" which is no-no. I agree that "TestBaseImpl" is rather
>>     unfortunate name. Currently, TestBaseImpl itself extends
>>     "DistributedTestBase" so in the end we would have
>>     "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase".
>>     Please keep in mind that "DistributedTestBase" is located in
>>     dtest jar (separate project) and that is rather inconvenient to
>>     rename.
>>     >>
>>     >> Regards
>>     >>
>>     >> ________________________________________
>>     >> From: Derek Chen-Becker
>>     <de...@chen-becker.org>>>
>>     >> Sent: Monday, October 24, 2022 15:09
>>     >> To:
>>     dev@cassandra.apache.org<ma...@cassandra.apache.org>>
>>     >> Subject: Re: Some tests are never executed in CI due to their name
>>     >>
>>     >> NetApp Security WARNING: This is an external email. Do not
>>     click links or open attachments unless you recognize the sender
>>     and know the content is safe.
>>     >>
>>     >>
>>     >>
>>     >> For clarity, what is the final regex you've landed on for the
>>     first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we
>>     just trying to reject anything that *starts* with "Test" or
>>     contains "Tests" somewhere in the name? It might be more clear to
>>     state what we think a valid test name should be, not in regex form.
>>     >>
>>     >> When it comes to "TestBase" files, are there any of these that
>>     are actually intended to be run as a test? The "Base" would
>>     indicate to me that it's intended to be inherited, not run
>>     directly. Do we have some "TestBase" classes somewhere that
>>     aren't inherited and are meant to be run directly, because that
>>     definitely seems like a misleading name that should be fixed,
>>     invasive or not. I would lean toward the principle of least
>>     surprise here, even if it means an involved one-time cleanup.
>>     >>
>>     >> Cheers,
>>     >>
>>     >> Derek
>>     >>
>>     >> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan
>>     <St...@netapp.com>>>>
>>     wrote:
>>     >> Hi list,
>>     >>
>>     >> it was reported that some tests are not executed as part of CI
>>     because of their name (1). There is (2) and (3) so right now it
>>     runs only "*Test.java" files.
>>     >>
>>     >> Except of us renaming the affected classes, we should fix this
>>     in a more robust way - via checkstyle - so we enforce how test
>>     files have to be called.
>>     >>
>>     >> We came up with two approaches and we do not know which one to
>>     choose so we want to gather more feedback / opinions.
>>     >>
>>     >> The first approach would make this happen (implemented here (4))
>>     >>
>>     >> class name - passes / fails the checkstyle
>>     >>
>>     >> TestIt - fails
>>     >> TestsThatFeature - fails
>>     >> ThisTestsMyFeature - fails
>>     >> SomeTests - fails
>>     >>
>>     >> SomeHelperClassForTesting - passes
>>     >> SomeTest - passes
>>     >> DistributedTestBase - passes
>>     >>
>>     >> MyTestSplit1 - passes
>>     >>
>>     >> We would fix "SplitN" test files by hand (I think they are not
>>     executed right now in CI either) to something like MySplit1Test
>>     and MySplit2Test but the regexp we would eventually use would not
>>     prevent this from happening in the future. I do not know how to
>>     construct such a complex regexp yet which would cover all cases
>>     above plus this splitting one. Feel free to look into the ticket
>>     where details are discussed.
>>     >>
>>     >> The second approach would forbid to have any occurrence of
>>     "test" in the file name but at the end. For example, it would
>>     fail on classes like these
>>     >>
>>     >> MyTestSplit1
>>     >> TestHelper
>>     >> BurnTestUtil
>>     >> FuzzTestBase
>>     >> DistributedTestSnitch
>>     >> CASCommonTestCases
>>     >> ServerTestUtils
>>     >> AuthTestUtils
>>     >> TestMemtable
>>     >>
>>     >> While this would also fix "MyTestSplitN" cases, I consider
>>     this to be way more invasive to the current codebase because we
>>     would have to rename a lot of files (like, hundreds, plus a lot
>>     of places they are referenced / used at) and I do not even think
>>     this is necessary and desirable. Like, how would we call
>>     "FuzzTestBase" to not contain "Test" in it? I think having "test"
>>     string in the middle is just fine, specifically for helper
>>     classes. One possible fix would be to replace "Test" with
>>     "Tester" so we would have "TesterMemtable", "AuthTesterUtils",
>>     "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would
>>     be explicitly allowed.
>>     >>
>>     >> I personally lean to the first approach even though cases like
>>     "MyTestSplit1" will not be spotted in compile time, but hey, we
>>     still do have a review process on top of this. The obvious
>>     violations like "TestThisFeature" and "MySuperTests" would be
>>     evaluated as violations which I would argue happens the most often.
>>     >>
>>     >> Please keep in mind that we have checkstyle only in 4.1 and
>>     trunk. So, while we would fix names of classes in all branches we
>>     support so they are picked up by CI, checkstyle for this would be
>>     introduced in 4.1 and trunk only.
>>     >>
>>     >> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>>     >> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>>     >> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>>     >> (4)
>>     https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>     >>
>>     >>
>>     >> --
>>     >> +---------------------------------------------------------------+
>>     >> | Derek Chen-Becker              |
>>     >> | GPG Key available at https://keybase.io/dchenbecker and       |
>>     >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>>     >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>>     >> +---------------------------------------------------------------+
>>     >>
>>     >>
>>     >>
>>     >> --
>>     >> +---------------------------------------------------------------+
>>     >> | Derek Chen-Becker              |
>>     >> | GPG Key available at https://keybase.io/dchenbecker and       |
>>     >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>>     >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>>     >> +---------------------------------------------------------------+
>>     >>
>>     >>
>>     >>
>>     >> --
>>     >> +---------------------------------------------------------------+
>>     >> | Derek Chen-Becker              |
>>     >> | GPG Key available at https://keybase.io/dchenbecker and       |
>>     >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>>     >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>>     >> +---------------------------------------------------------------+
>>     >>
>>

Re: Some tests are never executed in CI due to their name

Posted by Berenguer Blasi <be...@gmail.com>.
Well examples lend themselves to many things. I am sure I could come up 
with examples in the opposite direction. I'd rather talk in terms of the 
actual files and tests we have.

What I was trying to clarify is what I was misunderstanding as I had the 
impression now we were saying option 1 would not leak again.

So option 2, given ant and CI seem to rely on the *Test.java notation 
and also prevents future leaks doesn't seem to me a bad idea. Renaming 
the offending files seems a reasonable cost to me (Eclipse does it for me).

My 2cts! But hey, I've been known to be wrong before :shrug: lol

On 25/10/22 12:04, Andrés de la Peña wrote:
> Note that the test multiplexer also searches for tests ending with 
> "Test", so it will also miss new or modified tests with nonstandard 
> names. The automatically detected tests are listed when running 
> generate.sh, and they are added to the repeated run jobs. That gives 
> us another opportunity to detect missed tests when we introduce 
> something like "MyTestSplit1.java" and see that it's not been included 
> in the mandatory repeated runs.
>
> On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan 
> <St...@netapp.com> wrote:
>
>     Hi Berenguer,
>
>     I am glad you asked. I was expecting this question.
>
>     I think there is a fundamental difference in how we approach this
>     problem. I do not say mine is better, I just find it important to
>     describe it clearly.
>
>     Let's say we are on a ship which leaks. When I detect that, my
>     course of action is to fix the leakage with the biggest patches I
>     can find around with minimal effort necessary so we are not taking
>     water anymore. There might still be occasional leakages which are
>     very rare but I have a crew who is checking the ship constantly
>     anyway (review) and the risk that big leakages like we just fixed
>     happen again are relatively very low.
>
>     You spot a leakage and instead of fixing it with big patches and
>     calling it a day, you are trying to remodel the cabins so they are
>     completely waterproof but while doing so, you renamed them so
>     everyone needs to get used to how these cabins are called and
>     where they are located, because there is this minimal chance that
>     some cadet comes around and starts to live in a cabin he is not
>     supposed to and we need to explain it to him. Thousands of cadets
>     found their cabins just fine. Occasionally, there are few people
>     who just miss the right board completely (Test at start, Tests at
>     the end) and they are automatically navigated around (checkstyle)
>     but once in five years there is this guy who just completely
>     missed it.
>
>     I believe we can just navigate him when that happens. You want to
>     cover that guy too.
>
>     I just find my approach easier.
>
>     You can remodel it all, for sure but I am afraid I will not be a
>     part of that. I just do not find it necessary to do that.
>
>     ________________________________________
>     From: Berenguer Blasi <be...@gmail.com>
>     Sent: Tuesday, October 25, 2022 11:08
>     To: dev@cassandra.apache.org
>     Subject: Re: Some tests are never executed in CI due to their name
>
>     NetApp Security WARNING: This is an external email. Do not click
>     links or open attachments unless you recognize the sender and know
>     the content is safe.
>
>
>
>
>     IIUC we're relying on catching the word 'Split' in the file name
>     for option 1. If somebody named his test i.e. 'MyTestGroup1',
>     'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we
>     would leak tests again? or any other word that is not specifically
>     accounted for. Unless I am missing sthg ofc! :-)
>
>     On 25/10/22 10:39, Miklosovic, Stefan wrote:
>     > I think that what you wrote is not entirely correct. It will
>     prevent it from happening again when there are tests ending on
>     "Tests" or starting on "Test". The only case it will not cover is
>     "SplitN" issue we plan to cover with relaxed test.name
>     <http://test.name> property.
>     >
>     > It seems like what you wrote means that we will fix it and tests
>     will leak again. That is not true.
>     >
>     > ________________________________________
>     > From: Berenguer Blasi <be...@gmail.com>
>     > Sent: Tuesday, October 25, 2022 7:26
>     > To: dev@cassandra.apache.org
>     > Subject: Re: Some tests are never executed in CI due to their name
>     >
>     > NetApp Security WARNING: This is an external email. Do not click
>     links or open attachments unless you recognize the sender and know
>     the content is safe.
>     >
>     >
>     >
>     >
>     > The problem with using the first approach is that it fixes the
>     current
>     > situation but it doesn't prevent it from happening again. The second
>     > proposal prevents it from happening again but at the cost of a
>     bigger
>     > rename I'd volunteer to if needed.
>     >
>     > Regards
>     >
>     > On 24/10/22 20:38, Miklosovic, Stefan wrote:
>     >> Yeah, that is what the branch in my original email actually
>     already solved. I mean ...
>     >>
>     >> CassandraAuthorizerTruncatingTests
>     >> BatchTests
>     >> UUIDTests
>     >>
>     >> these are ending on Tests which is illegal and they are fixed
>     there.
>     >>
>     >> Other cases are either "TestBase" or "Tester" which should be
>     legal.
>     >>
>     >> I think using the first approach and fixing all "SplitN" PLUS
>     adding Split* to test.name <http://test.name> regexp should do it.
>     >>
>     >> I think we can "automate" at most like you suggest and scan it
>     manually, fix the stuff and from then incorporate checkstyle to
>     take care of that.
>     >>
>     >> There are also some classes which do include @Test methods but
>     I think they are abstract as they are meant to be extended as the
>     real test is just wrapping that. This might happen when there are
>     slight variations across test classes. This is fine as well.
>     >>
>     >> ________________________________________
>     >> From: Derek Chen-Becker <de...@chen-becker.org>
>     >> Sent: Monday, October 24, 2022 19:18
>     >> To: dev@cassandra.apache.org
>     >> Subject: Re: Some tests are never executed in CI due to their name
>     >>
>     >> NetApp Security WARNING: This is an external email. Do not
>     click links or open attachments unless you recognize the sender
>     and know the content is safe.
>     >>
>     >>
>     >>
>     >> OK, I think the approach you're proposing is fine. As an
>     orthogonal idea, is there any way to do a one-time audit to narrow
>     down files that we need to inspect, or any way to automate
>     confirmation of a file containing tests? For example:
>     >>
>     >> * Start with all of the files under the `test` directory (is it
>     safe to assume that this is the only location for test classes?)
>     => 1457 files
>     >> * Remove any files that already end in "Test.java" => down to
>     396 files
>     >> * Is it safe to assume that only files having "Test" somewhere
>     in their name are test classes? That reduces the list down to 60 files
>     >>
>     >> Alternatively, are we only talking about unit tests? Would it
>     be sufficient to look for files that don't end in "Test.java" but
>     contain "@Test" annotations? That's a significantly smaller set,
>     so maybe we could fix unit tests first:
>     >>
>     >> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java
>     ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done    
>                                               circleci-parity ✱ ◼
>     >>
>     test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>     >> test/unit/org/apache/cassandra/cql3/BatchTests.java
>     >> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>     >>
>     test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>     >>
>     test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>     >>
>     test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>     >>
>     test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>     >> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>     >> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>     >> test/unit/org/apache/cassandra/utils/UUIDTests.java
>     >>
>     >> Cheers,
>     >>
>     >> Derek
>     >>
>     >>
>     >> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan
>     <St...@netapp.com>>
>     wrote:
>     >> One more point as I was reading your email in a hurry. The
>     ultimate goal is to run all tests. So even if we apply the
>     proposed regexp in checkstyle and there is an unfortunate case
>     that MyTestSplit1 would slip through on a review and it would not
>     be executed in CI, if we do not figure out some better regexp, we
>     might relax "test.name <http://test.name><http://test.name>"
>     property to include Split*.java as well. Yes, this might be also done.
>     >>
>     >> Enforcing "all tests to end on "Test" " means that we would
>     need to know which .java files contain tests and which not without
>     looking into them. I do not think that is possible under
>     circumstances we have where (checkstyle module).
>     >>
>     >> ________________________________________
>     >> From: Derek Chen-Becker
>     <de...@chen-becker.org>>
>     >> Sent: Monday, October 24, 2022 15:53
>     >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
>     >> Subject: Re: Some tests are never executed in CI due to their name
>     >>
>     >> NetApp Security WARNING: This is an external email. Do not
>     click links or open attachments unless you recognize the sender
>     and know the content is safe.
>     >>
>     >>
>     >>
>     >> OK, that makes sense, and I missed the rename of "TestBaseImpl"
>     (which I agree is in need of fixing). When you say:
>     >>
>     >>> What we are trying to achieve by that is to have a test which
>     ends on "Test" and nothing else and it may contain "Test" only in
>     case it does not start with it.
>     >> Should the regex enforce the "which ends on Test" part?
>     Otherwise, if we want to allow tests like MyTestSplit1, do we need
>     to update the "test.name
>     <http://test.name><http://test.name><http://test.name>" parameter
>     to allow that in the test runner?
>     >>
>     >> Thanks,
>     >>
>     >> Derek
>     >>
>     >>
>     >> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan
>     <St...@netapp.com>>>
>     wrote:
>     >> Hi Derek,
>     >>
>     >> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>     >>
>     >> What we are trying to achieve by that is to have a test which
>     ends on "Test" and nothing else and it may contain "Test" only in
>     case it does not start with it.
>     >>
>     >> To your second paragraph, I do not think we have any *Base
>     files which are tests (that would be a bug) nor they are
>     standalone (not inherited). By first approach, TestBaseImpl (which
>     we do have) would start to be invalid, we would have to rename it
>     to "DistributedTestBaseImpl" because it was starting on "Test"
>     which is no-no. I agree that "TestBaseImpl" is rather unfortunate
>     name. Currently, TestBaseImpl itself extends "DistributedTestBase"
>     so in the end we would have "DistributedTestBaseImpl ->
>     TestBaseImpl -> DistributedTestBase". Please keep in mind that
>     "DistributedTestBase" is located in dtest jar (separate project)
>     and that is rather inconvenient to rename.
>     >>
>     >> Regards
>     >>
>     >> ________________________________________
>     >> From: Derek Chen-Becker
>     <de...@chen-becker.org>>>
>     >> Sent: Monday, October 24, 2022 15:09
>     >> To:
>     dev@cassandra.apache.org<ma...@cassandra.apache.org>>
>     >> Subject: Re: Some tests are never executed in CI due to their name
>     >>
>     >> NetApp Security WARNING: This is an external email. Do not
>     click links or open attachments unless you recognize the sender
>     and know the content is safe.
>     >>
>     >>
>     >>
>     >> For clarity, what is the final regex you've landed on for the
>     first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we
>     just trying to reject anything that *starts* with "Test" or
>     contains "Tests" somewhere in the name? It might be more clear to
>     state what we think a valid test name should be, not in regex form.
>     >>
>     >> When it comes to "TestBase" files, are there any of these that
>     are actually intended to be run as a test? The "Base" would
>     indicate to me that it's intended to be inherited, not run
>     directly. Do we have some "TestBase" classes somewhere that aren't
>     inherited and are meant to be run directly, because that
>     definitely seems like a misleading name that should be fixed,
>     invasive or not. I would lean toward the principle of least
>     surprise here, even if it means an involved one-time cleanup.
>     >>
>     >> Cheers,
>     >>
>     >> Derek
>     >>
>     >> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan
>     <St...@netapp.com>>>>
>     wrote:
>     >> Hi list,
>     >>
>     >> it was reported that some tests are not executed as part of CI
>     because of their name (1). There is (2) and (3) so right now it
>     runs only "*Test.java" files.
>     >>
>     >> Except of us renaming the affected classes, we should fix this
>     in a more robust way - via checkstyle - so we enforce how test
>     files have to be called.
>     >>
>     >> We came up with two approaches and we do not know which one to
>     choose so we want to gather more feedback / opinions.
>     >>
>     >> The first approach would make this happen (implemented here (4))
>     >>
>     >> class name - passes / fails the checkstyle
>     >>
>     >> TestIt - fails
>     >> TestsThatFeature - fails
>     >> ThisTestsMyFeature - fails
>     >> SomeTests - fails
>     >>
>     >> SomeHelperClassForTesting - passes
>     >> SomeTest - passes
>     >> DistributedTestBase - passes
>     >>
>     >> MyTestSplit1 - passes
>     >>
>     >> We would fix "SplitN" test files by hand (I think they are not
>     executed right now in CI either) to something like MySplit1Test
>     and MySplit2Test but the regexp we would eventually use would not
>     prevent this from happening in the future. I do not know how to
>     construct such a complex regexp yet which would cover all cases
>     above plus this splitting one. Feel free to look into the ticket
>     where details are discussed.
>     >>
>     >> The second approach would forbid to have any occurrence of
>     "test" in the file name but at the end. For example, it would fail
>     on classes like these
>     >>
>     >> MyTestSplit1
>     >> TestHelper
>     >> BurnTestUtil
>     >> FuzzTestBase
>     >> DistributedTestSnitch
>     >> CASCommonTestCases
>     >> ServerTestUtils
>     >> AuthTestUtils
>     >> TestMemtable
>     >>
>     >> While this would also fix "MyTestSplitN" cases, I consider this
>     to be way more invasive to the current codebase because we would
>     have to rename a lot of files (like, hundreds, plus a lot of
>     places they are referenced / used at) and I do not even think this
>     is necessary and desirable. Like, how would we call "FuzzTestBase"
>     to not contain "Test" in it? I think having "test" string in the
>     middle is just fine, specifically for helper classes. One possible
>     fix would be to replace "Test" with "Tester" so we would have
>     "TesterMemtable", "AuthTesterUtils", "TesterHelper",
>     "DistributedTesterSnitch" ... Word "Tester" would be explicitly
>     allowed.
>     >>
>     >> I personally lean to the first approach even though cases like
>     "MyTestSplit1" will not be spotted in compile time, but hey, we
>     still do have a review process on top of this. The obvious
>     violations like "TestThisFeature" and "MySuperTests" would be
>     evaluated as violations which I would argue happens the most often.
>     >>
>     >> Please keep in mind that we have checkstyle only in 4.1 and
>     trunk. So, while we would fix names of classes in all branches we
>     support so they are picked up by CI, checkstyle for this would be
>     introduced in 4.1 and trunk only.
>     >>
>     >> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>     >> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>     >> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>     >> (4)
>     https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>     >>
>     >>
>     >> --
>     >> +---------------------------------------------------------------+
>     >> | Derek Chen-Becker            |
>     >> | GPG Key available at https://keybase.io/dchenbecker and       |
>     >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>     >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>     >> +---------------------------------------------------------------+
>     >>
>     >>
>     >>
>     >> --
>     >> +---------------------------------------------------------------+
>     >> | Derek Chen-Becker            |
>     >> | GPG Key available at https://keybase.io/dchenbecker and       |
>     >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>     >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>     >> +---------------------------------------------------------------+
>     >>
>     >>
>     >>
>     >> --
>     >> +---------------------------------------------------------------+
>     >> | Derek Chen-Becker            |
>     >> | GPG Key available at https://keybase.io/dchenbecker and       |
>     >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>     >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>     >> +---------------------------------------------------------------+
>     >>
>

Re: Some tests are never executed in CI due to their name

Posted by Andrés de la Peña <ad...@apache.org>.
Note that the test multiplexer also searches for tests ending with "Test",
so it will also miss new or modified tests with nonstandard names. The
automatically detected tests are listed when running generate.sh, and they
are added to the repeated run jobs. That gives us another opportunity to
detect missed tests when we introduce something like "MyTestSplit1.java"
and see that it's not been included in the mandatory repeated runs.

On Tue, 25 Oct 2022 at 10:31, Miklosovic, Stefan <
Stefan.Miklosovic@netapp.com> wrote:

> Hi Berenguer,
>
> I am glad you asked. I was expecting this question.
>
> I think there is a fundamental difference in how we approach this problem.
> I do not say mine is better, I just find it important to describe it
> clearly.
>
> Let's say we are on a ship which leaks. When I detect that, my course of
> action is to fix the leakage with the biggest patches I can find around
> with minimal effort necessary so we are not taking water anymore. There
> might still be occasional leakages which are very rare but I have a crew
> who is checking the ship constantly anyway (review) and the risk that big
> leakages like we just fixed happen again are relatively very low.
>
> You spot a leakage and instead of fixing it with big patches and calling
> it a day, you are trying to remodel the cabins so they are completely
> waterproof but while doing so, you renamed them so everyone needs to get
> used to how these cabins are called and where they are located, because
> there is this minimal chance that some cadet comes around and starts to
> live in a cabin he is not supposed to and we need to explain it to him.
> Thousands of cadets found their cabins just fine. Occasionally, there are
> few people who just miss the right board completely (Test at start, Tests
> at the end) and they are automatically navigated around (checkstyle) but
> once in five years there is this guy who just completely missed it.
>
> I believe we can just navigate him when that happens. You want to cover
> that guy too.
>
> I just find my approach easier.
>
> You can remodel it all, for sure but I am afraid I will not be a part of
> that. I just do not find it necessary to do that.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>
> Sent: Tuesday, October 25, 2022 11:08
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
>
> IIUC we're relying on catching the word 'Split' in the file name for
> option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2',
> 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again?
> or any other word that is not specifically accounted for. Unless I am
> missing sthg ofc! :-)
>
> On 25/10/22 10:39, Miklosovic, Stefan wrote:
> > I think that what you wrote is not entirely correct. It will prevent it
> from happening again when there are tests ending on "Tests" or starting on
> "Test". The only case it will not cover is "SplitN" issue we plan to cover
> with relaxed test.name property.
> >
> > It seems like what you wrote means that we will fix it and tests will
> leak again. That is not true.
> >
> > ________________________________________
> > From: Berenguer Blasi <be...@gmail.com>
> > Sent: Tuesday, October 25, 2022 7:26
> > To: dev@cassandra.apache.org
> > Subject: Re: Some tests are never executed in CI due to their name
> >
> > NetApp Security WARNING: This is an external email. Do not click links
> or open attachments unless you recognize the sender and know the content is
> safe.
> >
> >
> >
> >
> > The problem with using the first approach is that it fixes the current
> > situation but it doesn't prevent it from happening again. The second
> > proposal prevents it from happening again but at the cost of a bigger
> > rename I'd volunteer to if needed.
> >
> > Regards
> >
> > On 24/10/22 20:38, Miklosovic, Stefan wrote:
> >> Yeah, that is what the branch in my original email actually already
> solved. I mean ...
> >>
> >> CassandraAuthorizerTruncatingTests
> >> BatchTests
> >> UUIDTests
> >>
> >> these are ending on Tests which is illegal and they are fixed there.
> >>
> >> Other cases are either "TestBase" or "Tester" which should be legal.
> >>
> >> I think using the first approach and fixing all "SplitN" PLUS adding
> Split* to test.name regexp should do it.
> >>
> >> I think we can "automate" at most like you suggest and scan it
> manually, fix the stuff and from then incorporate checkstyle to take care
> of that.
> >>
> >> There are also some classes which do include @Test methods but I think
> they are abstract as they are meant to be extended as the real test is just
> wrapping that. This might happen when there are slight variations across
> test classes. This is fine as well.
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <de...@chen-becker.org>
> >> Sent: Monday, October 24, 2022 19:18
> >> To: dev@cassandra.apache.org
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links
> or open attachments unless you recognize the sender and know the content is
> safe.
> >>
> >>
> >>
> >> OK, I think the approach you're proposing is fine. As an orthogonal
> idea, is there any way to do a one-time audit to narrow down files that we
> need to inspect, or any way to automate confirmation of a file containing
> tests? For example:
> >>
> >> * Start with all of the files under the `test` directory (is it safe to
> assume that this is the only location for test classes?) => 1457 files
> >> * Remove any files that already end in "Test.java" => down to 396 files
> >> * Is it safe to assume that only files having "Test" somewhere in their
> name are test classes? That reduces the list down to 60 files
> >>
> >> Alternatively, are we only talking about unit tests? Would it be
> sufficient to look for files that don't end in "Test.java" but contain
> "@Test" annotations? That's a significantly smaller set, so maybe we could
> fix unit tests first:
> >>
> >> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then
> if grep -qF '@Test' $fl; then print $fl; fi; fi; done
>
>                                                             circleci-parity
> ✱ ◼
> >>
> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
> >> test/unit/org/apache/cassandra/cql3/BatchTests.java
> >> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
> >>
> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
> >>
> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
> >>
> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
> >>
> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
> >> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
> >> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
> >> test/unit/org/apache/cassandra/utils/UUIDTests.java
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <
> Stefan.Miklosovic@netapp.com<ma...@netapp.com>> wrote:
> >> One more point as I was reading your email in a hurry. The ultimate
> goal is to run all tests. So even if we apply the proposed regexp in
> checkstyle and there is an unfortunate case that MyTestSplit1 would slip
> through on a review and it would not be executed in CI, if we do not figure
> out some better regexp, we might relax "test.name<http://test.name>"
> property to include Split*.java as well. Yes, this might be also done.
> >>
> >> Enforcing "all tests to end on "Test" " means that we would need to
> know which .java files contain tests and which not without looking into
> them. I do not think that is possible under circumstances we have where
> (checkstyle module).
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <derek@chen-becker.org<mailto:
> derek@chen-becker.org>>
> >> Sent: Monday, October 24, 2022 15:53
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links
> or open attachments unless you recognize the sender and know the content is
> safe.
> >>
> >>
> >>
> >> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which
> I agree is in need of fixing). When you say:
> >>
> >>> What we are trying to achieve by that is to have a test which ends on
> "Test" and nothing else and it may contain "Test" only in case it does not
> start with it.
> >> Should the regex enforce the "which ends on Test" part? Otherwise, if
> we want to allow tests like MyTestSplit1, do we need to update the "
> test.name<http://test.name><http://test.name>" parameter to allow that in
> the test runner?
> >>
> >> Thanks,
> >>
> >> Derek
> >>
> >>
> >> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <
> Stefan.Miklosovic@netapp.com<ma...@netapp.com><mailto:
> Stefan.Miklosovic@netapp.com<ma...@netapp.com>>> wrote:
> >> Hi Derek,
> >>
> >> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
> >>
> >> What we are trying to achieve by that is to have a test which ends on
> "Test" and nothing else and it may contain "Test" only in case it does not
> start with it.
> >>
> >> To your second paragraph, I do not think we have any *Base files which
> are tests (that would be a bug) nor they are standalone (not inherited). By
> first approach, TestBaseImpl (which we do have) would start to be invalid,
> we would have to rename it to "DistributedTestBaseImpl" because it was
> starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather
> unfortunate name. Currently, TestBaseImpl itself extends
> "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl
> -> TestBaseImpl -> DistributedTestBase". Please keep in mind that
> "DistributedTestBase" is located in dtest jar (separate project) and that
> is rather inconvenient to rename.
> >>
> >> Regards
> >>
> >> ________________________________________
> >> From: Derek Chen-Becker <derek@chen-becker.org<mailto:
> derek@chen-becker.org><mailto:derek@chen-becker.org<mailto:
> derek@chen-becker.org>>>
> >> Sent: Monday, October 24, 2022 15:09
> >> To: dev@cassandra.apache.org<ma...@cassandra.apache.org><mailto:
> dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> >> Subject: Re: Some tests are never executed in CI due to their name
> >>
> >> NetApp Security WARNING: This is an external email. Do not click links
> or open attachments unless you recognize the sender and know the content is
> safe.
> >>
> >>
> >>
> >> For clarity, what is the final regex you've landed on for the first
> approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to
> reject anything that *starts* with "Test" or contains "Tests" somewhere in
> the name? It might be more clear to state what we think a valid test name
> should be, not in regex form.
> >>
> >> When it comes to "TestBase" files, are there any of these that are
> actually intended to be run as a test? The "Base" would indicate to me that
> it's intended to be inherited, not run directly. Do we have some "TestBase"
> classes somewhere that aren't inherited and are meant to be run directly,
> because that definitely seems like a misleading name that should be fixed,
> invasive or not. I would lean toward the principle of least surprise here,
> even if it means an involved one-time cleanup.
> >>
> >> Cheers,
> >>
> >> Derek
> >>
> >> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <
> Stefan.Miklosovic@netapp.com<ma...@netapp.com><mailto:
> Stefan.Miklosovic@netapp.com<ma...@netapp.com>><mailto:
> Stefan.Miklosovic@netapp.com<ma...@netapp.com><mailto:
> Stefan.Miklosovic@netapp.com<ma...@netapp.com>>>>
> wrote:
> >> Hi list,
> >>
> >> it was reported that some tests are not executed as part of CI because
> of their name (1). There is (2) and (3) so right now it runs only
> "*Test.java" files.
> >>
> >> Except of us renaming the affected classes, we should fix this in a
> more robust way - via checkstyle - so we enforce how test files have to be
> called.
> >>
> >> We came up with two approaches and we do not know which one to choose
> so we want to gather more feedback / opinions.
> >>
> >> The first approach would make this happen (implemented here (4))
> >>
> >> class name - passes / fails the checkstyle
> >>
> >> TestIt - fails
> >> TestsThatFeature - fails
> >> ThisTestsMyFeature - fails
> >> SomeTests - fails
> >>
> >> SomeHelperClassForTesting - passes
> >> SomeTest - passes
> >> DistributedTestBase - passes
> >>
> >> MyTestSplit1 - passes
> >>
> >> We would fix "SplitN" test files by hand (I think they are not executed
> right now in CI either) to something like MySplit1Test and MySplit2Test but
> the regexp we would eventually use would not prevent this from happening in
> the future. I do not know how to construct such a complex regexp yet which
> would cover all cases above plus this splitting one. Feel free to look into
> the ticket where details are discussed.
> >>
> >> The second approach would forbid to have any occurrence of "test" in
> the file name but at the end. For example, it would fail on classes like
> these
> >>
> >> MyTestSplit1
> >> TestHelper
> >> BurnTestUtil
> >> FuzzTestBase
> >> DistributedTestSnitch
> >> CASCommonTestCases
> >> ServerTestUtils
> >> AuthTestUtils
> >> TestMemtable
> >>
> >> While this would also fix "MyTestSplitN" cases, I consider this to be
> way more invasive to the current codebase because we would have to rename a
> lot of files (like, hundreds, plus a lot of places they are referenced /
> used at) and I do not even think this is necessary and desirable. Like, how
> would we call "FuzzTestBase" to not contain "Test" in it? I think having
> "test" string in the middle is just fine, specifically for helper classes.
> One possible fix would be to replace "Test" with "Tester" so we would have
> "TesterMemtable", "AuthTesterUtils", "TesterHelper",
> "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
> >>
> >> I personally lean to the first approach even though cases like
> "MyTestSplit1" will not be spotted in compile time, but hey, we still do
> have a review process on top of this. The obvious violations like
> "TestThisFeature" and "MySuperTests" would be evaluated as violations which
> I would argue happens the most often.
> >>
> >> Please keep in mind that we have checkstyle only in 4.1 and trunk. So,
> while we would fix names of classes in all branches we support so they are
> picked up by CI, checkstyle for this would be introduced in 4.1 and trunk
> only.
> >>
> >> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> >> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> >> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> >> (4)
> https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
> >>
> >>
> >> --
> >> +---------------------------------------------------------------+
> >> | Derek Chen-Becker                                             |
> >> | GPG Key available at https://keybase.io/dchenbecker and       |
> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> >> +---------------------------------------------------------------+
> >>
>

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Hi Berenguer,

I am glad you asked. I was expecting this question.

I think there is a fundamental difference in how we approach this problem. I do not say mine is better, I just find it important to describe it clearly.

Let's say we are on a ship which leaks. When I detect that, my course of action is to fix the leakage with the biggest patches I can find around with minimal effort necessary so we are not taking water anymore. There might still be occasional leakages which are very rare but I have a crew who is checking the ship constantly anyway (review) and the risk that big leakages like we just fixed happen again are relatively very low.

You spot a leakage and instead of fixing it with big patches and calling it a day, you are trying to remodel the cabins so they are completely waterproof but while doing so, you renamed them so everyone needs to get used to how these cabins are called and where they are located, because there is this minimal chance that some cadet comes around and starts to live in a cabin he is not supposed to and we need to explain it to him. Thousands of cadets found their cabins just fine. Occasionally, there are few people who just miss the right board completely (Test at start, Tests at the end) and they are automatically navigated around (checkstyle) but once in five years there is this guy who just completely missed it.

I believe we can just navigate him when that happens. You want to cover that guy too.

I just find my approach easier.

You can remodel it all, for sure but I am afraid I will not be a part of that. I just do not find it necessary to do that.

________________________________________
From: Berenguer Blasi <be...@gmail.com>
Sent: Tuesday, October 25, 2022 11:08
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>

Re: Some tests are never executed in CI due to their name

Posted by Berenguer Blasi <be...@gmail.com>.
IIUC we're relying on catching the word 'Split' in the file name for option 1. If somebody named his test i.e. 'MyTestGroup1', 'MyTestGroup2', 'TTLTestPre2038', 'TTLTestPost2038',... I think we would leak tests again? or any other word that is not specifically accounted for. Unless I am missing sthg ofc! :-)

On 25/10/22 10:39, Miklosovic, Stefan wrote:
> I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name property.
>
> It seems like what you wrote means that we will fix it and tests will leak again. That is not true.
>
> ________________________________________
> From: Berenguer Blasi <be...@gmail.com>
> Sent: Tuesday, October 25, 2022 7:26
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
>
> The problem with using the first approach is that it fixes the current
> situation but it doesn't prevent it from happening again. The second
> proposal prevents it from happening again but at the cost of a bigger
> rename I'd volunteer to if needed.
>
> Regards
>
> On 24/10/22 20:38, Miklosovic, Stefan wrote:
>> Yeah, that is what the branch in my original email actually already solved. I mean ...
>>
>> CassandraAuthorizerTruncatingTests
>> BatchTests
>> UUIDTests
>>
>> these are ending on Tests which is illegal and they are fixed there.
>>
>> Other cases are either "TestBase" or "Tester" which should be legal.
>>
>> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name regexp should do it.
>>
>> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>>
>> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>
>> Sent: Monday, October 24, 2022 19:18
>> To: dev@cassandra.apache.org
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>>
>> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
>> * Remove any files that already end in "Test.java" => down to 396 files
>> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>>
>> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>>
>> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
>> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
>> test/unit/org/apache/cassandra/cql3/BatchTests.java
>> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
>> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
>> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
>> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
>> test/unit/org/apache/cassandra/utils/UUIDTests.java
>>
>> Cheers,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
>> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>>
>> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>
>> Sent: Monday, October 24, 2022 15:53
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>>
>>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name>" parameter to allow that in the test runner?
>>
>> Thanks,
>>
>> Derek
>>
>>
>> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
>> Hi Derek,
>>
>> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>>
>> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>>
>> Regards
>>
>> ________________________________________
>> From: Derek Chen-Becker <de...@chen-becker.org>>>
>> Sent: Monday, October 24, 2022 15:09
>> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
>> Subject: Re: Some tests are never executed in CI due to their name
>>
>> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>>
>>
>>
>> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>>
>> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>>
>> Cheers,
>>
>> Derek
>>
>> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
>> Hi list,
>>
>> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>>
>> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>>
>> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>>
>> The first approach would make this happen (implemented here (4))
>>
>> class name - passes / fails the checkstyle
>>
>> TestIt - fails
>> TestsThatFeature - fails
>> ThisTestsMyFeature - fails
>> SomeTests - fails
>>
>> SomeHelperClassForTesting - passes
>> SomeTest - passes
>> DistributedTestBase - passes
>>
>> MyTestSplit1 - passes
>>
>> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>>
>> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>>
>> MyTestSplit1
>> TestHelper
>> BurnTestUtil
>> FuzzTestBase
>> DistributedTestSnitch
>> CASCommonTestCases
>> ServerTestUtils
>> AuthTestUtils
>> TestMemtable
>>
>> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>>
>> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>>
>> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>>
>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
>> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
>> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
>> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
I think that what you wrote is not entirely correct. It will prevent it from happening again when there are tests ending on "Tests" or starting on "Test". The only case it will not cover is "SplitN" issue we plan to cover with relaxed test.name property.

It seems like what you wrote means that we will fix it and tests will leak again. That is not true.

________________________________________
From: Berenguer Blasi <be...@gmail.com>
Sent: Tuesday, October 25, 2022 7:26
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




The problem with using the first approach is that it fixes the current
situation but it doesn't prevent it from happening again. The second
proposal prevents it from happening again but at the cost of a bigger
rename I'd volunteer to if needed.

Regards

On 24/10/22 20:38, Miklosovic, Stefan wrote:
> Yeah, that is what the branch in my original email actually already solved. I mean ...
>
> CassandraAuthorizerTruncatingTests
> BatchTests
> UUIDTests
>
> these are ending on Tests which is illegal and they are fixed there.
>
> Other cases are either "TestBase" or "Tester" which should be legal.
>
> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name regexp should do it.
>
> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>
> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>
> Sent: Monday, October 24, 2022 19:18
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>
> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
> * Remove any files that already end in "Test.java" => down to 396 files
> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>
> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>
> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
> test/unit/org/apache/cassandra/cql3/BatchTests.java
> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
> test/unit/org/apache/cassandra/utils/UUIDTests.java
>
> Cheers,
>
> Derek
>
>
> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>
> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>>
> Sent: Monday, October 24, 2022 15:53
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name>" parameter to allow that in the test runner?
>
> Thanks,
>
> Derek
>
>
> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
> Hi Derek,
>
> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>
> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>
> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>
> Regards
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>>>
> Sent: Monday, October 24, 2022 15:09
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>
> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>
> Cheers,
>
> Derek
>
> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
> Hi list,
>
> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>
> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>
> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>
> The first approach would make this happen (implemented here (4))
>
> class name - passes / fails the checkstyle
>
> TestIt - fails
> TestsThatFeature - fails
> ThisTestsMyFeature - fails
> SomeTests - fails
>
> SomeHelperClassForTesting - passes
> SomeTest - passes
> DistributedTestBase - passes
>
> MyTestSplit1 - passes
>
> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>
> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>
> MyTestSplit1
> TestHelper
> BurnTestUtil
> FuzzTestBase
> DistributedTestSnitch
> CASCommonTestCases
> ServerTestUtils
> AuthTestUtils
> TestMemtable
>
> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>
> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>
> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>
> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
I think that was already communicated and understood.

________________________________________
From: Berenguer Blasi <be...@gmail.com>
Sent: Tuesday, October 25, 2022 7:26
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.




The problem with using the first approach is that it fixes the current
situation but it doesn't prevent it from happening again. The second
proposal prevents it from happening again but at the cost of a bigger
rename I'd volunteer to if needed.

Regards

On 24/10/22 20:38, Miklosovic, Stefan wrote:
> Yeah, that is what the branch in my original email actually already solved. I mean ...
>
> CassandraAuthorizerTruncatingTests
> BatchTests
> UUIDTests
>
> these are ending on Tests which is illegal and they are fixed there.
>
> Other cases are either "TestBase" or "Tester" which should be legal.
>
> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name regexp should do it.
>
> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>
> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>
> Sent: Monday, October 24, 2022 19:18
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>
> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
> * Remove any files that already end in "Test.java" => down to 396 files
> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>
> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>
> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
> test/unit/org/apache/cassandra/cql3/BatchTests.java
> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
> test/unit/org/apache/cassandra/utils/UUIDTests.java
>
> Cheers,
>
> Derek
>
>
> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>
> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>>
> Sent: Monday, October 24, 2022 15:53
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name>" parameter to allow that in the test runner?
>
> Thanks,
>
> Derek
>
>
> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
> Hi Derek,
>
> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>
> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>
> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>
> Regards
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>>>
> Sent: Monday, October 24, 2022 15:09
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>
> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>
> Cheers,
>
> Derek
>
> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
> Hi list,
>
> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>
> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>
> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>
> The first approach would make this happen (implemented here (4))
>
> class name - passes / fails the checkstyle
>
> TestIt - fails
> TestsThatFeature - fails
> ThisTestsMyFeature - fails
> SomeTests - fails
>
> SomeHelperClassForTesting - passes
> SomeTest - passes
> DistributedTestBase - passes
>
> MyTestSplit1 - passes
>
> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>
> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>
> MyTestSplit1
> TestHelper
> BurnTestUtil
> FuzzTestBase
> DistributedTestSnitch
> CASCommonTestCases
> ServerTestUtils
> AuthTestUtils
> TestMemtable
>
> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>
> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>
> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>
> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>

Re: Some tests are never executed in CI due to their name

Posted by Berenguer Blasi <be...@gmail.com>.
The problem with using the first approach is that it fixes the current 
situation but it doesn't prevent it from happening again. The second 
proposal prevents it from happening again but at the cost of a bigger 
rename I'd volunteer to if needed.

Regards

On 24/10/22 20:38, Miklosovic, Stefan wrote:
> Yeah, that is what the branch in my original email actually already solved. I mean ...
>
> CassandraAuthorizerTruncatingTests
> BatchTests
> UUIDTests
>
> these are ending on Tests which is illegal and they are fixed there.
>
> Other cases are either "TestBase" or "Tester" which should be legal.
>
> I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name regexp should do it.
>
> I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.
>
> There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>
> Sent: Monday, October 24, 2022 19:18
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:
>
> * Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
> * Remove any files that already end in "Test.java" => down to 396 files
> * Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files
>
> Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:
>
> ❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
> test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
> test/unit/org/apache/cassandra/cql3/BatchTests.java
> test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
> test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
> test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
> test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
> test/unit/org/apache/cassandra/utils/UUIDTests.java
>
> Cheers,
>
> Derek
>
>
> On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
> One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name>" property to include Split*.java as well. Yes, this might be also done.
>
> Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>>
> Sent: Monday, October 24, 2022 15:53
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:
>
>> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
> Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name>" parameter to allow that in the test runner?
>
> Thanks,
>
> Derek
>
>
> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
> Hi Derek,
>
> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>
> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.
>
> To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.
>
> Regards
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>>>
> Sent: Monday, October 24, 2022 15:09
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.
>
>
>
> For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.
>
> When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.
>
> Cheers,
>
> Derek
>
> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
> Hi list,
>
> it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.
>
> Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.
>
> We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.
>
> The first approach would make this happen (implemented here (4))
>
> class name - passes / fails the checkstyle
>
> TestIt - fails
> TestsThatFeature - fails
> ThisTestsMyFeature - fails
> SomeTests - fails
>
> SomeHelperClassForTesting - passes
> SomeTest - passes
> DistributedTestBase - passes
>
> MyTestSplit1 - passes
>
> We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.
>
> The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these
>
> MyTestSplit1
> TestHelper
> BurnTestUtil
> FuzzTestBase
> DistributedTestSnitch
> CASCommonTestCases
> ServerTestUtils
> AuthTestUtils
> TestMemtable
>
> While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>
> I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.
>
> Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.
>
> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Yeah, that is what the branch in my original email actually already solved. I mean ...

CassandraAuthorizerTruncatingTests
BatchTests
UUIDTests

these are ending on Tests which is illegal and they are fixed there.

Other cases are either "TestBase" or "Tester" which should be legal.

I think using the first approach and fixing all "SplitN" PLUS adding Split* to test.name regexp should do it.

I think we can "automate" at most like you suggest and scan it manually, fix the stuff and from then incorporate checkstyle to take care of that.

There are also some classes which do include @Test methods but I think they are abstract as they are meant to be extended as the real test is just wrapping that. This might happen when there are slight variations across test classes. This is fine as well.

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>
Sent: Monday, October 24, 2022 19:18
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



OK, I think the approach you're proposing is fine. As an orthogonal idea, is there any way to do a one-time audit to narrow down files that we need to inspect, or any way to automate confirmation of a file containing tests? For example:

* Start with all of the files under the `test` directory (is it safe to assume that this is the only location for test classes?) => 1457 files
* Remove any files that already end in "Test.java" => down to 396 files
* Is it safe to assume that only files having "Test" somewhere in their name are test classes? That reduces the list down to 60 files

Alternatively, are we only talking about unit tests? Would it be sufficient to look for files that don't end in "Test.java" but contain "@Test" annotations? That's a significantly smaller set, so maybe we could fix unit tests first:

❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if grep -qF '@Test' $fl; then print $fl; fi; fi; done                                                                                                                                                              circleci-parity ✱ ◼
test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
test/unit/org/apache/cassandra/cql3/BatchTests.java
test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
test/unit/org/apache/cassandra/utils/UUIDTests.java

Cheers,

Derek


On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name<http://test.name>" property to include Split*.java as well. Yes, this might be also done.

Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>>
Sent: Monday, October 24, 2022 15:53
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:

> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.

Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name><http://test.name>" parameter to allow that in the test runner?

Thanks,

Derek


On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
Hi Derek,

yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$

What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.

To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.

Regards

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>>>
Sent: Monday, October 24, 2022 15:09
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.

When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.

Cheers,

Derek

On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>>> wrote:
Hi list,

it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.

Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.

We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.

The first approach would make this happen (implemented here (4))

class name - passes / fails the checkstyle

TestIt - fails
TestsThatFeature - fails
ThisTestsMyFeature - fails
SomeTests - fails

SomeHelperClassForTesting - passes
SomeTest - passes
DistributedTestBase - passes

MyTestSplit1 - passes

We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.

The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these

MyTestSplit1
TestHelper
BurnTestUtil
FuzzTestBase
DistributedTestSnitch
CASCommonTestCases
ServerTestUtils
AuthTestUtils
TestMemtable

While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.

I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.

Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.

(1) https://issues.apache.org/jira/browse/CASSANDRA-17964
(2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
(3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
(4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2


--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+



--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+



--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+


Re: Some tests are never executed in CI due to their name

Posted by Derek Chen-Becker <de...@chen-becker.org>.
OK, I think the approach you're proposing is fine. As an orthogonal idea,
is there any way to do a one-time audit to narrow down files that we need
to inspect, or any way to automate confirmation of a file containing tests?
For example:

* Start with all of the files under the `test` directory (is it safe to
assume that this is the only location for test classes?) => 1457 files
* Remove any files that already end in "Test.java" => down to 396 files
* Is it safe to assume that only files having "Test" somewhere in their
name are test classes? That reduces the list down to 60 files

Alternatively, are we only talking about unit tests? Would it be sufficient
to look for files that don't end in "Test.java" but contain "@Test"
annotations? That's a significantly smaller set, so maybe we could fix unit
tests first:

❯❯❯ for fl in test/unit/**/*.java; do if [[ $fl != *Test.java ]]; then if
grep -qF '@Test' $fl; then print $fl; fi; fi; done

                                                         circleci-parity ✱ ◼
test/unit/org/apache/cassandra/auth/CassandraAuthorizerTruncatingTests.java
test/unit/org/apache/cassandra/cql3/BatchTests.java
test/unit/org/apache/cassandra/cql3/KeywordTestBase.java
test/unit/org/apache/cassandra/db/guardrails/GuardrailAllowUncompressedTables.java
test/unit/org/apache/cassandra/db/guardrails/GuardrailConsistencyLevelsTester.java
test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexTester.java
test/unit/org/apache/cassandra/db/guardrails/GuardrailSecondaryIndexesPerTable.java
test/unit/org/apache/cassandra/db/guardrails/ThresholdTester.java
test/unit/org/apache/cassandra/dht/PartitionerTestCase.java
test/unit/org/apache/cassandra/utils/UUIDTests.java

Cheers,

Derek


On Mon, Oct 24, 2022 at 9:00 AM Miklosovic, Stefan <
Stefan.Miklosovic@netapp.com> wrote:

> One more point as I was reading your email in a hurry. The ultimate goal
> is to run all tests. So even if we apply the proposed regexp in checkstyle
> and there is an unfortunate case that MyTestSplit1 would slip through on a
> review and it would not be executed in CI, if we do not figure out some
> better regexp, we might relax "test.name" property to include Split*.java
> as well. Yes, this might be also done.
>
> Enforcing "all tests to end on "Test" " means that we would need to know
> which .java files contain tests and which not without looking into them. I
> do not think that is possible under circumstances we have where (checkstyle
> module).
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>
> Sent: Monday, October 24, 2022 15:53
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
> OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I
> agree is in need of fixing). When you say:
>
> > What we are trying to achieve by that is to have a test which ends on
> "Test" and nothing else and it may contain "Test" only in case it does not
> start with it.
>
> Should the regex enforce the "which ends on Test" part? Otherwise, if we
> want to allow tests like MyTestSplit1, do we need to update the "test.name
> <http://test.name>" parameter to allow that in the test runner?
>
> Thanks,
>
> Derek
>
>
> On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <
> Stefan.Miklosovic@netapp.com<ma...@netapp.com>> wrote:
> Hi Derek,
>
> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>
> What we are trying to achieve by that is to have a test which ends on
> "Test" and nothing else and it may contain "Test" only in case it does not
> start with it.
>
> To your second paragraph, I do not think we have any *Base files which are
> tests (that would be a bug) nor they are standalone (not inherited). By
> first approach, TestBaseImpl (which we do have) would start to be invalid,
> we would have to rename it to "DistributedTestBaseImpl" because it was
> starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather
> unfortunate name. Currently, TestBaseImpl itself extends
> "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl
> -> TestBaseImpl -> DistributedTestBase". Please keep in mind that
> "DistributedTestBase" is located in dtest jar (separate project) and that
> is rather inconvenient to rename.
>
> Regards
>
> ________________________________________
> From: Derek Chen-Becker <derek@chen-becker.org<mailto:
> derek@chen-becker.org>>
> Sent: Monday, October 24, 2022 15:09
> To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
> For clarity, what is the final regex you've landed on for the first
> approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to
> reject anything that *starts* with "Test" or contains "Tests" somewhere in
> the name? It might be more clear to state what we think a valid test name
> should be, not in regex form.
>
> When it comes to "TestBase" files, are there any of these that are
> actually intended to be run as a test? The "Base" would indicate to me that
> it's intended to be inherited, not run directly. Do we have some "TestBase"
> classes somewhere that aren't inherited and are meant to be run directly,
> because that definitely seems like a misleading name that should be fixed,
> invasive or not. I would lean toward the principle of least surprise here,
> even if it means an involved one-time cleanup.
>
> Cheers,
>
> Derek
>
> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <
> Stefan.Miklosovic@netapp.com<ma...@netapp.com><mailto:
> Stefan.Miklosovic@netapp.com<ma...@netapp.com>>> wrote:
> Hi list,
>
> it was reported that some tests are not executed as part of CI because of
> their name (1). There is (2) and (3) so right now it runs only "*Test.java"
> files.
>
> Except of us renaming the affected classes, we should fix this in a more
> robust way - via checkstyle - so we enforce how test files have to be
> called.
>
> We came up with two approaches and we do not know which one to choose so
> we want to gather more feedback / opinions.
>
> The first approach would make this happen (implemented here (4))
>
> class name - passes / fails the checkstyle
>
> TestIt - fails
> TestsThatFeature - fails
> ThisTestsMyFeature - fails
> SomeTests - fails
>
> SomeHelperClassForTesting - passes
> SomeTest - passes
> DistributedTestBase - passes
>
> MyTestSplit1 - passes
>
> We would fix "SplitN" test files by hand (I think they are not executed
> right now in CI either) to something like MySplit1Test and MySplit2Test but
> the regexp we would eventually use would not prevent this from happening in
> the future. I do not know how to construct such a complex regexp yet which
> would cover all cases above plus this splitting one. Feel free to look into
> the ticket where details are discussed.
>
> The second approach would forbid to have any occurrence of "test" in the
> file name but at the end. For example, it would fail on classes like these
>
> MyTestSplit1
> TestHelper
> BurnTestUtil
> FuzzTestBase
> DistributedTestSnitch
> CASCommonTestCases
> ServerTestUtils
> AuthTestUtils
> TestMemtable
>
> While this would also fix "MyTestSplitN" cases, I consider this to be way
> more invasive to the current codebase because we would have to rename a lot
> of files (like, hundreds, plus a lot of places they are referenced / used
> at) and I do not even think this is necessary and desirable. Like, how
> would we call "FuzzTestBase" to not contain "Test" in it? I think having
> "test" string in the middle is just fine, specifically for helper classes.
> One possible fix would be to replace "Test" with "Tester" so we would have
> "TesterMemtable", "AuthTesterUtils", "TesterHelper",
> "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>
> I personally lean to the first approach even though cases like
> "MyTestSplit1" will not be spotted in compile time, but hey, we still do
> have a review process on top of this. The obvious violations like
> "TestThisFeature" and "MySuperTests" would be evaluated as violations which
> I would argue happens the most often.
>
> Please keep in mind that we have checkstyle only in 4.1 and trunk. So,
> while we would fix names of classes in all branches we support so they are
> picked up by CI, checkstyle for this would be introduced in 4.1 and trunk
> only.
>
> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>

-- 
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
One more point as I was reading your email in a hurry. The ultimate goal is to run all tests. So even if we apply the proposed regexp in checkstyle and there is an unfortunate case that MyTestSplit1 would slip through on a review and it would not be executed in CI, if we do not figure out some better regexp, we might relax "test.name" property to include Split*.java as well. Yes, this might be also done.

Enforcing "all tests to end on "Test" " means that we would need to know which .java files contain tests and which not without looking into them. I do not think that is possible under circumstances we have where (checkstyle module).

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>
Sent: Monday, October 24, 2022 15:53
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:

> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.

Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name>" parameter to allow that in the test runner?

Thanks,

Derek


On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
Hi Derek,

yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$

What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.

To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.

Regards

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>>
Sent: Monday, October 24, 2022 15:09
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.

When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.

Cheers,

Derek

On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
Hi list,

it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.

Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.

We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.

The first approach would make this happen (implemented here (4))

class name - passes / fails the checkstyle

TestIt - fails
TestsThatFeature - fails
ThisTestsMyFeature - fails
SomeTests - fails

SomeHelperClassForTesting - passes
SomeTest - passes
DistributedTestBase - passes

MyTestSplit1 - passes

We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.

The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these

MyTestSplit1
TestHelper
BurnTestUtil
FuzzTestBase
DistributedTestSnitch
CASCommonTestCases
ServerTestUtils
AuthTestUtils
TestMemtable

While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.

I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.

Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.

(1) https://issues.apache.org/jira/browse/CASSANDRA-17964
(2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
(3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
(4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2


--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+



--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+


Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
We do not want to allow MyTestSplit1. That would be forbidden but current regexp would not catch it. There is a gap here I was talking about. This is the case of "test" in the middle which should be possible (DistributedTestBaseImpl) but then we need to fix cases like these and detect this manually on a review.

Feel free to come up with better regexp if you want but I think that catching MyTestSplit1 as illegal would require massive rename of all other classes which happen to contain "test" in their names.

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>
Sent: Monday, October 24, 2022 15:53
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I agree is in need of fixing). When you say:

> What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.

Should the regex enforce the "which ends on Test" part? Otherwise, if we want to allow tests like MyTestSplit1, do we need to update the "test.name<http://test.name>" parameter to allow that in the test runner?

Thanks,

Derek


On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
Hi Derek,

yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$

What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.

To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.

Regards

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>>
Sent: Monday, October 24, 2022 15:09
To: dev@cassandra.apache.org<ma...@cassandra.apache.org>
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.

When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.

Cheers,

Derek

On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>>> wrote:
Hi list,

it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.

Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.

We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.

The first approach would make this happen (implemented here (4))

class name - passes / fails the checkstyle

TestIt - fails
TestsThatFeature - fails
ThisTestsMyFeature - fails
SomeTests - fails

SomeHelperClassForTesting - passes
SomeTest - passes
DistributedTestBase - passes

MyTestSplit1 - passes

We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.

The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these

MyTestSplit1
TestHelper
BurnTestUtil
FuzzTestBase
DistributedTestSnitch
CASCommonTestCases
ServerTestUtils
AuthTestUtils
TestMemtable

While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.

I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.

Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.

(1) https://issues.apache.org/jira/browse/CASSANDRA-17964
(2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
(3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
(4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2


--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+



--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+


Re: Some tests are never executed in CI due to their name

Posted by Derek Chen-Becker <de...@chen-becker.org>.
OK, that makes sense, and I missed the rename of "TestBaseImpl" (which I
agree is in need of fixing). When you say:

> What we are trying to achieve by that is to have a test which ends on
"Test" and nothing else and it may contain "Test" only in case it does not
start with it.

Should the regex enforce the "which ends on Test" part? Otherwise, if we
want to allow tests like MyTestSplit1, do we need to update the "test.name"
parameter to allow that in the test runner?

Thanks,

Derek


On Mon, Oct 24, 2022 at 7:27 AM Miklosovic, Stefan <
Stefan.Miklosovic@netapp.com> wrote:

> Hi Derek,
>
> yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$
>
> What we are trying to achieve by that is to have a test which ends on
> "Test" and nothing else and it may contain "Test" only in case it does not
> start with it.
>
> To your second paragraph, I do not think we have any *Base files which are
> tests (that would be a bug) nor they are standalone (not inherited). By
> first approach, TestBaseImpl (which we do have) would start to be invalid,
> we would have to rename it to "DistributedTestBaseImpl" because it was
> starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather
> unfortunate name. Currently, TestBaseImpl itself extends
> "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl
> -> TestBaseImpl -> DistributedTestBase". Please keep in mind that
> "DistributedTestBase" is located in dtest jar (separate project) and that
> is rather inconvenient to rename.
>
> Regards
>
> ________________________________________
> From: Derek Chen-Becker <de...@chen-becker.org>
> Sent: Monday, October 24, 2022 15:09
> To: dev@cassandra.apache.org
> Subject: Re: Some tests are never executed in CI due to their name
>
> NetApp Security WARNING: This is an external email. Do not click links or
> open attachments unless you recognize the sender and know the content is
> safe.
>
>
>
> For clarity, what is the final regex you've landed on for the first
> approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to
> reject anything that *starts* with "Test" or contains "Tests" somewhere in
> the name? It might be more clear to state what we think a valid test name
> should be, not in regex form.
>
> When it comes to "TestBase" files, are there any of these that are
> actually intended to be run as a test? The "Base" would indicate to me that
> it's intended to be inherited, not run directly. Do we have some "TestBase"
> classes somewhere that aren't inherited and are meant to be run directly,
> because that definitely seems like a misleading name that should be fixed,
> invasive or not. I would lean toward the principle of least surprise here,
> even if it means an involved one-time cleanup.
>
> Cheers,
>
> Derek
>
> On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <
> Stefan.Miklosovic@netapp.com<ma...@netapp.com>> wrote:
> Hi list,
>
> it was reported that some tests are not executed as part of CI because of
> their name (1). There is (2) and (3) so right now it runs only "*Test.java"
> files.
>
> Except of us renaming the affected classes, we should fix this in a more
> robust way - via checkstyle - so we enforce how test files have to be
> called.
>
> We came up with two approaches and we do not know which one to choose so
> we want to gather more feedback / opinions.
>
> The first approach would make this happen (implemented here (4))
>
> class name - passes / fails the checkstyle
>
> TestIt - fails
> TestsThatFeature - fails
> ThisTestsMyFeature - fails
> SomeTests - fails
>
> SomeHelperClassForTesting - passes
> SomeTest - passes
> DistributedTestBase - passes
>
> MyTestSplit1 - passes
>
> We would fix "SplitN" test files by hand (I think they are not executed
> right now in CI either) to something like MySplit1Test and MySplit2Test but
> the regexp we would eventually use would not prevent this from happening in
> the future. I do not know how to construct such a complex regexp yet which
> would cover all cases above plus this splitting one. Feel free to look into
> the ticket where details are discussed.
>
> The second approach would forbid to have any occurrence of "test" in the
> file name but at the end. For example, it would fail on classes like these
>
> MyTestSplit1
> TestHelper
> BurnTestUtil
> FuzzTestBase
> DistributedTestSnitch
> CASCommonTestCases
> ServerTestUtils
> AuthTestUtils
> TestMemtable
>
> While this would also fix "MyTestSplitN" cases, I consider this to be way
> more invasive to the current codebase because we would have to rename a lot
> of files (like, hundreds, plus a lot of places they are referenced / used
> at) and I do not even think this is necessary and desirable. Like, how
> would we call "FuzzTestBase" to not contain "Test" in it? I think having
> "test" string in the middle is just fine, specifically for helper classes.
> One possible fix would be to replace "Test" with "Tester" so we would have
> "TesterMemtable", "AuthTesterUtils", "TesterHelper",
> "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>
> I personally lean to the first approach even though cases like
> "MyTestSplit1" will not be spotted in compile time, but hey, we still do
> have a review process on top of this. The obvious violations like
> "TestThisFeature" and "MySuperTests" would be evaluated as violations which
> I would argue happens the most often.
>
> Please keep in mind that we have checkstyle only in 4.1 and trunk. So,
> while we would fix names of classes in all branches we support so they are
> picked up by CI, checkstyle for this would be introduced in 4.1 and trunk
> only.
>
> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2
>
>
> --
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
>

-- 
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+

Re: Some tests are never executed in CI due to their name

Posted by "Miklosovic, Stefan" <St...@netapp.com>.
Hi Derek,

yes, the regexp is ^(?!(?:Test))(?!.*?(?:Tests)).*$

What we are trying to achieve by that is to have a test which ends on "Test" and nothing else and it may contain "Test" only in case it does not start with it.

To your second paragraph, I do not think we have any *Base files which are tests (that would be a bug) nor they are standalone (not inherited). By first approach, TestBaseImpl (which we do have) would start to be invalid, we would have to rename it to "DistributedTestBaseImpl" because it was starting on "Test" which is no-no. I agree that "TestBaseImpl" is rather unfortunate name. Currently, TestBaseImpl itself extends "DistributedTestBase" so in the end we would have "DistributedTestBaseImpl -> TestBaseImpl -> DistributedTestBase". Please keep in mind that "DistributedTestBase" is located in dtest jar (separate project) and that is rather inconvenient to rename.

Regards

________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>
Sent: Monday, October 24, 2022 15:09
To: dev@cassandra.apache.org
Subject: Re: Some tests are never executed in CI due to their name

NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe.



For clarity, what is the final regex you've landed on for the first approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to reject anything that *starts* with "Test" or contains "Tests" somewhere in the name? It might be more clear to state what we think a valid test name should be, not in regex form.

When it comes to "TestBase" files, are there any of these that are actually intended to be run as a test? The "Base" would indicate to me that it's intended to be inherited, not run directly. Do we have some "TestBase" classes somewhere that aren't inherited and are meant to be run directly, because that definitely seems like a misleading name that should be fixed, invasive or not. I would lean toward the principle of least surprise here, even if it means an involved one-time cleanup.

Cheers,

Derek

On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <St...@netapp.com>> wrote:
Hi list,

it was reported that some tests are not executed as part of CI because of their name (1). There is (2) and (3) so right now it runs only "*Test.java" files.

Except of us renaming the affected classes, we should fix this in a more robust way - via checkstyle - so we enforce how test files have to be called.

We came up with two approaches and we do not know which one to choose so we want to gather more feedback / opinions.

The first approach would make this happen (implemented here (4))

class name - passes / fails the checkstyle

TestIt - fails
TestsThatFeature - fails
ThisTestsMyFeature - fails
SomeTests - fails

SomeHelperClassForTesting - passes
SomeTest - passes
DistributedTestBase - passes

MyTestSplit1 - passes

We would fix "SplitN" test files by hand (I think they are not executed right now in CI either) to something like MySplit1Test and MySplit2Test but the regexp we would eventually use would not prevent this from happening in the future. I do not know how to construct such a complex regexp yet which would cover all cases above plus this splitting one. Feel free to look into the ticket where details are discussed.

The second approach would forbid to have any occurrence of "test" in the file name but at the end. For example, it would fail on classes like these

MyTestSplit1
TestHelper
BurnTestUtil
FuzzTestBase
DistributedTestSnitch
CASCommonTestCases
ServerTestUtils
AuthTestUtils
TestMemtable

While this would also fix "MyTestSplitN" cases, I consider this to be way more invasive to the current codebase because we would have to rename a lot of files (like, hundreds, plus a lot of places they are referenced / used at) and I do not even think this is necessary and desirable. Like, how would we call "FuzzTestBase" to not contain "Test" in it? I think having "test" string in the middle is just fine, specifically for helper classes. One possible fix would be to replace "Test" with "Tester" so we would have "TesterMemtable", "AuthTesterUtils", "TesterHelper", "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.

I personally lean to the first approach even though cases like "MyTestSplit1" will not be spotted in compile time, but hey, we still do have a review process on top of this. The obvious violations like "TestThisFeature" and "MySuperTests" would be evaluated as violations which I would argue happens the most often.

Please keep in mind that we have checkstyle only in 4.1 and trunk. So, while we would fix names of classes in all branches we support so they are picked up by CI, checkstyle for this would be introduced in 4.1 and trunk only.

(1) https://issues.apache.org/jira/browse/CASSANDRA-17964
(2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
(3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
(4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2


--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+


Re: Some tests are never executed in CI due to their name

Posted by Derek Chen-Becker <de...@chen-becker.org>.
For clarity, what is the final regex you've landed on for the first
approach? Is it "^(?!(?:Test))(?!.*?(?:Tests)).*$" ? Are we just trying to
reject anything that *starts* with "Test" or contains "Tests" somewhere in
the name? It might be more clear to state what we think a valid test name
should be, not in regex form.

When it comes to "TestBase" files, are there any of these that are actually
intended to be run as a test? The "Base" would indicate to me that it's
intended to be inherited, not run directly. Do we have some "TestBase"
classes somewhere that aren't inherited and are meant to be run directly,
because that definitely seems like a misleading name that should be fixed,
invasive or not. I would lean toward the principle of least surprise here,
even if it means an involved one-time cleanup.

Cheers,

Derek

On Mon, Oct 24, 2022 at 3:02 AM Miklosovic, Stefan <
Stefan.Miklosovic@netapp.com> wrote:

> Hi list,
>
> it was reported that some tests are not executed as part of CI because of
> their name (1). There is (2) and (3) so right now it runs only "*Test.java"
> files.
>
> Except of us renaming the affected classes, we should fix this in a more
> robust way - via checkstyle - so we enforce how test files have to be
> called.
>
> We came up with two approaches and we do not know which one to choose so
> we want to gather more feedback / opinions.
>
> The first approach would make this happen (implemented here (4))
>
> class name - passes / fails the checkstyle
>
> TestIt - fails
> TestsThatFeature - fails
> ThisTestsMyFeature - fails
> SomeTests - fails
>
> SomeHelperClassForTesting - passes
> SomeTest - passes
> DistributedTestBase - passes
>
> MyTestSplit1 - passes
>
> We would fix "SplitN" test files by hand (I think they are not executed
> right now in CI either) to something like MySplit1Test and MySplit2Test but
> the regexp we would eventually use would not prevent this from happening in
> the future. I do not know how to construct such a complex regexp yet which
> would cover all cases above plus this splitting one. Feel free to look into
> the ticket where details are discussed.
>
> The second approach would forbid to have any occurrence of "test" in the
> file name but at the end. For example, it would fail on classes like these
>
> MyTestSplit1
> TestHelper
> BurnTestUtil
> FuzzTestBase
> DistributedTestSnitch
> CASCommonTestCases
> ServerTestUtils
> AuthTestUtils
> TestMemtable
>
> While this would also fix "MyTestSplitN" cases, I consider this to be way
> more invasive to the current codebase because we would have to rename a lot
> of files (like, hundreds, plus a lot of places they are referenced / used
> at) and I do not even think this is necessary and desirable. Like, how
> would we call "FuzzTestBase" to not contain "Test" in it? I think having
> "test" string in the middle is just fine, specifically for helper classes.
> One possible fix would be to replace "Test" with "Tester" so we would have
> "TesterMemtable", "AuthTesterUtils", "TesterHelper",
> "DistributedTesterSnitch" ... Word "Tester" would be explicitly allowed.
>
> I personally lean to the first approach even though cases like
> "MyTestSplit1" will not be spotted in compile time, but hey, we still do
> have a review process on top of this. The obvious violations like
> "TestThisFeature" and "MySuperTests" would be evaluated as violations which
> I would argue happens the most often.
>
> Please keep in mind that we have checkstyle only in 4.1 and trunk. So,
> while we would fix names of classes in all branches we support so they are
> picked up by CI, checkstyle for this would be introduced in 4.1 and trunk
> only.
>
> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964
> (2) https://github.com/apache/cassandra/blob/trunk/build.xml#L61
> (3) https://github.com/apache/cassandra/blob/trunk/build.xml#L1033
> (4) https://github.com/instaclustr/cassandra/commits/CASSANDRA-17964-4.1-2



-- 
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+