Posted to dev@cassandra.apache.org by Mick Semb Wever <mc...@apache.org> on 2021/04/12 18:16:35 UTC

[DISCUSS] Remove support for `test.runners` and `testparallel`

Cassandra's build.xml supports parallel test runners. This
functionality is available through `-Dtest.runners` and the
`testparallel` ant macro.

It's always been there, but hasn't been active recently since both
ci-cassandra and circleci call testclasslist instead of test.

Recently testclasslist was updated to enable multiple runners too.
Since then we have seen a lot more test failures. The distributed
in-jvm tests just don't work with parallel runners, and currently they
need `-Dtest.runners=1` specified to work. There are also plenty of
flakies where tests use fixed ports (StorageServiceServerTest), use
byteman (e.g. BMUnitRunner), or rely on conf files on disk.
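
To illustrate the fixed-port problem, here is a minimal Python sketch. It is
not taken from the Cassandra code base, and the helper names are hypothetical.
Two runners that bind the same hard-coded port collide, while asking the OS
for an ephemeral port (port 0) avoids the clash.

```python
import socket


def bind_listener(port):
    """Bind a TCP listener on localhost; port 0 asks the OS for a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def fixed_port_collides():
    """Return True if a second 'runner' cannot bind the port the first holds."""
    first = bind_listener(0)             # stand-in for the first runner
    fixed_port = first.getsockname()[1]  # pretend this number is hard-coded
    try:
        try:
            second = bind_listener(fixed_port)  # second runner, same port
        except OSError:
            return True
        second.close()
        return False
    finally:
        first.close()
```

Calling `bind_listener(0)` twice gives each caller its own port, which is the
kind of change option (a) would need in every affected test.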

From here, I can see two ways forward, a) fix everything to be
parallel ready or b) remove test.runners and parallelise with docker
instead.

All in all, I think it is kinda odd to do (a) when docker is readily
available, especially on the CI servers where we are concerned about
build times.

For (b), an example patch that removes everything related to
'testparallel' and 'test.runners' from the build.xml is here:
https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/16587-2/trunk

Then replacing 'ant task parallelism' with docker containers would be
done something like this:
https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16587-2/trunk
(this is just a quick PoC, aimed at the ci-cassandra agents that have
4 cores and 16gb ram available to each executor, but I imagine instead
something that spawns a number of containers based on system
resources, like we currently do with get-cores and get-mem). Also
worth noting is the overhead here: unlike the ant approach, docker
builds everything in each container from scratch, but this too can be
improved easily enough.
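
As a rough sketch of the "spawn containers based on system resources" idea, in
Python: the function names, the per-container budgets and the round-robin
split are all illustrative assumptions, not what the PoC above does.

```python
def container_count(total_cores, total_mem_gb, cores_each, mem_gb_each):
    """How many containers fit, limited by whichever resource runs out first.

    cores_each and mem_gb_each are hypothetical per-container budgets.
    Always returns at least 1 so that the tests still run on a small host.
    """
    by_cores = total_cores // cores_each
    by_mem = total_mem_gb // mem_gb_each
    return max(1, min(by_cores, by_mem))


def split_round_robin(test_classes, n):
    """Deal the test classes into n lists, one list per container."""
    return [test_classes[i::n] for i in range(n)]
```

For example, on an executor with 4 cores and 16 GB, an assumed budget of
2 cores and 8 GB per container gives `container_count(4, 16, 2, 8) == 2`, and
each container would then run one of the two lists from `split_round_robin`.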

What are folks' opinions?

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org


Re: [DISCUSS] Remove support for `test.runners` and `testparallel`

Posted by Yifan Cai <yc...@gmail.com>.
+1 to remove ant test parallelism and leverage containers for it.

- Yifan

> On Apr 13, 2021, at 4:00 AM, Angelo Polo <la...@gmail.com> wrote:
> 
> Docker doesn't run natively on FreeBSD (though work is underway to enable
> that). It's possible to run Docker Machine inside VirtualBox so maybe
> that's workable, otherwise I suppose I can live without parallel testing
> for now since I'm probably the only one.
> 
> Best,
> Angelo


Re: [DISCUSS] Remove support for `test.runners` and `testparallel`

Posted by Angelo Polo <la...@gmail.com>.
Docker doesn't run natively on FreeBSD (though work is underway to enable
that). It's possible to run Docker Machine inside VirtualBox so maybe
that's workable, otherwise I suppose I can live without parallel testing
for now since I'm probably the only one.

Best,
Angelo

On Tue, Apr 13, 2021 at 10:59 AM Mick Semb Wever <mc...@apache.org> wrote:

> > +1 after chatting with Mick who clarified the picture for me. Thx Mick.
>
> 👍
>
> I'm +1 as well to removing test.runners and testparallel support, from
> all branches.
>
> CASSANDRA-16595 has been created.

Re: [DISCUSS] Remove support for `test.runners` and `testparallel`

Posted by Mick Semb Wever <mc...@apache.org>.
> +1 after chatting with Mick who clarified the picture for me. Thx Mick.

👍

I'm +1 as well to removing test.runners and testparallel support, from
all branches.

CASSANDRA-16595 has been created.


Re: [DISCUSS] Remove support for `test.runners` and `testparallel`

Posted by Berenguer Blasi <be...@gmail.com>.
+1 after chatting with Mick who clarified the picture for me. Thx Mick.

On 12/4/21 20:32, Brandon Williams wrote:
> While I'm certain we could push through all these tests to support
> parallelism, I think it will end up requiring continual work since
> there is a class of tests that won't always work under concurrency,
> but also that won't be immediately obvious until the damage is done.
>
> I'm +1 on punting to docker to parallelize.


Re: [DISCUSS] Remove support for `test.runners` and `testparallel`

Posted by Brandon Williams <dr...@gmail.com>.
While I'm certain we could push through all these tests to support
parallelism, I think it will end up requiring continual work since
there is a class of tests that won't always work under concurrency,
but also that won't be immediately obvious until the damage is done.

I'm +1 on punting to docker to parallelize.


Re: [DISCUSS] Remove support for `test.runners` and `testparallel`

Posted by David Capwell <dc...@apple.com.INVALID>.
+1 to remove in favor of CI handling. 
