Posted to commits@cassandra.apache.org by "Michael Semb Wever (Jira)" <ji...@apache.org> on 2021/04/19 13:28:00 UTC

[jira] [Comment Edited] (CASSANDRA-16604) Parallelise docker container runs for tests in ci-cassandra.a.o

    [ https://issues.apache.org/jira/browse/CASSANDRA-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324852#comment-17324852 ] 

Michael Semb Wever edited comment on CASSANDRA-16604 at 4/19/21, 1:27 PM:
--------------------------------------------------------------------------

Patch at 
 - https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16604

The patch
 - for most test types: creates inner splits (run by docker containers) based on available CPU and memory
 - retries {{`git clone …`}} commands (a common failure on ci-cassandra is git timeouts)
 - replaces any gitbox git clones with github (which times out less)
 - tidies up where/how the code is built before running tests (can shave off a minute from container runs)
 - removes any remaining occurrences of {{`-Dtest.runners=1`}}
 - executes test scripts directly, instead of via {{sh}}
 - updates job description to document use of nightlies.a.o (in second commit)
 - grabs and archives the jenkins console logs (in second commit)
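
As a rough illustration of the retry item above (a sketch only, not the code from the patch), a wrapper along these lines would re-run a flaky command a few times before giving up. The function name {{retry}}, its arguments, and the repository URL in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# retry: hypothetical helper, not taken from the patch.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or max_attempts is reached.
retry() {
    local attempts="$1"
    local delay="$2"
    shift 2
    local n=1
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "failed after ${n} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${n} failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        n=$((n + 1))
    done
}

# Example (placeholder URL and target directory):
# retry 3 10 git clone --depth 1 https://github.com/apache/cassandra.git cassandra
```

A fixed delay keeps the sketch short; a growing delay between attempts would be an equally reasonable choice for timeouts against a busy git host.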

Basic CI
 - https://ci-cassandra.apache.org/job/Cassandra-devbranch-test-parallel/6/
 - reports a one-quarter reduction in build time (with a parallelism of two docker containers). Further time can be saved by using local docker images and keeping the images updated with the latest `~/.m2/repository/`.

Before committing, more extensive CI is required
 - every test target, every release branch, every arch (amd and arm)

Side note… this type of work remains limited in testability, and is leading to a fair amount of churn in the cassandra-builds repository. Local Jenkins testing, and copying temporary Jenkins jobs in ci-cassandra.a.o, have been the primary approach so far. But a more robust, repeatable, and accessible approach would be to create a test pipeline script using the Jenkins k8s operator. For those with access to a k8s cluster, this would make it possible to set up Jenkins from scratch, run a CI pipeline, and tear it down, all from a single command line. More info: https://jenkinsci.github.io/kubernetes-operator/docs/getting-started/latest/deploy-jenkins/


was (Author: michaelsembwever):
Patch at 
 - https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16604

The patch
 - for most test types: creates inner splits (run by docker containers) based on available CPU and memory
 - retries {{`git clone …`}} commands (a common failure on ci-cassandra is git timeouts)
 - replaces any gitbox git clones with github (which times out less)
 - tidies up where/how the code is built before running tests (can shave off a minute from container runs)
 - removes any remaining occurrences of {{`-Dtest.runners=1`}}
 - requires the test scripts to be executed with {{bash}} instead of {{sh}}

Basic CI
 - https://ci-cassandra.apache.org/job/Cassandra-devbranch-test-parallel/6/
 - reports a one-quarter reduction in build time (with a parallelism of two docker containers). Further time can be saved by using local docker images and keeping the images updated with the latest `~/.m2/repository/`.

Before committing, more extensive CI is required
 - every test target, every release branch, every arch (amd and arm)

Side note… this type of work remains limited in testability, and is leading to a fair amount of churn in the cassandra-builds repository. Local Jenkins testing, and copying temporary Jenkins jobs in ci-cassandra.a.o, have been the primary approach so far. But a more robust, repeatable, and accessible approach would be to create a test pipeline script using the Jenkins k8s operator. For those with access to a k8s cluster, this would make it possible to set up Jenkins from scratch, run a CI pipeline, and tear it down, all from a single command line. More info: https://jenkinsci.github.io/kubernetes-operator/docs/getting-started/latest/deploy-jenkins/

> Parallelise docker container runs for tests in ci-cassandra.a.o
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-16604
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16604
>             Project: Cassandra
>          Issue Type: Task
>          Components: Test/unit
>            Reporter: Michael Semb Wever
>            Assignee: Michael Semb Wever
>            Priority: Normal
>             Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0.x
>
>
> This was raised on the dev ML, where the consensus was to remove it: https://lists.apache.org/thread.html/r1ca3c72b90fa6c57c1cb7dcd02a44221dcca991fe7392abd8c29fe95%40%3Cdev.cassandra.apache.org%3E
> The idea is to then replace ant test parallelism with docker container parallelism.
> PoC patch: 
> https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16587-2/trunk
> This is just a quick PoC, aimed at the ci-cassandra agents that have
> 4 cores and 16gb ram available to each executor, but I imagine instead
> something that spawns a number of containers based on system
> resources, like we currently do with get-cores and get-mem. 
> Also worth noting is the overhead here: compared with the ant parallelism approach, docker
> builds everything in each container from scratch, but this too can be
> improved easily enough.
> Cleaning up any remnant `-Dtest.runners=` options is also part of this ticket.
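
The idea in the description of spawning a number of containers based on system resources could be sketched roughly as below. This is an assumption-laden illustration, not the logic of the patch or of the existing get-cores and get-mem helpers: the function name {{container_count}} and the per-container requirements passed to it (2 cores and 8 GB in the example) are hypothetical.

```shell
#!/usr/bin/env bash
# container_count: hypothetical helper, not the patch's actual logic.
# Usage: container_count <cores> <mem_gb> <cores_per_container> <mem_gb_per_container>
# Prints how many containers fit within both the CPU and the memory budget,
# and never fewer than one.
container_count() {
    local cores="$1"
    local mem_gb="$2"
    local cores_per="$3"
    local mem_per_gb="$4"
    local by_cpu=$(( $cores / $cores_per ))
    local by_mem=$(( $mem_gb / $mem_per_gb ))
    # The scarcer resource limits the number of containers.
    local n="$by_cpu"
    if [ "$by_mem" -lt "$n" ]; then
        n="$by_mem"
    fi
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Example: an executor with 4 cores and 16 GB, assuming 2 cores and 8 GB per container.
# container_count 4 16 2 8
```

Taking the minimum of the two quotients means that neither CPU nor memory is oversubscribed, at the cost of leaving the more plentiful resource partly idle.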


