Posted to commits@cassandra.apache.org by "Michael Semb Wever (Jira)" <ji...@apache.org> on 2021/04/19 13:28:00 UTC
[jira] [Comment Edited] (CASSANDRA-16604) Parallelise docker container runs for tests in ci-cassandra.a.o
[ https://issues.apache.org/jira/browse/CASSANDRA-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324852#comment-17324852 ]
Michael Semb Wever edited comment on CASSANDRA-16604 at 4/19/21, 1:27 PM:
--------------------------------------------------------------------------
Patch at
- https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16604
The patch
- for most test types: creates inner splits (run by docker containers) based on available CPU and memory
- retries {{`git clone …`}} commands (a common failure on ci-cassandra is git timeouts)
- replaces any gitbox git clones with github (which times out less)
- tidies up where/how the code is built before running tests (can shave off a minute from container runs)
- removes any remaining occurrences of {{`-Dtest.runners=1`}}
- executes test scripts directly, instead of via {{sh}}
- updates job description to document use of nightlies.a.o (in second commit)
- grabs and archives the jenkins console logs (in second commit)
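
As an illustration of the retry idea only (this is not the code in the patch), a minimal bash sketch might look like the following. The helper name {{retry}}, the attempt count, and the delay are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the patch:
# run a command up to <attempts> times, sleeping <delay> seconds between tries.
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local n=1
    # Re-run the command until it succeeds or the attempts are used up
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "failed after ${n} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${n} failed, retrying in ${delay}s: $*" >&2
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example usage (attempt count, delay, and branch are placeholders):
# retry 3 10 git clone --depth 1 -b trunk https://github.com/apache/cassandra.git
```

Wrapping the clone this way means a transient git timeout costs one delay rather than failing the whole container run.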
Basic CI
- https://ci-cassandra.apache.org/job/Cassandra-devbranch-test-parallel/6/
- reports one quarter less build time (with a parallelism of two docker containers). Further time can be saved by using local docker images and keeping the images updated with the latest {{`~/.m2/repository/`}}.
Before committing, more extensive CI is required
- every test target, every release branch, every arch (amd and arm)
Side note… this type of work remains limited in testability, and is leading to a fair amount of churn in the cassandra-builds repository. Local Jenkins testing, and copying temporary Jenkins jobs in ci-cassandra.a.o, has been the primary approach so far. But a more robust, repeatable, accessible approach would be to create a test pipeline script using the Jenkins k8s operator. For those with access to a k8s cluster this would make it possible to set up Jenkins from scratch, run a CI pipeline, and tear it down, from a single command line. More info: https://jenkinsci.github.io/kubernetes-operator/docs/getting-started/latest/deploy-jenkins/
was (Author: michaelsembwever):
Patch at
- https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16604
The patch
- for most test types: creates inner splits (run by docker containers) based on available CPU and memory
- retries {{`git clone …`}} commands (a common failure on ci-cassandra is git timeouts)
- replaces any gitbox git clones with github (which times out less)
- tidies up where/how the code is built before running tests (can shave off a minute from container runs)
- removes any remaining occurrences of {{`-Dtest.runners=1`}}
- requires the test scripts to be executed with {{bash}} instead of {{sh}}
Basic CI
- https://ci-cassandra.apache.org/job/Cassandra-devbranch-test-parallel/6/
- reports one quarter less build time (with a parallelism of two docker containers). Further time can be saved by using local docker images and keeping the images updated with the latest {{`~/.m2/repository/`}}.
Before committing, more extensive CI is required
- every test target, every release branch, every arch (amd and arm)
Side note… this type of work remains limited in testability, and is leading to a fair amount of churn in the cassandra-builds repository. Local Jenkins testing, and copying temporary Jenkins jobs in ci-cassandra.a.o, has been the primary approach so far. But a more robust, repeatable, accessible approach would be to create a test pipeline script using the Jenkins k8s operator. For those with access to a k8s cluster this would make it possible to set up Jenkins from scratch, run a CI pipeline, and tear it down, from a single command line. More info: https://jenkinsci.github.io/kubernetes-operator/docs/getting-started/latest/deploy-jenkins/
> Parallelise docker container runs for tests in ci-cassandra.a.o
> ---------------------------------------------------------------
>
> Key: CASSANDRA-16604
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16604
> Project: Cassandra
> Issue Type: Task
> Components: Test/unit
> Reporter: Michael Semb Wever
> Assignee: Michael Semb Wever
> Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0.x
>
>
> Ant test parallelism was raised on the dev ML, where the consensus was to remove it: https://lists.apache.org/thread.html/r1ca3c72b90fa6c57c1cb7dcd02a44221dcca991fe7392abd8c29fe95%40%3Cdev.cassandra.apache.org%3E
> The idea is to then replace ant test parallelism with docker container parallelism.
> PoC patch:
> https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16587-2/trunk
> This is just a quick PoC, aimed at the ci-cassandra agents that have
> 4 cores and 16 GB of RAM available to each executor, but I imagine instead
> something that spawns a number of containers based on system
> resources, like we currently do with get-cores and get-mem.
> Also worth noting is the overhead here: compared with the ant parallelism
> approach, docker builds everything in each container from scratch, but
> this too can be improved easily enough.
> Cleaning up any remnant `-Dtest.runners=` options is also part of this ticket.
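
To make the "number of containers based on system resources" idea concrete, here is a minimal bash sketch. It is not the patch's implementation and does not call the existing get-cores or get-mem scripts; the function name {{container_count}} and the per-container requirements (2 cores and 8 GB) are assumptions chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: how many containers fit on an executor, given its
# resources and an assumed per-container requirement.
# Arguments: <cores> <mem_gb> <cores_per_container> <mem_gb_per_container>
container_count() {
    local cores="$1" mem_gb="$2" cores_per="$3" mem_per_gb="$4"
    local by_cpu=$((cores / cores_per))
    local by_mem=$((mem_gb / mem_per_gb))
    # The scarcer resource limits the number of containers
    local count=$((by_cpu < by_mem ? by_cpu : by_mem))
    # Always run at least one container
    if [ "$count" -lt 1 ]; then
        count=1
    fi
    echo "$count"
}

# Executor with 4 cores and 16 GB, assuming 2 cores and 8 GB per container
container_count 4 16 2 8   # prints 2
```

Each container would then be handed its own split of the test list, so the count above doubles as the number of inner splits.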