Posted to commits@cassandra.apache.org by "Josh McKenzie (Jira)" <ji...@apache.org> on 2022/11/02 19:13:00 UTC

[jira] [Created] (CASSANDRA-18009) Tune parallelism for circleci jobs

Josh McKenzie created CASSANDRA-18009:
-----------------------------------------

             Summary: Tune parallelism for circleci jobs
                 Key: CASSANDRA-18009
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18009
             Project: Cassandra
          Issue Type: Task
          Components: Test/dtest/java, Test/dtest/python, Test/unit
            Reporter: Josh McKenzie


We should tune the parallelism parameters in our CircleCI config so that each job's worker count matches the number of tests it runs. From the email / Slack conversations on the topic:

{code}
> import math
> import os
> 
> def java_parallelism(src_dir, kind, num_file_in_worker, include = lambda a, b: True):
>     # Count the *Test.java files under <src_dir>/test/<kind> that pass the include filter.
>     d = os.path.join(src_dir, 'test', kind)
>     num_files = 0
>     for root, dirs, files in os.walk(d):
>         for f in files:
>             if f.endswith('Test.java') and include(os.path.join(root, f), f):
>                 num_files += 1
>     # Workers = files found / target files per worker, rounded down.
>     return math.floor(num_files / num_file_in_worker)
> 
> def fix_parallelism(args, contents):
>     jobs = contents['jobs']
> 
>     # Target files per worker: 20 for unit tests, 4 for JVM dtests, 2 for JVM upgrade dtests.
>     unit_parallelism                = java_parallelism(args.src, 'unit', 20)
>     jvm_dtest_parallelism           = java_parallelism(args.src, 'distributed', 4, lambda full, name: 'upgrade' not in full)
>     jvm_dtest_upgrade_parallelism   = java_parallelism(args.src, 'distributed', 2, lambda full, name: 'upgrade' in full)
{code}

bq. `TL;DR - I find all test files we are going to run, and based on a pre-defined variable that gives the “ideal” number of files per worker, I then calculate how many workers we need. So unit tests are num_files / 20 ~= 35 workers. Can I be “smarter” by knowing which files have higher cost? Sure… but the “perfect” and the “average” are so similar that it wasn’t worth it...`

Quoting [~dcapwell]
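
To make the arithmetic in the quote concrete, here is a minimal, self-contained sketch that runs the same counting logic against a throwaway directory tree. The helper names (count_test_files, workers_needed), the temporary layout, the file counts, and the clamp to at least one worker are all assumptions made for illustration; none of them come from the actual script. The filter here also receives a path relative to the test directory rather than the full path used in the quoted snippet, so the result does not depend on where the temporary directory lives.

```python
import math
import os
import tempfile


def count_test_files(test_dir, include=lambda rel, name: True):
    """Count files ending in 'Test.java' under test_dir that pass the filter."""
    num_files = 0
    for root, _dirs, files in os.walk(test_dir):
        for name in files:
            # Assumption: filter on the path relative to test_dir.
            rel = os.path.relpath(os.path.join(root, name), test_dir)
            if name.endswith('Test.java') and include(rel, name):
                num_files += 1
    return num_files


def workers_needed(num_files, files_per_worker):
    """Round down as in the quoted snippet; the clamp to 1 is an assumption."""
    return max(1, math.floor(num_files / files_per_worker))


# Hypothetical layout: 9 regular distributed tests and 5 upgrade tests.
with tempfile.TemporaryDirectory() as src:
    dist = os.path.join(src, 'test', 'distributed')
    regular = os.path.join(dist, 'test')
    upgrade = os.path.join(dist, 'upgrade')
    os.makedirs(regular)
    os.makedirs(upgrade)
    for i in range(9):
        open(os.path.join(regular, f'Example{i}Test.java'), 'w').close()
    for i in range(5):
        open(os.path.join(upgrade, f'Mixed{i}Test.java'), 'w').close()
    # Does not end in 'Test.java', so it is not counted.
    open(os.path.join(regular, 'Helper.java'), 'w').close()

    jvm_files = count_test_files(dist, lambda rel, name: 'upgrade' not in rel)
    upgrade_files = count_test_files(dist, lambda rel, name: 'upgrade' in rel)

print('jvm dtest workers:', workers_needed(jvm_files, 4))
print('jvm upgrade dtest workers:', workers_needed(upgrade_files, 2))
# Illustrative count only: 700 files at 20 per worker gives the ~35 from the quote.
print('unit workers for 700 files:', workers_needed(700, 20))
```

Rounding down means a job with fewer files than its per-worker target would compute zero workers without the clamp, which may be worth checking when the targets are tuned.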



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
