Posted to commits@cassandra.apache.org by "Alexander Dejanovski (Jira)" <ji...@apache.org> on 2021/01/25 16:59:00 UTC

[jira] [Commented] (CASSANDRA-16245) Implement repair quality test scenarios

    [ https://issues.apache.org/jira/browse/CASSANDRA-16245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271530#comment-17271530 ] 

Alexander Dejanovski commented on CASSANDRA-16245:
--------------------------------------------------

Status update:

The test scenarios described in this ticket were implemented and are now scheduled for [nightly runs in CircleCI|https://app.circleci.com/pipelines/github/riptano/cassandra-rtest?branch=trunk] against trunk.

We had to reduce the data density to 20GB per node for now, as the tests already take a while to run. We may generate additional data without adding more entropy to see how that affects execution times.

[One last PR|https://github.com/riptano/cassandra-rtest/pull/4] is waiting to be merged. It fixes the code style to follow the Cassandra code conventions, and complements the push-triggered CI runs with the CCM-based test scenarios, which are used for development purposes.

[~vinaykumarcse], are you still willing to review the code? I guess it can wait until we reach a consensus on whether to integrate this repair test into the Cassandra repo, but I'd be happy to get your feedback already.

 

> Implement repair quality test scenarios
> ---------------------------------------
>
>                 Key: CASSANDRA-16245
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16245
>             Project: Cassandra
>          Issue Type: Task
>          Components: Test/dtest/java
>            Reporter: Alexander Dejanovski
>            Assignee: Radovan Zvoncek
>            Priority: Normal
>             Fix For: 4.0-rc
>
>
> Implement the following test scenarios in a new test suite for repair integration testing with significant load:
> Generate/restore a workload of ~100GB per node. Medusa should be considered for creating the initial backup, which could then be restored from an S3 bucket to speed up node population.
>  Data should be generated so that it deliberately requires repair.
> Perform repairs on a 3-node cluster with 4 cores and 16GB-32GB RAM per node (m5d.xlarge instances would be the most cost-efficient type).
>  Repaired keyspaces will use RF=3 or RF=2 in some cases (the latter is for subranges with different sets of replicas).
> ||Mode||Version||Settings||Checks||
> |Full repair|trunk|Sequential + All token ranges|"No anticompaction (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range"|
> |Full repair|trunk|Parallel + Primary range|"No anticompaction (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range"|
> |Full repair|trunk|Force terminate repair shortly after it was triggered|Repair threads must be cleaned up|
> |Subrange repair|trunk|Sequential + single token range|"No anticompaction (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range"|
> |Subrange repair|trunk|Parallel + 10 token ranges which have the same replicas|"No anticompaction (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range
> A single repair session will handle all subranges at once"|
> |Subrange repair|trunk|Parallel + 10 token ranges which have different replicas|"No anticompaction (repairedAt==0)
>  Out of sync ranges > 0
>  Subsequent run must show no out of sync range
> More than one repair session is triggered to process all subranges"|
> |Incremental repair|trunk|"Parallel (mandatory)
>  No compaction during repair"|"Anticompaction status (repairedAt != 0) on all SSTables
>  No pending repair on SSTables after completion (may require waiting a bit, as this happens asynchronously)
>  Out of sync ranges > 0 + Subsequent run must show no out of sync range"|
> |Incremental repair|trunk|"Parallel (mandatory)
>  Major compaction triggered during repair"|"Anticompaction status (repairedAt != 0) on all SSTables
>  No pending repair on SSTables after completion (may require waiting a bit, as this happens asynchronously)
>  Out of sync ranges > 0 + Subsequent run must show no out of sync range"|
> |Incremental repair|trunk|Force terminate repair shortly after it was triggered.|Repair threads must be cleaned up|
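
For illustration only, the "Checks" column of the table above could be expressed roughly as follows. This is a minimal sketch and not the code from cassandra-rtest: the SSTableInfo and RepairRun types, their fields and both check functions are hypothetical names, and it assumes the repair metadata (repairedAt, pending repair, out of sync range count) has already been collected from the cluster by some other means.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SSTableInfo:
    """Hypothetical summary of one SSTable's repair metadata."""
    repaired_at: int      # 0 means the SSTable is marked unrepaired
    pending_repair: bool  # True while a repair session still owns the SSTable


@dataclass
class RepairRun:
    """Hypothetical summary of what was observed after one repair run."""
    out_of_sync_ranges: int
    sstables: List[SSTableInfo]


def _check_out_of_sync(first: RepairRun, second: RepairRun) -> List[str]:
    # Shared by all modes: the first run must find differences,
    # the subsequent run must find none.
    failures = []
    if first.out_of_sync_ranges <= 0:
        failures.append("first run found no out of sync range")
    if second.out_of_sync_ranges != 0:
        failures.append("subsequent run still found out of sync ranges")
    return failures


def check_full_or_subrange(first: RepairRun, second: RepairRun) -> List[str]:
    """Checks for full and subrange repair: no anticompaction (repairedAt==0)."""
    failures = _check_out_of_sync(first, second)
    if any(s.repaired_at != 0 for s in first.sstables):
        failures.append("anticompaction detected (repairedAt != 0)")
    return failures


def check_incremental(first: RepairRun, second: RepairRun) -> List[str]:
    """Checks for incremental repair: all SSTables repaired, none pending."""
    failures = _check_out_of_sync(first, second)
    if any(s.repaired_at == 0 for s in first.sstables):
        failures.append("some SSTables are still unrepaired (repairedAt == 0)")
    if any(s.pending_repair for s in first.sstables):
        failures.append("some SSTables still have a pending repair")
    return failures


if __name__ == "__main__":
    # Made-up observations for a full repair that behaved as expected.
    first = RepairRun(out_of_sync_ranges=12, sstables=[SSTableInfo(0, False)])
    second = RepairRun(out_of_sync_ranges=0, sstables=[SSTableInfo(0, False)])
    print(check_full_or_subrange(first, second))
```

An empty list means every check for that mode passed. Since the pending repair flag is cleared asynchronously, a real test would likely need to poll before evaluating check_incremental.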


