Posted to commits@cassandra.apache.org by "Alex Petrov (Jira)" <ji...@apache.org> on 2021/04/22 15:38:00 UTC

[jira] [Commented] (CASSANDRA-16262) 4.0 Quality: Coordination & Replication Fuzz Testing

    [ https://issues.apache.org/jira/browse/CASSANDRA-16262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17329209#comment-17329209 ] 

Alex Petrov commented on CASSANDRA-16262:
-----------------------------------------

We've done over 700 hours of fuzz testing with Harry. This is a very conservative estimate; counting all re-runs, I believe it was even more. Most of the tests ran on 5-node clusters, some with 4.0 only and some with mixed 3.0/4.0.

Clusters didn't hold much data (mostly under 10 GB), but I still have a good feeling about the results, since we've exercised a lot of combinations: different schemas (with simple and composite partition keys, with and without static columns, with ASC and DESC clustering keys, and with different value types); SELECT queries with read repair, paging, and ASC/DESC ordering; and different kinds of deletions (range tombstones, partition deletions, row deletions). We also ran tests with incremental repair and with repaired data tracking enabled.
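
As a rough illustration of how quickly these combinations multiply, here is a minimal Python sketch that enumerates schema variants. It is not Harry's API: the dimension names and the `schema_combinations` helper are hypothetical and only mirror some of the schema options listed above.

```python
from itertools import product

# Hypothetical dimensions mirroring the schema options named in the comment.
# These names are illustrative assumptions, not part of Harry.
SCHEMA_DIMENSIONS = {
    "partition_key": ["simple", "composite"],
    "static_columns": [True, False],
    "clustering_order": ["ASC", "DESC"],
}


def schema_combinations(dimensions):
    """Return every combination of the given dimensions as a list of dicts."""
    names = list(dimensions)
    value_lists = [dimensions[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


combos = schema_combinations(SCHEMA_DIMENSIONS)
print(len(combos))  # 2 * 2 * 2 = 8 schema variants
```

Each further dimension (value types, query kinds, deletion kinds) multiplies the count again, which is why generating the cases tends to be more practical than writing them by hand.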

Harry is now available as an artefact, and can be used as a library: 
https://repository.apache.org/content/repositories/snapshots/org/apache/cassandra/harry-core/
https://repository.apache.org/content/repositories/snapshots/org/apache/cassandra/harry-integration/

I will keep this ticket open until a small fuzz testing kit is merged into trunk, but I think it's fair to say that the fuzz testing prerequisite for coordination and replication is fulfilled.
cc [~cscotta] [~adelapena] [~blerer] [~aholmber]

> 4.0 Quality: Coordination & Replication Fuzz Testing
> ----------------------------------------------------
>
>                 Key: CASSANDRA-16262
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16262
>             Project: Cassandra
>          Issue Type: Task
>          Components: Test/fuzz
>            Reporter: Caleb Rackliffe
>            Assignee: Alex Petrov
>            Priority: Normal
>             Fix For: 4.0-rc
>
>
> CASSANDRA-16180, CASSANDRA-16181, and CASSANDRA-15977 have largely focused on auditing the existing tests around coordination, replication, and read-repair, respectively. We've expanded existing test cases, added coverage around components that we've refactored along the way, and added in-JVM dtest upgrade tests where possible.
> What remains is verifying the distributed read and write paths in the face of common operational events, namely node restarts, bootstrapping, decommission, and cleanup. If we can find a way to simulate these events, [Harry|https://github.com/apache/cassandra-harry] seems like a good candidate to host the verification logic itself.
> To keep things simple initially, I would propose that we start by testing simple read-only and write-only workloads (the former without read repair).


