Posted to commits@cassandra.apache.org by "Josh McKenzie (Jira)" <ji...@apache.org> on 2020/10/02 02:31:00 UTC

[jira] [Commented] (CASSANDRA-15588) 4.0 quality testing: Cluster Upgrade

    [ https://issues.apache.org/jira/browse/CASSANDRA-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205940#comment-17205940 ] 

Josh McKenzie commented on CASSANDRA-15588:
-------------------------------------------

[~xingh] We don't have a shepherd here, which is concerning item #1 (probably should hit up the ML for this and the couple of other tickets on this epic w/out one; I can take that on shortly). Diff testing workloads has proven to be one of the most powerful ways to confirm mixed version clusters are behaving, and we can reasonably expect an upgrade from a post CASSANDRA-8099 cluster to 4.0 to have significantly less risk and exposure to defects than an upgrade that straddles CASSANDRA-8099, as the former doesn't rely on LegacyLayout.

I don't see much of a path other than having a corpus of real user schemas we can run through cassandra-diff w/a generative workload and forward + reverse iteration to confirm correctness; we're not there yet, and while we should have the framework to accept that in the relatively near future, I think blocking a 4.0 release on us building up that collection of anonymized schemas is a pretty big set of unknowns.
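To make the "forward + reverse iteration" idea concrete, here's a rough sketch in Python. It is not how cassandra-diff is implemented; `make_scan`, `diff_partition`, and the in-memory dicts are all hypothetical stand-ins for reading the same partition from two clusters. The only point is that each comparison runs once in forward order and once in reverse, so a defect that only shows up on reversed reads still gets flagged.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[int, str]               # (clustering key, value); illustrative shape only
Scan = Callable[[bool], List[Row]]  # hypothetical: scan(reverse) -> rows in read order


def make_scan(rows: Dict[int, str]) -> Scan:
    """Build an in-memory stand-in for reading one partition from a cluster."""
    def scan(reverse: bool) -> List[Row]:
        return sorted(rows.items(), reverse=reverse)
    return scan


def diff_partition(old: Scan, new: Scan) -> List[str]:
    """Compare the two clusters' results in forward order, then in reverse order."""
    problems = []
    for reverse in (False, True):
        direction = "reverse" if reverse else "forward"
        old_rows, new_rows = old(reverse), new(reverse)
        if old_rows != new_rows:
            problems.append(f"{direction}: {old_rows} != {new_rows}")
    return problems


if __name__ == "__main__":
    before_upgrade = make_scan({1: "a", 2: "b"})
    after_upgrade = make_scan({1: "a", 2: "b"})
    print(diff_partition(before_upgrade, after_upgrade))
```

In a real run the two scans would be queries against the pre-upgrade and mixed-version clusters, driven by a generated workload over each schema in the corpus.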

So the current, relatively inadequate technical coverage we have (afaik) is in upgrade_tests in the dtest repo ([link|https://github.com/apache/cassandra-dtest/tree/master/upgrade_tests]). I frame it as inadequate because those tests didn't catch a bunch of the things that had sharp edges in the StorageEngine rewrite, so clearly our mixed version cluster testing wasn't as robust as we'd hoped.

Looks like we have a candidate to build from in UpgradeTestBase.java that we could start to specifically flesh out if we had a PoV on things to test. I'd advocate for us building more there rather than investing further in dtests.

So: long winded way to say "Hm. Not sure what we should do here."

[~jjirsa] - got any ideas in terms of a straw man proposal of things we should test on mixed version clusters from a unit perspective, or is integration the way to go here?

> 4.0 quality testing: Cluster Upgrade
> ------------------------------------
>
>                 Key: CASSANDRA-15588
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15588
>             Project: Cassandra
>          Issue Type: Task
>          Components: Test/dtest/java, Test/dtest/python
>            Reporter: Josh McKenzie
>            Priority: Normal
>             Fix For: 4.0-beta, 4.0-triage
>
>
> We've historically had numerous bugs concerning upgrading clusters from one version to the other. Let's establish the supported upgrade path and ensure that users can safely perform the upgrades in production.



