Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2014/06/30 13:48:25 UTC

[jira] [Comment Edited] (CASSANDRA-6146) CQL-native stress

    [ https://issues.apache.org/jira/browse/CASSANDRA-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047575#comment-14047575 ] 

Benedict edited comment on CASSANDRA-6146 at 6/30/14 11:48 AM:
---------------------------------------------------------------

I've pushed a version of these changes [here|https://github.com/belliottsmith/cassandra/tree/6146-cqlstress]

I wanted to integrate the changes a bit more tightly with the old stress, so that we don't end up with two different stresses that are only nominally related. At the same time I wanted to address a few things I felt were important to set up so that future improvements are easy to introduce:

# We now generate partitions predictably, so when we perform queries we can be sure we're using data that is relevant to the partition we're operating over
# We explicitly generate multi-row partitions, with configurable distribution of clustering components
# We can support multiple queries / inserts simultaneously in the new path
# The new path is executed with a more standard syntax (it's executed with stress user, instead of stress write/read; can perform e.g. inserts/queries with "stress user ops(insert=1,query=10)" for a roughly 90/10 read/write workload)
# I've switched all configs to support the full range of distributions we supported previously (including for size, etc.)
# All old paths use the same partition generators as the new paths to keep maintenance and extension simpler
# I've moved a few more config parameters into the yaml
# We report partition and row statistics now
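
As a rough illustration of the first two points and of the ops ratio (this is not the code in the branch; the function names, the seed-to-key mapping and the uniform row-count distribution are all assumptions made for the example), predictable generation can be sketched by deriving everything about a partition from a seed, and the ops ratios by normalising relative weights:

```python
import random

def generate_partition(seed, min_rows=1, max_rows=5):
    """Rebuild the same multi-row partition from its seed (hypothetical helper)."""
    rng = random.Random(seed)  # private RNG: output depends only on the seed
    # Uniform row count is a stand-in for a configurable distribution.
    row_count = rng.randint(min_rows, max_rows)
    # Each row is (clustering component, value).
    rows = [(i, rng.getrandbits(32)) for i in range(row_count)]
    return {"key": "key-%d" % seed, "rows": rows}

def op_fractions(ratios):
    """Normalise relative op weights such as insert=1,query=10 into fractions."""
    total = sum(ratios.values())
    return {op: weight / total for op, weight in ratios.items()}
```

Because generate_partition(42) returns identical rows on every call, a query can be built against a clustering component that the matching insert is known to have written. With op_fractions({"insert": 1, "query": 10}), inserts come out at 1/11 of operations, which is why the workload is only approximately 90/10.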

Some other implications:
# To simplify matters and maintenance, I've stripped from the old paths support for super columns, indexes and multi-gets, as we did not typically seem to exercise these paths and these are probably best encapsulated with the new ones
# The old path now generates a lot more garbage, because the new path has to, so it will have slightly higher overhead than it did previously. We also only generate random data on the old path, so we may again see a decline in performance

Some things still to do in the near future; all are reasonably easy, but I wanted to limit the scope of the refactor:
# Support deletes
# Support partial inserts/deletes (currently insert only supports writing the whole partition)
# Support query result validation

The diff is quite big, but I think a lot of the changes are due to package movements. The basic functionality of your patch is left intact, so hopefully it shouldn't be too tricky to figure out what's happening now.


was (Author: benedict):
I've pushed a version of these changes [here|https://github.com/belliottsmith/cassandra/tree/6146-cqlstress]

I wanted to integrate the changes a bit more tightly with the old stress, so that we don't end up with two different stresses that are only nominally related. At the same time I wanted to address a few things I felt were important to set up so that future improvements are easy to introduce:

# We now generate partitions predictably, so when we perform queries we can be sure we're using data that is relevant to the partition we're operating over
# We explicitly generate multi-row partitions, with configurable distribution of clustering components
# We can support multiple queries / inserts simultaneously in the new path
# The new path is executed with a more standard syntax (it's executed with stress user, instead of stress write/read; can perform e.g. inserts/queries with "stress user ops(insert=1,query=10)" for a roughly 90/10 read/write workload)
# I've switched all configs to support the full range of distributions we supported previously (including for size, etc.)
# All old paths use the same partition generators as the new paths to keep maintenance and extension simpler
# I've moved a few more config parameters into the yaml

Some other implications:
# To simplify matters and maintenance, I've stripped from the old paths support for super columns, indexes and multi-gets, as we did not typically seem to exercise these paths and these are probably best encapsulated with the new ones
# The old path now generates a lot more garbage, because the new path has to, so it will have slightly higher overhead than it did previously. We also only generate random data on the old path, so we may again see a decline in performance

Some things still to do in the near future; all are reasonably easy, but I wanted to limit the scope of the refactor:
# Support deletes
# Support partial inserts/deletes (currently insert only supports writing the whole partition)
# Support query result validation

The diff is quite big, but I think a lot of the changes are due to package movements.

> CQL-native stress
> -----------------
>
>                 Key: CASSANDRA-6146
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6146
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
>            Assignee: T Jake Luciani
>             Fix For: 2.1.1
>
>         Attachments: 6146-v2.txt, 6146.txt, 6164-v3.txt
>
>
> The existing CQL "support" in stress is not worth discussing.  We need to start over, and we might as well kill two birds with one stone and move to the native protocol while we're at it.


