Posted to commits@cassandra.apache.org by "Lyuben Todorov (JIRA)" <ji...@apache.org> on 2014/04/11 00:32:17 UTC

[jira] [Updated] (CASSANDRA-6572) Workload recording / playback

     [ https://issues.apache.org/jira/browse/CASSANDRA-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lyuben Todorov updated CASSANDRA-6572:
--------------------------------------

    Attachment: 6572-trunk.diff

The process goes like this:
# We enable recording via JMX SS#enableQueryRecording
# Execute n queries
# Replay queries to a new cluster using tools/bin/workloadreplayer

Running through an example:

# Make a JMX call to SS#enableQueryRecording, supplying 5 for a 5MB log limit, 4 to record every fourth query (1 in 4), and finally {{/var/lib/cassadra/querylog}} as the directory for the logs
# Insert 100k rows
This should result in two query logs: one that is 5MB and has been renamed to include a timestamp in its name, and another named QueryLog.log. Between the two logs there should be 25k queries (100k inserts sampled at 1 in 4).
# Replay the logs with the replay tool (workloadreplayer), supplying first the directory of the query logs and then various flags ([see git branch here|https://github.com/lyubent/cassandra/commit/526672982870bec49e5b234e8d11ef5e1f17cd28#diff-91cd490dd94b74e10ade733f61dc6ab7R207]), e.g.:
{{./tools/bin/workloadreplayer /Users/lyubentodorov/Desktop/Log/ -t 1000000}}
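
To make the arithmetic in the example concrete, here is a minimal sketch of 1-in-N sampling with size-based segment rotation. It is not code from the patch: the class {{SampledQueryLog}} and all of its methods are hypothetical names for illustration, and it only counts bytes rather than writing files.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sampler (not from the patch): keeps every Nth query and rotates by size. */
class SampledQueryLog {
    private final int frequency;          // record 1 in `frequency` queries
    private final long segmentLimitBytes; // rotate once the current segment would exceed this
    private long seen = 0;
    private long recorded = 0;
    private long currentSegmentBytes = 0;
    private int rotatedSegments = 0;

    SampledQueryLog(int frequency, long segmentLimitBytes) {
        if (frequency < 1 || segmentLimitBytes < 1) {
            throw new IllegalArgumentException("frequency and limit must be positive");
        }
        this.frequency = frequency;
        this.segmentLimitBytes = segmentLimitBytes;
    }

    /** Returns true if this query was sampled into the log. */
    synchronized boolean maybeRecord(String query) {
        seen++;
        if (seen % frequency != 0) {
            return false; // skipped by sampling
        }
        int size = query.getBytes(StandardCharsets.UTF_8).length;
        // Start a new segment when this entry would push a non-empty segment past the limit.
        if (currentSegmentBytes > 0 && currentSegmentBytes + size > segmentLimitBytes) {
            rotatedSegments++;
            currentSegmentBytes = 0;
        }
        currentSegmentBytes += size;
        recorded++;
        return true;
    }

    synchronized long recordedCount() { return recorded; }

    synchronized int rotatedSegments() { return rotatedSegments; }
}
```

With a frequency of 4, feeding 100,000 queries through {{maybeRecord}} leaves {{recordedCount()}} at 25,000, matching the 25k figure above. How many segments get rotated depends on the size of each recorded query, so the sketch makes no claim about the number of log files.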

Concerns:
Two synchronized blocks (one in [QueryProcessor#maybeLogQuery|https://github.com/lyubent/cassandra/commit/526672982870bec49e5b234e8d11ef5e1f17cd28#diff-9c19942eca6c858baad84e942b3c7e21R402] and the other in [QueryRecorder#append|https://github.com/lyubent/cassandra/commit/526672982870bec49e5b234e8d11ef5e1f17cd28#diff-7d2a64c8ee2a2b78b3f1921e673b423eR73]) have been added on the read path, but since these blocks are only hit when query logging is enabled, they shouldn't hinder performance where it matters most.
I've used the thrift client, so I'm not sure whether query routing will be optimal.
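
The reasoning behind the first concern can be sketched as follows. This is an illustration of the general pattern (a volatile flag checked before taking a lock), not the code in the patch; the class {{GuardedRecorder}} and its fields are hypothetical, and only the method name {{maybeLogQuery}} is borrowed from the links above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical recorder (not from the patch): the lock is only taken when recording is on. */
class GuardedRecorder {
    private volatile boolean enabled = false; // volatile so a toggle is visible to all threads
    private final Object lock = new Object();
    private final List<String> buffer = new ArrayList<>();

    void enable() { enabled = true; }

    void disable() { enabled = false; }

    /** Returns true if the query was recorded. */
    boolean maybeLogQuery(String query) {
        if (!enabled) {
            return false; // fast path: a single volatile read, no lock acquired
        }
        synchronized (lock) {
            buffer.add(query); // slow path: only reached while recording is enabled
        }
        return true;
    }

    int size() {
        synchronized (lock) {
            return buffer.size();
        }
    }
}
```

When recording is disabled, each call costs one volatile read; contention on the lock is only possible while recording is switched on.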

Feature branch [here|https://github.com/lyubent/cassandra/tree/6572], also attaching a patch for trunk. I'll patch this for cassandra-2.0 tomorrow :)

> Workload recording / playback
> -----------------------------
>
>                 Key: CASSANDRA-6572
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6572
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core, Tools
>            Reporter: Jonathan Ellis
>            Assignee: Lyuben Todorov
>             Fix For: 2.0.8
>
>         Attachments: 6572-trunk.diff
>
>
> "Write sample mode" gets us part way to testing new versions against a real world workload, but we need an easy way to test the query side as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)