You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "David Alves (JIRA)" <ji...@apache.org> on 2012/08/20 14:28:39 UTC
[jira] [Comment Edited] (CASSANDRA-1123) Allow tracing query
details
[ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437816#comment-13437816 ]
David Alves edited comment on CASSANDRA-1123 at 8/20/12 11:28 PM:
------------------------------------------------------------------
Attached is a patch that implements almost everything mentioned:
- Trace is implemented asynchronously with a new Stage that has a threadpool with a single thread that refuses to execute when the queue if full ( as suggested a warn is logged, experiments say that under huge load 0.001 traceProbability still works without rejecting events). (also this threadpool is the only one that does not propagate trace events).
- Allows to enable tracing/disable tracing from cli. Also enable tracing has two parameters traceProbability (the prop that any single request from that client gets traced) and maxTraceNumber to allow to set a maximum number of traces to do (-1 set this to Integer.MAX_INT which is also the default)
- Adds the possibility to enable tracing in stress (using -tr probability [optionally maxNumTraces]
- TraceEvents can be build using a fluent builder (TraceEventBuilder) that is also able to deserialize events (both from thrift and from IColumns).
- All requests in CassandraServer start a tracing session when tracing is enabled and all parameters are stored along with the request details. This is done using a ThriftType that is able to serialize and deserialize thrift objects into Cassandra.
- User Request/Reply, Stage Start/Finish and Message Request/Reply are traced along with specific custom requests (such as apply_mutation, and get_column_family)
- Cli contains two other commands to explore traces:
show tracing summary [request_name] - this displays a summary for a request type:
{code}
Summary for sessions of request: batch_mutate
Total Sessions: 500
Total Events: 3190
==============================================================================
| Avg. | StdDev. | Max. | Min. | 99% | Unit |
=====================================================================================================
| Latency | 22.48 | 28.68 | 293.90 | 0.56 | 150.38| msec |
|---------------------------------------------------------------------------------------------------|
| Batch Rows | 1.00 | 0.00 | 1.00 | 1.00 | 1.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Mutations | 5.00 | 0.00 | 5.00 | 5.00 | 5.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Deletions | 0.00 | 0.00 | 0.00 | 0.00 | 0.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Columns | 5.00 | 0.00 | 5.00 | 5.00 | 5.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Counters | 0.00 | 0.00 | 0.00 | 0.00 | 0.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Super Col. | 0.00 | 0.00 | 0.00 | 0.00 | 0.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Written Bytes | 170.00 | 0.00 | 170.00 | 170.00 | 170.00| amount/req |
|---------------------------------------------------------------------------------------------------|
Quickest Request sessionId: 135b0c60-eac0-11e1-0000-fe8ebeead9ff
Slowest Request sessionId: f09c3e10-eabf-11e1-0000-fe8ebeead9ff
{code}
explain trace session [sessionId] - displays the complete set of events of a tracing session along with a to-scale timeline for its execution.
{code}
Session Summary: f09c3e10-eabf-11e1-0000-fe8ebeead9ff
Request: batch_mutate
Coordinator: /127.0.0.1
Interacting nodes: 2 {[/127.0.0.1, /127.0.0.2]}
Duration: 293901000 nano seconds (293.901 msecs)
Consistency Level: ONE
Request Timeline:
||--------------------------------------------------------------------------------------------------|
Se;/Msg /Se
|-||
Msg,apply_mutation;St:[Mutation];/St:[Mutation]
Caption:
Se - Session start (start of request: batch_mutate)
/Se - Session end (end of request: batch_mutate)
St[StageName] - Begin of Stage with StageName (e.g. Mutation)
/St[StageName] - End of Stage with StageName
Msg - Arrival of message from coordinator
/Msg - Reply to message from coordinator
??? - Custom traces
NOTE: timelines are drawn to scale but precision is limited to 100 positions.
When multiple events occur in the same position (due to lack of precision) they are separated by ','
When two positions are too close together trace names are separated by ';'
Full event list (in order of occurrence)
Id: f09c3e11-eabf-11e1-0000-fe8ebeead9ff Name: batch_mutate Type: SESSION_START Source: /127.0.0.1 Timestamp (millis): 1345464589425 Duration (nanos): 73000
Id: f09c8c30-eabf-11e1-0000-fe8ebeead9ff Name: MessageDeparture[47681] Type: MESSAGE_DEPARTURE Source: /127.0.0.1 Timestamp (millis): 1345464589427 Duration (nanos): 2003000
Id: f0c63440-eabf-11e1-0000-7fd66bb03abf Name: MessageArrival[47681] Type: MESSAGE_ARRIVAL Source: /127.0.0.2 Timestamp (millis): 1345464589700 Duration (nanos): 47000
Id: f0c745b0-eabf-11e1-0000-7fd66bb03abf Name: MutationStage Type: STAGE_START Source: /127.0.0.2 Timestamp (millis): 1345464589707 Duration (nanos): 7045000
Id: f0c745b1-eabf-11e1-0000-7fd66bb03abf Name: apply_mutation Type: MISC Source: /127.0.0.2 Timestamp (millis): 1345464589707 Duration (nanos): 25000
Id: f0c793d0-eabf-11e1-0000-7fd66bb03abf Name: MutationStage Type: STAGE_FINISH Source: /127.0.0.2 Timestamp (millis): 1345464589709 Duration (nanos): 9617000
Id: f0c91a70-eabf-11e1-0000-fe8ebeead9ff Name: SESSION_END Type: SESSION_END Source: /127.0.0.1 Timestamp (millis): 1345464589719 Duration (nanos): 293901000
{code}
I timeline from a fully distributed deployment (this was from ccm) would be much more balanced (the lower line would be bigger)
NOTES:
- Some parameters are hard-coded (such as the tracing queue, the ttl for trace events and the number of trace events to fetch for the summaries). It might be better to configure these somewhere but I've refrained from adding more config variables without consensus.
- There should be more traces.
- I implemented a distributed system keyspace to be used elsewhere but in the end reverted to using the trace keyspace, reasoning was that a distributed system table might want different things than a tracing system (replication factor for instance). Since tracing has its own keyspace also kept CFMetadata there (Im glad to move it if it makes sense even without the dist systable or if the consensus is that trace should be there)
- In order to display the summarizing cli command the commons math jar is required (for statistics)
- I ended up changing the schema to:
{code}
CREATE KEYSPACE trace WITH strategy_class = 'SimpleStrategy'
AND strategy_options:replication_factor = '1';
CREATE TABLE trace.trace_events(sessionId timeuuid,
coordinator inet,
eventId timeuuid,
description text,
duration bigint,
happened_at timestamp,
name text,
payload_types map<text, text>,
payload map<text, blob>,
source inet,
type text,
PRIMARY KEY (sessionId, coordinator, eventId));
CREATE INDEX idx_name ON trace.trace_events (name);
{code}
Note the missing Sessions table (all the info there was redundant) and the added fields for payload.
Finally I'm having issues using the index from thrift. Is there an incompatibility (i.e., using an IndexExpression with a CQL3 table?).
- Tests pass
was (Author: dr-alves):
Attached is a patch that implements almost everything mentioned:
- Trace is implemented asynchronously with a new Stage that has a threadpool with a single thread that refuses to execute when the queue if full ( as suggested a warn is logged, experiments say that under huge load 0.001 traceProbability still works without rejecting events). (also this threadpool is the only one that does not propagate trace events).
- Allows to enable tracing/disable tracing from cli. Also enable tracing has two parameters traceProbability (the prop that any single request from that client gets traced) and maxTraceNumber to allow to set a maximum number of traces to do (-1 set this to Integer.MAX_INT which is also the default)
- Adds the possibility to enable tracing in stress (using -tr probability [optionally maxNumTraces]
- TraceEvents can be build using a fluent builder (TraceEventBuilder) that is also able to deserialize events (both from thrift and from IColumns).
- All requests in CassandraServer start a tracing session when tracing is enabled and all parameters are stored along with the request details. This is done using a ThriftType that is able to serialize and deserialize thrift objects into Cassandra.
- User Request/Reply, Stage Start/Finish and Message Request/Reply are traced along with specific custom requests (such as apply_mutation, and get_column_family)
- Cli contains two other commands to explore traces:
show tracing summary [request_name] - this displays a summary for a request type:
{quote}
Summary for sessions of request: batch_mutate
Total Sessions: 500
Total Events: 3190
==============================================================================
| Avg. | StdDev. | Max. | Min. | 99% | Unit |
=====================================================================================================
| Latency | 22.48 | 28.68 | 293.90 | 0.56 | 150.38| msec |
|---------------------------------------------------------------------------------------------------|
| Batch Rows | 1.00 | 0.00 | 1.00 | 1.00 | 1.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Mutations | 5.00 | 0.00 | 5.00 | 5.00 | 5.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Deletions | 0.00 | 0.00 | 0.00 | 0.00 | 0.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Columns | 5.00 | 0.00 | 5.00 | 5.00 | 5.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Counters | 0.00 | 0.00 | 0.00 | 0.00 | 0.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Super Col. | 0.00 | 0.00 | 0.00 | 0.00 | 0.00| amount/req |
|---------------------------------------------------------------------------------------------------|
| Written Bytes | 170.00 | 0.00 | 170.00 | 170.00 | 170.00| amount/req |
|---------------------------------------------------------------------------------------------------|
Quickest Request sessionId: 135b0c60-eac0-11e1-0000-fe8ebeead9ff
Slowest Request sessionId: f09c3e10-eabf-11e1-0000-fe8ebeead9ff
{quote}
explain trace session [sessionId] - displays the complete set of events of a tracing session along with a to-scale timeline for its execution.
{quote}
Session Summary: f09c3e10-eabf-11e1-0000-fe8ebeead9ff
Request: batch_mutate
Coordinator: /127.0.0.1
Interacting nodes: 2 {[/127.0.0.1, /127.0.0.2]}
Duration: 293901000 nano seconds (293.901 msecs)
Consistency Level: ONE
Request Timeline:
||--------------------------------------------------------------------------------------------------|
Se;/Msg /Se
|-||
Msg,apply_mutation;St:[Mutation];/St:[Mutation]
Caption:
Se - Session start (start of request: batch_mutate)
/Se - Session end (end of request: batch_mutate)
St[StageName] - Begin of Stage with StageName (e.g. Mutation)
/St[StageName] - End of Stage with StageName
Msg - Arrival of message from coordinator
/Msg - Reply to message from coordinator
??? - Custom traces
NOTE: timelines are drawn to scale but precision is limited to 100 positions.
When multiple events occur in the same position (due to lack of precision) they are separated by ','
When two positions are too close together trace names are separated by ';'
Full event list (in order of occurrence)
Id: f09c3e11-eabf-11e1-0000-fe8ebeead9ff Name: batch_mutate Type: SESSION_START Source: /127.0.0.1 Timestamp (millis): 1345464589425 Duration (nanos): 73000
Id: f09c8c30-eabf-11e1-0000-fe8ebeead9ff Name: MessageDeparture[47681] Type: MESSAGE_DEPARTURE Source: /127.0.0.1 Timestamp (millis): 1345464589427 Duration (nanos): 2003000
Id: f0c63440-eabf-11e1-0000-7fd66bb03abf Name: MessageArrival[47681] Type: MESSAGE_ARRIVAL Source: /127.0.0.2 Timestamp (millis): 1345464589700 Duration (nanos): 47000
Id: f0c745b0-eabf-11e1-0000-7fd66bb03abf Name: MutationStage Type: STAGE_START Source: /127.0.0.2 Timestamp (millis): 1345464589707 Duration (nanos): 7045000
Id: f0c745b1-eabf-11e1-0000-7fd66bb03abf Name: apply_mutation Type: MISC Source: /127.0.0.2 Timestamp (millis): 1345464589707 Duration (nanos): 25000
Id: f0c793d0-eabf-11e1-0000-7fd66bb03abf Name: MutationStage Type: STAGE_FINISH Source: /127.0.0.2 Timestamp (millis): 1345464589709 Duration (nanos): 9617000
Id: f0c91a70-eabf-11e1-0000-fe8ebeead9ff Name: SESSION_END Type: SESSION_END Source: /127.0.0.1 Timestamp (millis): 1345464589719 Duration (nanos): 293901000
{quote}
I timeline from a fully distributed deployment (this was from ccm) would be much more balanced (the lower line would be bigger)
NOTES:
- Some parameters are hard-coded (such as the tracing queue, the ttl for trace events and the number of trace events to fetch for the summaries). It might be better to configure these somewhere but I've refrained from adding more config variables without consensus.
- There should be more traces.
- I implemented a distributed system keyspace to be used elsewhere but in the end reverted to using the trace keyspace, reasoning was that a distributed system table might want different things than a tracing system (replication factor for instance). Since tracing has its own keyspace also kept CFMetadata there (Im glad to move it if it makes sense even without the dist systable or if the consensus is that trace should be there)
- In order to display the summarizing cli command the commons math jar is required (for statistics)
- I ended up changing the schema to:
{quote}
CREATE KEYSPACE trace WITH strategy_class = 'SimpleStrategy'
AND strategy_options:replication_factor = '1';
CREATE TABLE trace.trace_events(sessionId timeuuid,
coordinator inet,
eventId timeuuid,
description text,
duration bigint,
happened_at timestamp,
name text,
payload_types map<text, text>,
payload map<text, blob>,
source inet,
type text,
PRIMARY KEY (sessionId, coordinator, eventId));
CREATE INDEX idx_name ON trace.trace_events (name);
{quote}
Note the missing Sessions table (all the info there was redundant) and the added fields for payload.
Finally I'm having issues using the index from thrift. Is there an incompatibility (i.e., using an IndexExpression with a CQL3 table?).
- Tests pass
> Allow tracing query details
> ---------------------------
>
> Key: CASSANDRA-1123
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: David Alves
> Fix For: 1.2.0
>
> Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch
>
>
> In the spirit of CASSANDRA-511, it would be useful to tracing on queries to see where latency is coming from: how long did row cache lookup take? key search in the index? merging the data from the sstables? etc.
> The main difference vs setting debug logging is that debug logging is too big of a hammer; by turning on the flood of logging for everyone, you actually distort the information you're looking for. This would be something you could set per-query (or more likely per connection).
> We don't need to be as sophisticated as the techniques discussed in the following papers but they are interesting reading:
> http://research.google.com/pubs/pub36356.html
> http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
> http://www.usenix.org/event/nsdi07/tech/fonseca.html
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira