You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2012/06/30 03:07:44 UTC
[jira] [Assigned] (CASSANDRA-1123) Allow tracing query details

     [ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-1123:
-----------------------------------------

    Assignee: David Alves  (was: Aaron Morton)

So, super high level view of how I think this should work:

Store events like this:
{code}
CREATE TABLE trace_sessions (
  session int,
  coordinator inetaddr,
  request text,
  PRIMARY KEY (session, coordinator)
);

CREATE TABLE trace_events (
  session     int,
  coordinator inetaddr,
  id     uuid,
  source inetaddr,
  event  text,
  happened_at timestamp,
  duration int,
  PRIMARY KEY (session, coordinator, id)
);
{code}

The session table will log the operation being processed, and the events will log its progress.

Coordinator and replicas will all just throw events into this table; they don't have to keep anything in memory.  Then we get the permanent record, AND clients can query the log directly without needing to add more special-case code to the coordinator.

One way to tell the client about session ids would be to change Aaron's API a bit: {{int system_trace_next_query()}}.  Then clients can just "SELECT * from trace_events WHERE coordinator = ? and session = ?" if desired.

Alternatively we could retain the "enable tracing on all queries until I turn it off" and add {{int system_get_last_query_session()}} but I prefer the one-method version for simplicity and because tracing probably shouldn't be left on indefinitely.

No idea how best to expose to a Thrift-less CQL.  Should we create a special-cased function there too?

Notes:
- coordinator will generate the session id locally.  Session ids will NOT be globally unique; (id, coordinator) pair identifies a series of events
- session comes before coordinator since we want id to be the partition key; this will create narrower rows
- "source" is the machine logging the event, not necessarily the same as coordinator (better name needed?)
- happened_at will come from currentTimeMillis, so will not be high resolution enough to give us the duration of many events; instead, we should use nanotime on coordinator (and replicas) to compute duration
- We should TTL event data (1 day?) both to conserve space and to avoid problems when event ids wrap
- Tracing N% of queries randomly and providing an external session id can be broken out into separate tickets
- Not sure if the threadlocal approach I originally advocated is a good one.  Certainly there are important aspects that are not covered by the "main" coordinator and replica threads: commitlog on the replica, and response processing on the coordinator; we may add more such thread pools in the future.
- Aaron's hacking of the slf4j adapator is clever, but I think we should leave it out; you'd still have to piece together what happened across multiple machines.  Logging into a columnfamily instead solves that problem and makes the data more accessible than log files.
- Probably worth encapsulating "log an event to the columnfamily if tracing is enabled, otherwise log at debug to log4j" into a utility method, though.
                
> Allow tracing query details
> ---------------------------
>
>                 Key: CASSANDRA-1123
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: David Alves
>             Fix For: 1.2
>
>         Attachments: 1123-3.patch.gz
>
>
> In the spirit of CASSANDRA-511, it would be useful to tracing on queries to see where latency is coming from: how long did row cache lookup take?  key search in the index?  merging the data from the sstables?  etc.
> The main difference vs setting debug logging is that debug logging is too big of a hammer; by turning on the flood of logging for everyone, you actually distort the information you're looking for.  This would be something you could set per-query (or more likely per connection).
> We don't need to be as sophisticated as the techniques discussed in the following papers but they are interesting reading:
> http://research.google.com/pubs/pub36356.html
> http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
> http://www.usenix.org/event/nsdi07/tech/fonseca.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira