Posted to dev@kudu.apache.org by Todd Lipcon <to...@cloudera.com> on 2016/03/02 03:55:04 UTC

RFC: consolidating tracing frameworks?

Hey folks,

I was briefly talking to some folks from the Impala podling about our
tracing infrastructure earlier today, and it occurred to me that it's
pretty messy that we basically have two distinct ways of tracing:

1) the original util/trace.h which is focused on request-scoped tracing.
This one maintains a thread-local for the request's trace context and allows
you to 'printf' messages into the trace, but not much more.

This one is basically always on in our current implementation, so that we
can dump RPC traces on the server side when RPCs are too slow, etc. The
overhead is noticeable but I've usually found the gain in debuggability to
be worth it. All reads and writes are derived from RPCs, so they have
traces associated, and this stuff is only ever disabled in background
operations like compactions/flushes (which currently have no trace).

2) the Chromium-derived util/debug/trace_event.h, which is process-wide
tracing and has no correspondence to individual requests. This one focuses
more on tracing of scopes and less on printf-style tracing (though that's
available with the TRACE_EVENT_INSTANT macros).

This one is only on when the user explicitly requests it. It's designed to
be very low overhead when off.


The fact that we have two separate tracing frameworks is somewhat ugly in
the code -- we have lots of places where we do something like:

{
  TRACE_EVENT0("txn", "AcquireRowLocks");
  TRACE("Acquiring row locks");
  ...
  TRACE("Done acquiring row locks");
}

so that an interesting scope is exposed to both tracing systems.

I'd like to propose the following:

- we mostly switch to the scope-based tracing and use the 'TRACE()' macro
more sparingly.
- we modify the trace_event tracing so that, if there is a request trace
active, we propagate these spans into the per-request tracing even if the
global tracing is not enabled.

I did a quick POC implementation in the last hour or so to see what it
looks like. Here's the example trace dump which shows up for an RPC:

0301 18:39:17.218478 (+     0us) trace_event_impl.cc:1873] prepare
[worker]: PrepareAndStartTask[txn]
0301 18:39:17.218486 (+     8us) trace_event_impl.cc:1873] prepare
[worker]: PrepareAndStart[txn], {txn:"0x7efc1400b5d0"}
0301 18:39:17.218491 (+     5us) trace_event_impl.cc:1873] prepare
[worker]: | WriteTransaction::Prepare[txn]
0301 18:39:17.218523 (+    32us) trace_event_impl.cc:1873] prepare
[worker]: | | Tablet::DecodeWriteOperations[tablet]
0301 18:39:17.218528 (+     5us) trace_event_impl.cc:1873] prepare
[worker]: | | | WriteTransactionState::AcquireSchemaLock[txn]
0301 18:39:17.218536 (+     8us) trace_event_impl.cc:2026] prepare
[worker]: | | | WriteTransactionState::AcquireSchemaLock[txn] (0.006 ms)
0301 18:39:17.218567 (+    31us) trace_event_impl.cc:2026] prepare
[worker]: | | Tablet::DecodeWriteOperations[tablet] (0.047 ms)
0301 18:39:17.218574 (+     7us) trace_event_impl.cc:1873] prepare
[worker]: | | Tablet::AcquireRowLocks[tablet], {num_locks:3}
0301 18:39:17.218620 (+    46us) trace_event_impl.cc:2026] prepare
[worker]: | | Tablet::AcquireRowLocks[tablet], {num_locks:3} (0.043 ms)
0301 18:39:17.218655 (+    35us) trace_event_impl.cc:2026] prepare
[worker]: | WriteTransaction::Prepare[txn] (0.136 ms)
0301 18:39:17.218664 (+     9us) trace_event_impl.cc:1873] prepare
[worker]: | WriteTransaction::Start[txn]
0301 18:39:17.218682 (+    18us) trace_event_impl.cc:1873] prepare
[worker]: | | Timestamp[txn], {ts:5967406519167664128}
0301 18:39:17.218687 (+     5us) trace_event_impl.cc:2026] prepare
[worker]: | WriteTransaction::Start[txn] (0.026 ms)
0301 18:39:17.218695 (+     8us) trace_event_impl.cc:1873] prepare
[worker]: | LocalConsensus::Replicate[consensus]
0301 18:39:17.218718 (+    23us) trace_event_impl.cc:1873] prepare
[worker]: | | Log::Reserve[log]
0301 18:39:17.218766 (+    48us) trace_event_impl.cc:2026] prepare
[worker]: | | Log::Reserve[log] (0.036 ms)
0301 18:39:17.218774 (+     8us) trace_event_impl.cc:1873] prepare
[worker]: | | Log::AsyncAppend[log]
0301 18:39:17.218799 (+    25us) trace_event_impl.cc:1873] prepare
[worker]: | | | Batch[log], {entry_size:489}
0301 18:39:17.218805 (+     6us) trace_event_impl.cc:2026] prepare
[worker]: | | Log::AsyncAppend[log] (0.031 ms)
0301 18:39:17.218811 (+     6us) trace_event_impl.cc:2026] prepare
[worker]: | LocalConsensus::Replicate[consensus] (0.116 ms)
0301 18:39:17.218816 (+     5us) trace_event_impl.cc:2026] prepare
[worker]: PrepareAndStart[txn], {txn:"0x7efc1400b5d0"} (0.332 ms)
0301 18:39:17.219127 (+   311us) trace_event_impl.cc:1873] apply [worker]:
WriteTransaction::Apply[txn]
0301 18:39:17.219203 (+    76us) trace_event_impl.cc:1873] apply [worker]:
| ProbeStats[tablet], {stats:"ProbeStats:
bloom_lookups=0,key_file_lookups=0,delta_file_lookups=0,mrs_lookups=0"}
0301 18:39:17.219216 (+    13us) trace_event_impl.cc:2026] apply [worker]:
WriteTransaction::Apply[txn] (0.143 ms)
0301 18:39:17.219228 (+    12us) trace_event_impl.cc:1873] apply [worker]:
AsyncAppendCommit[txn], {txn:"0x7efc1400b5d0"}
0301 18:39:17.219237 (+     9us) trace_event_impl.cc:1873] apply [worker]:
| Log::Reserve[log]
0301 18:39:17.219281 (+    44us) trace_event_impl.cc:2026] apply [worker]:
| Log::Reserve[log] (0.029 ms)
0301 18:39:17.219287 (+     6us) trace_event_impl.cc:1873] apply [worker]:
| Log::AsyncAppend[log]
0301 18:39:17.219304 (+    17us) trace_event_impl.cc:1873] apply [worker]:
| | Batch[log], {entry_size:34}
0301 18:39:17.219338 (+    34us) trace_event_impl.cc:2026] apply [worker]:
| Log::AsyncAppend[log] (0.023 ms)
0301 18:39:17.219349 (+    11us) trace_event_impl.cc:2026] apply [worker]:
AsyncAppendCommit[txn], {txn:"0x7efc1400b5d0"} (0.120 ms)
0301 18:39:17.219357 (+     8us) trace_event_impl.cc:1873] apply [worker]:
WriteTransaction::Finish[txn], {result:"COMMITTED"}
0301 18:39:17.219379 (+    22us) trace_event_impl.cc:1873] apply [worker]:
| Releasing row and schema locks[txn]
0301 18:39:17.219387 (+     8us) trace_event_impl.cc:1873] apply [worker]:
| WriteTransactionState::ReleaseSchemaLock[txn]
0301 18:39:17.219401 (+    14us) trace_event_impl.cc:2026] apply [worker]:
WriteTransaction::Finish[txn], {result:"COMMITTED"} (0.044 ms)
0301 18:39:17.219445 (+    44us) trace_event_impl.cc:1873] apply [worker]:
Releasing row and schema locks[txn]
0301 18:39:17.219450 (+     5us) trace_event_impl.cc:1873] apply [worker]:
WriteTransactionState::ReleaseSchemaLock[txn]

(obviously we'd want to remove the useless file/line numbers above, but
this was just a quick hack)

I think this is more useful than the manual printf-style tracing we tend to
add, and we've already got pretty reasonable coverage of most things that
might be slow (e.g. IO). One potential downside is that the amount of
tracing info for longer RPCs like scans might actually be somewhat
unmanageable, so we'd probably want to add the ability to set a given trace
category to only be included if specifically requested (or if globally
enabled for a short time period). Filtering traces on time duration would
also be useful (e.g. automatically drop events that took less than 1ms).

Do people think this is worth my spending a bit of time on? The above hack
was pretty quick, but would be a bit more work to measure and reduce
overhead, etc.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera

Re: RFC: consolidating tracing frameworks?

Posted by Todd Lipcon <to...@cloudera.com>.
On Wed, Mar 2, 2016 at 1:04 PM, Adar Dembo <ad...@cloudera.com> wrote:

> +1, I always found the presence of two tracing frameworks to be confusing
> and would support a consolidation effort.
>
> I also agree with filtering out <1ms events, maybe with a postscript that
> says "10 events filtered from trace".
>
Good idea

>
> You said that we'd use TRACE() more sparingly post-consolidation. Under
> what circumstances would it make sense to use? Or when you say "request
> trace" in the second point, you mean a TRACE() call? The idea being that
> the current proliferation of TRACE() calls would be too much overhead if
> spans were pushed into them?
>

I think we'd still use TRACE() for things where we really want to print a
free-form "log message", where there is no begin/end timing that's
interesting, and you just want to attach something to the request trace.

But in many cases where we actually have some structure, the
TRACE_EVENT_INSTANT macros are nicer, since they support lazy
stringification and a bit more structure (the resulting traces are easy to
machine-parse). In cases where we previously had a "Starting X..." followed
by a "Finished X", a TRACE_EVENTx() span makes a lot more sense.

-Todd





-- 
Todd Lipcon
Software Engineer, Cloudera

Re: RFC: consolidating tracing frameworks?

Posted by Adar Dembo <ad...@cloudera.com>.
+1, I always found the presence of two tracing frameworks to be confusing
and would support a consolidation effort.

I also agree with filtering out <1ms events, maybe with a postscript that
says "10 events filtered from trace".

You said that we'd use TRACE() more sparingly post-consolidation. Under
what circumstances would it make sense to use? Or when you say "request
trace" in the second point, you mean a TRACE() call? The idea being that
the current proliferation of TRACE() calls would be too much overhead if
spans were pushed into them?


