Posted to dev@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/04/14 03:05:57 UTC

[kudu-CR] WIP: KUDU-1410. Add per-request metrics

Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/2785

Change subject: WIP: KUDU-1410. Add per-request metrics
......................................................................

WIP: KUDU-1410. Add per-request metrics

(submitting a squashed patch for a gerrit run - will separate out into a
reasonable patch series later)

This is a WIP patch which adds a map of counters to each Trace object
and provides a nice macro to increment a given counter on the current
trace.

To keep things fast, the map is keyed by const char* pointers which
use pointer comparison rather than string comparison - this avoids
having to do any hashing on the increment path. We could probably
improve this further (e.g., by using a different, more optimized map type)
if we start seeing it on a profile.
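
For readers who want a concrete picture of the pointer-keyed map, here is a
minimal sketch. It is not the code in this patch: TraceCounters,
CurrentCounters() and SKETCH_TRACE_COUNTER_INCREMENT are hypothetical names
invented for illustration, and the real Trace class may differ in locking
and map type.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>

// Hypothetical stand-in for the per-trace counter map described above.
class TraceCounters {
 public:
  // 'name' must have static storage duration (typically a string literal or
  // a static array). The pointer itself is the key, so the characters are
  // never read or hashed on this path.
  void Increment(const char* name, int64_t amount) {
    std::lock_guard<std::mutex> l(lock_);
    counters_[name] += amount;
  }

  // Returns 0 for a counter that was never incremented.
  int64_t Get(const char* name) const {
    std::lock_guard<std::mutex> l(lock_);
    auto it = counters_.find(name);
    return it == counters_.end() ? 0 : it->second;
  }

 private:
  mutable std::mutex lock_;
  // std::less<const char*> orders by pointer value, not by string contents.
  std::map<const char*, int64_t, std::less<const char*>> counters_;
};

// Hypothetical accessor for the "current trace" of the calling thread.
inline TraceCounters*& CurrentCounters() {
  thread_local TraceCounters* current = nullptr;
  return current;
}

// Increments 'name' on the current thread's counters, or does nothing if
// no counters are attached to this thread.
#define SKETCH_TRACE_COUNTER_INCREMENT(name, amount)                 \
  do {                                                               \
    if (TraceCounters* c_ = CurrentCounters()) {                     \
      c_->Increment((name), (amount));                               \
    }                                                                \
  } while (0)
```

One consequence of pointer keys worth keeping in mind: two identical string
literals in different translation units are not guaranteed to share an
address, so a sketch like this would need to merge entries by string content
when dumping the counters.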

I also sprinkled various counters around the code that I thought
might be useful. A typical trace looks something like:

I0411 23:47:48.600158 23021 inbound_call.cc:230] Call
  kudu.tserver.TabletServerService.Write from 127.0.0.1:41490 (request call id 2029711) took 1125ms.
  Request Metrics:
{ "child_traces" : [ { "apply.queue_time_us" : 390,
        "cfile_cache_hit" : 5,
        "cfile_cache_hit_bytes" : 20510,
        "num_ops" : 1,
        "prepare.queue_time_us" : 8,
        "prepare.run_cpu_time_us" : 0,
        "prepare.run_wall_time_us" : 29
      } ] }

(pretty-printed using jsonlint -f -- the actual log is compact)

Per the referenced JIRA, I'm hoping that collecting these types of per-request
metrics and then sampling RPCs in buckets by percentile will make it really
easy to determine the general cause of slow requests, without having to worry
about plumbing request-specific metrics structures through every call stack.
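
As a rough illustration of the sampling idea, the sketch below keeps at most
one sampled request per latency bucket per second. It is an assumption-laden
simplification, not the RpczStore implementation: LatencySampler is a
hypothetical name, the bucket boundaries (1ms, 10ms, 100ms) are made up, the
buckets are fixed thresholds rather than percentiles, and the class is not
thread-safe.

```cpp
#include <array>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical sampler: one retained trace per latency bucket, refreshed at
// most once per second.
class LatencySampler {
 public:
  using Clock = std::chrono::steady_clock;

  // Maps a latency to a bucket index. The thresholds are illustrative only.
  static size_t BucketFor(int64_t latency_us) {
    if (latency_us < 1000) return 0;
    if (latency_us < 10000) return 1;
    if (latency_us < 100000) return 2;
    return 3;
  }

  // Records 'trace' as the sample for its bucket unless that bucket already
  // took a sample less than one second before 'now'. Returns true if the
  // request was sampled.
  bool MaybeSample(int64_t latency_us, const std::string& trace,
                   Clock::time_point now) {
    Bucket& b = buckets_[BucketFor(latency_us)];
    if (b.has_sample && now - b.sampled_at < std::chrono::seconds(1)) {
      return false;
    }
    b.has_sample = true;
    b.sampled_at = now;
    b.trace = trace;
    return true;
  }

  // Returns the most recent sample for 'bucket', or an empty string if the
  // bucket has never been sampled.
  const std::string& TraceFor(size_t bucket) const {
    return buckets_.at(bucket).trace;
  }

 private:
  struct Bucket {
    bool has_sample = false;
    Clock::time_point sampled_at;
    std::string trace;
  };
  std::array<Bucket, 4> buckets_;
};
```

Passing the timestamp in explicitly keeps the sketch easy to test; a real
implementation would more likely read the clock itself and guard the buckets
with a lock.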

rpc: clean up generated RPC service code

This changes the generated RPC service code to extract the actual
handling logic into a non-generated base class. The generated
code now just generates a map of method name to MethodInfo, each
of which contains the requisite information to handle an RPC.

This results in a small reduction in lines of non-generated code,
and a larger reduction in lines of generated code.

Furthermore, this refactor makes it easier to make improvements and
further cleanups on the RPC server side, since most changes now
won't need to affect the code generator.
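
The shape of this refactor can be sketched roughly as follows. This is a
simplified illustration under stated assumptions, not the actual Kudu code:
MethodInfo here holds only a handler, requests and responses are plain
strings, and ServiceBase, EchoServiceIf and EchoServiceImpl are hypothetical
names.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, heavily simplified MethodInfo: just enough to dispatch.
struct MethodInfo {
  std::function<std::string(const std::string&)> handler;
};

// Hand-written (non-generated) base class that owns the dispatch logic.
class ServiceBase {
 public:
  virtual ~ServiceBase() = default;

  // Looks up 'method' and runs its handler. Returns false if the method is
  // not registered, in which case 'response' is left untouched.
  bool Handle(const std::string& method, const std::string& request,
              std::string* response) const {
    auto it = methods_.find(method);
    if (it == methods_.end()) return false;
    *response = it->second.handler(request);
    return true;
  }

 protected:
  void Register(std::string name, MethodInfo info) {
    methods_.emplace(std::move(name), std::move(info));
  }

 private:
  std::unordered_map<std::string, MethodInfo> methods_;
};

// What the generated code could reduce to: populate the map and declare
// one virtual method per RPC.
class EchoServiceIf : public ServiceBase {
 public:
  EchoServiceIf() {
    Register("Echo", MethodInfo{[this](const std::string& req) {
      return Echo(req);
    }});
  }
  virtual std::string Echo(const std::string& req) = 0;
};

// Example user implementation of the service.
class EchoServiceImpl : public EchoServiceIf {
 public:
  std::string Echo(const std::string& req) override { return "echo:" + req; }
};
```

With this split, a change to lookup or error handling touches only the base
class, which is the benefit the paragraph above describes.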

Move to looking up the method from the reactor

rpc: refactor call trace logging to new RpczStore class

Add super simple "sample each RPC once a second" impl

Sample a few different latency buckets, dump on /rpcz

fix gcc compile error

Change-Id: I697c007d4945b945ca72ad18709ee8a8904f58ec
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/rpc/CMakeLists.txt
M src/kudu/rpc/connection.cc
M src/kudu/rpc/connection.h
M src/kudu/rpc/inbound_call.cc
M src/kudu/rpc/inbound_call.h
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/protoc-gen-krpc.cc
M src/kudu/rpc/rpc_context.cc
M src/kudu/rpc/rpc_context.h
M src/kudu/rpc/rpc_introspection.proto
M src/kudu/rpc/rpc_service.h
A src/kudu/rpc/rpcz_store.cc
A src/kudu/rpc/rpcz_store.h
M src/kudu/rpc/service_if.cc
M src/kudu/rpc/service_if.h
M src/kudu/rpc/service_pool.cc
M src/kudu/rpc/service_pool.h
M src/kudu/server/rpcz-path-handler.cc
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/util/CMakeLists.txt
M src/kudu/util/mutex.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
M src/kudu/util/trace-test.cc
M src/kudu/util/trace.cc
M src/kudu/util/trace.h
A src/kudu/util/trace_metrics.cc
A src/kudu/util/trace_metrics.h
34 files changed, 801 insertions(+), 179 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/2785/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2785

Gerrit-MessageType: newchange
Gerrit-Change-Id: I697c007d4945b945ca72ad18709ee8a8904f58ec
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>