Posted to commits@kudu.apache.org by da...@apache.org on 2017/05/18 15:57:33 UTC

[1/5] kudu git commit: process_memory: go back to non-incremental tracking

Repository: kudu
Updated Branches:
  refs/heads/master c178563f6 -> 4347174d7


process_memory: go back to non-incremental tracking

The incremental tracking has more overhead than expected, even with our
fancy striped counters, etc. In tpch_real_world benchmarks, the
LongAdder::IncrementBy() and GetAllocatedSize() calls in the hooks
were taking a significant amount of CPU (in the top 10 consumers).

Here's yet another approach: back to non-incremental tracking, but with
an optimization: if we've calculated the memory usage within the last
50ms, or if another thread is already in the process of calculating it,
we don't re-calculate.

This has some minor risk of "overshooting" the memory limit, but since
our limiting is already probabilistic and not 100% thorough anyway, I
think that's acceptable.
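
The pattern described above can be sketched as a standalone snippet.
This is an illustrative simplification, not the patch itself:
std::atomic and std::mutex stand in for Kudu's Atomic64 and
simple_spinlock, and FetchExpensiveValue() is a hypothetical stand-in
for the slow GetTCMallocCurrentAllocatedBytes() call:

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the expensive query (in Kudu, asking
// tcmalloc for "generic.current_allocated_bytes").
static int64_t FetchExpensiveValue() {
  static int64_t calls = 0;
  return ++calls * 1000;  // pretend each refresh sees new consumption
}

// Return a cached value, refreshing it at most once per interval and
// from only one thread at a time; everyone else reuses the stale value.
int64_t CurrentConsumptionSketch() {
  static constexpr int64_t kReadIntervalMicros = 50000;
  static std::atomic<int64_t> last_read_time{0};
  static std::mutex read_lock;
  static std::atomic<int64_t> consumption{0};

  auto now_micros = [] {
    return std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now().time_since_epoch()).count();
  };

  int64_t now = now_micros();
  if (now > last_read_time.load(std::memory_order_relaxed) +
                kReadIntervalMicros &&
      read_lock.try_lock()) {
    consumption.store(FetchExpensiveValue(), std::memory_order_relaxed);
    // Re-fetch the time after the (possibly slow) fetch, so we wait a
    // full interval before refreshing again.
    last_read_time.store(now_micros(), std::memory_order_relaxed);
    read_lock.unlock();
  }
  return consumption.load(std::memory_order_relaxed);
}
```

On the hot path this reduces to a relaxed atomic load plus a failed
time comparison or try_lock, which is the intuition behind the
microbenchmark numbers above.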

Benchmarked using process_memory-test:

Before:
I0517 16:46:09.596946  7419 process_memory-test.cc:68] Performed 2.18441e+06 iters/sec

After:
I0517 16:45:50.497973  7051 process_memory-test.cc:68] Performed 1.22264e+08 iters/sec

(~56x speedup)

I also ran tpch_real_world and graphed CPU consumed per inserted row and
found it to be noticeably better with this optimized version
(approximately 7-8% lower, eyeballing the graph). I spot-checked perf results and
don't see any tcmalloc stat-related methods in the high consumers list
anymore. I also checked in this workload that the heap usage was
properly limited (indistinguishable from the before-patch
implementation).

Change-Id: I8823028de3ea260f1450d9bf34af2dc5a794b206
Reviewed-on: http://gerrit.cloudera.org:8080/6915
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/1c5d91b5
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/1c5d91b5
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/1c5d91b5

Branch: refs/heads/master
Commit: 1c5d91b588895009b516359cf59f76c9165bd2cc
Parents: c178563
Author: Todd Lipcon <to...@apache.org>
Authored: Wed May 17 16:42:40 2017 -0700
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu May 18 03:28:26 2017 +0000

----------------------------------------------------------------------
 src/kudu/tserver/tablet_server-stress-test.cc | 17 ++++---
 src/kudu/util/process_memory.cc               | 52 ++++++++--------------
 2 files changed, 28 insertions(+), 41 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/1c5d91b5/src/kudu/tserver/tablet_server-stress-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tserver/tablet_server-stress-test.cc b/src/kudu/tserver/tablet_server-stress-test.cc
index 8c4059f..761c62b 100644
--- a/src/kudu/tserver/tablet_server-stress-test.cc
+++ b/src/kudu/tserver/tablet_server-stress-test.cc
@@ -141,13 +141,16 @@ TEST_F(TSStressTest, TestMTInserts) {
   if (timeout_thread.joinable()) timeout_thread.join();
 
 #ifdef TCMALLOC_ENABLED
-  // In TCMalloc-enabled builds, verify that our incremental memory tracking matches the
-  // actual memory consumed, within half a percent.
-  int64_t consumption = process_memory::CurrentConsumption();
-  LOG(INFO) << "consumption: " << consumption;
-  ASSERT_NEAR(process_memory::GetTCMallocCurrentAllocatedBytes(),
-              consumption,
-              consumption * 0.005);
+  // In TCMalloc-enabled builds, verify that our memory tracking matches the
+  // actual memory consumed, within a short period of time (the memory tracking
+  // can lag by up to 50ms).
+  ASSERT_EVENTUALLY([&]() {
+      int64_t consumption = process_memory::CurrentConsumption();
+      LOG(INFO) << "consumption: " << consumption;
+      ASSERT_NEAR(process_memory::GetTCMallocCurrentAllocatedBytes(),
+                  consumption,
+                  consumption * 0.005);
+    });
 #endif
 }
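
Since the tracked value can now lag by up to 50ms, the test wraps its
check in ASSERT_EVENTUALLY. The retry idea behind that macro can be
sketched as follows; this is a hypothetical simplification (Kudu's real
macro integrates with gtest's fatal-failure machinery), shown here as a
plain predicate-polling helper:

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Run 'check' repeatedly until it passes or 'timeout' expires.
// Returns whether the predicate ever became true.
bool EventuallyTrue(
    const std::function<bool()>& check,
    std::chrono::milliseconds timeout = std::chrono::milliseconds(500)) {
  auto deadline = std::chrono::steady_clock::now() + timeout;
  while (std::chrono::steady_clock::now() < deadline) {
    if (check()) return true;
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
  return check();  // one final attempt at the deadline
}
```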
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/1c5d91b5/src/kudu/util/process_memory.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/process_memory.cc b/src/kudu/util/process_memory.cc
index cac1e14..986e15a 100644
--- a/src/kudu/util/process_memory.cc
+++ b/src/kudu/util/process_memory.cc
@@ -19,10 +19,10 @@
 
 #include <gflags/gflags.h>
 #include <gperftools/malloc_extension.h>
-#include <gperftools/malloc_hook.h>
 
 #include "kudu/gutil/once.h"
 #include "kudu/gutil/strings/substitute.h"
+#include "kudu/gutil/walltime.h"
 #include "kudu/util/debug/trace_event.h"
 #include "kudu/util/env.h"
 #include "kudu/util/flag_tags.h"
@@ -86,10 +86,6 @@ Atomic64 g_released_memory_since_gc;
 // TODO(todd): this is a stopgap.
 const int64_t GC_RELEASE_SIZE = 128 * 1024L * 1024L;
 
-// The total memory consumption of the process, incrementally tracked by
-// New/Delete hooks.
-LongAdder* g_total_consumption = nullptr;
-
 #endif // TCMALLOC_ENABLED
 
 } // anonymous namespace
@@ -188,39 +184,27 @@ void InitLimits() {
   GoogleOnceInit(&once, &DoInitLimits);
 }
 
-#ifdef TCMALLOC_ENABLED
-
-void NewHook(const void* ptr, size_t size) {
-  // We ignore 'size' because it's possible the allocated size is actually
-  // rounded up to the next biggest slab class.
-  ssize_t alloc_size = MallocExtension::instance()->GetAllocatedSize(ptr);
-  g_total_consumption->IncrementBy(alloc_size);
-}
-void DeleteHook(const void* ptr) {
-  ssize_t size = MallocExtension::instance()->GetAllocatedSize(ptr);
-  g_total_consumption->IncrementBy(-size);
-}
-
-void InitConsumptionHook() {
-  CHECK(!g_total_consumption);
-  g_total_consumption = new LongAdder();
-  g_total_consumption->IncrementBy(GetTCMallocCurrentAllocatedBytes());
-  MallocHook::AddNewHook(&NewHook);
-  MallocHook::AddDeleteHook(&DeleteHook);
-}
-#endif // TCMALLOC_ENABLED
-
 } // anonymous namespace
 
 int64_t CurrentConsumption() {
 #ifdef TCMALLOC_ENABLED
-  // Calling GetTCMallocCurrentAllocatedBytes() is too slow to do frequently, since it has to
-  // take the tcmalloc global lock and traverse all live threads to add up their caches.
-  // Instead, we install a new/delete hook to incrementally track allocated memory, and
-  // then just return the currently tracked value here.
-  static GoogleOnceType once;
-  GoogleOnceInit(&once, &InitConsumptionHook);
-  return g_total_consumption->Value();
+  const int64_t kReadIntervalMicros = 50000;
+  static Atomic64 last_read_time = 0;
+  static simple_spinlock read_lock;
+  static Atomic64 consumption = 0;
+  uint64_t time = GetMonoTimeMicros();
+  if (time > last_read_time + kReadIntervalMicros && read_lock.try_lock()) {
+    base::subtle::NoBarrier_Store(&consumption, GetTCMallocCurrentAllocatedBytes());
+    // Re-fetch the time after getting the consumption. This way, in case fetching
+    // consumption is extremely slow for some reason (eg due to lots of contention
+    // in tcmalloc) we at least ensure that we wait at least another full interval
+    // before fetching the information again.
+    time = GetMonoTimeMicros();
+    base::subtle::NoBarrier_Store(&last_read_time, time);
+    read_lock.unlock();
+  }
+
+  return base::subtle::NoBarrier_Load(&consumption);
 #else
   // Without tcmalloc, we have no reliable way of determining our own heap
   // size (e.g. mallinfo doesn't work in ASAN builds). So, we'll fall back


[5/5] kudu git commit: Fix broken url to "Building a Better Bloom Filter"

Posted by da...@apache.org.
Fix broken url to "Building a Better Bloom Filter"

Change-Id: Ic113677c7ff4c2080c0ac07cb8f90db77aaeaa52
Reviewed-on: http://gerrit.cloudera.org:8080/6920
Reviewed-by: Dan Burkert <da...@apache.org>
Tested-by: Dan Burkert <da...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/4347174d
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/4347174d
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/4347174d

Branch: refs/heads/master
Commit: 4347174d7e78f27a69c28ef1c8e551649d2173c5
Parents: 36debad
Author: André Laszlo <an...@meltwater.com>
Authored: Thu May 18 16:38:10 2017 +0200
Committer: Dan Burkert <da...@apache.org>
Committed: Thu May 18 15:57:10 2017 +0000

----------------------------------------------------------------------
 src/kudu/util/bloom_filter.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/4347174d/src/kudu/util/bloom_filter.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/bloom_filter.h b/src/kudu/util/bloom_filter.h
index cbf5250..419de3b 100644
--- a/src/kudu/util/bloom_filter.h
+++ b/src/kudu/util/bloom_filter.h
@@ -33,7 +33,7 @@ namespace kudu {
 // This is implemented based on the idea of double-hashing from the following paper:
 //   "Less Hashing, Same Performance: Building a Better Bloom Filter"
 //   Kirsch and Mitzenmacher, ESA 2006
-//   http://www.eecs.harvard.edu/~kirsch/pubs/bbbf/esa06.pdf
+//   https://www.eecs.harvard.edu/~michaelm/postscripts/tr-02-05.pdf
 //
 // Currently, the implementation uses the 64-bit City Hash.
 // TODO: an SSE CRC32 hash is probably ~20% faster. Come back to this


[4/5] kudu git commit: [c++ client] less logging about authn token

Posted by da...@apache.org.
[c++ client] less logging about authn token

LOG(INFO) --> VLOG(2) on receiving and adopting authn token by
the Kudu C++ client.

Change-Id: I13693bc89e528eb4dfdd0e85d6d5b8faac8edadf
Reviewed-on: http://gerrit.cloudera.org:8080/6914
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/36debad5
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/36debad5
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/36debad5

Branch: refs/heads/master
Commit: 36debad5fcbd6be3ab82c33e6cda1f7f349c643f
Parents: 352c0e3
Author: Alexey Serbin <as...@cloudera.com>
Authored: Wed May 17 16:37:00 2017 -0700
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu May 18 05:16:39 2017 +0000

----------------------------------------------------------------------
 src/kudu/client/client-internal.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/36debad5/src/kudu/client/client-internal.cc
----------------------------------------------------------------------
diff --git a/src/kudu/client/client-internal.cc b/src/kudu/client/client-internal.cc
index 30723d2..6a61adc 100644
--- a/src/kudu/client/client-internal.cc
+++ b/src/kudu/client/client-internal.cc
@@ -654,7 +654,7 @@ void KuduClient::Data::ConnectedToClusterCb(
   // Adopt the authentication token from the response, if it's been set.
   if (connect_response.has_authn_token()) {
     messenger_->set_authn_token(connect_response.authn_token());
-    LOG(INFO) << "Received and adopted authn token";
+    VLOG(2) << "Received and adopted authn token";
   }
 
   vector<StatusCallback> cbs;


[2/5] kudu git commit: KUDU-2017. Fix calculation of perf improvement for flushes

Posted by da...@apache.org.
KUDU-2017. Fix calculation of perf improvement for flushes

The code to calculate the MM "perf improvement score" for a flush had two bugs:

  (1) we calculated threshold - current_usage instead of current_usage -
  threshold, which resulted in a negative score.

  (2) we had an unsigned integer overflow, which resulted in the above negative
  score becoming insanely large.

These two wrongs "made a right" in that we'd still trigger flushes at the
flush threshold, which is why this went unnoticed for quite some time. However,
the flushing behavior is more aggressive than originally intended, and we would
lose the correct prioritization of flushing tablets that are farther above the
threshold.

This patch addresses the issue.
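
Both bugs are easy to reproduce in isolation. A sketch with
hypothetical numbers (a 64 MB flush threshold and 128 MB anchored);
these helper functions are illustrative, not the actual Kudu code:

```cpp
#include <cstdint>

// Buggy form: threshold - usage, evaluated in unsigned 64-bit
// arithmetic. With usage above the threshold, 64 - 128 wraps around to
// roughly 1.8e19 instead of yielding -64.
uint64_t BuggyScore(uint64_t flush_threshold_mb,
                    uint64_t ram_anchored_bytes) {
  return flush_threshold_mb - ram_anchored_bytes / (1024 * 1024);
}

// Fixed form: usage - threshold, evaluated in floating point, mirroring
// the patched SetPerfImprovementForFlush() below.
double FixedScore(uint64_t flush_threshold_mb,
                  uint64_t ram_anchored_bytes) {
  double anchored_mb =
      static_cast<double>(ram_anchored_bytes) / (1024 * 1024);
  return anchored_mb - static_cast<double>(flush_threshold_mb);
}
```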

I tested using tpch_real_world against a server configured with a 40G
memory limit. I verified the 'perf improvement' just prior to a flush
was the expected value (151 perf improvement listed for a tablet with
1.15G MRS and 1G flush threshold).

Change-Id: I8bcf443db857ce172b8e83758eb73a24c4b0c6af
Reviewed-on: http://gerrit.cloudera.org:8080/6908
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/e1104e4b
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/e1104e4b
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/e1104e4b

Branch: refs/heads/master
Commit: e1104e4b8e64459d6a90c24d41ccc764d377c85c
Parents: 1c5d91b
Author: Todd Lipcon <to...@cloudera.com>
Authored: Tue May 16 17:08:20 2017 -0700
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu May 18 05:04:14 2017 +0000

----------------------------------------------------------------------
 src/kudu/tablet/tablet_replica-test.cc   | 4 ++--
 src/kudu/tablet/tablet_replica_mm_ops.cc | 9 +++++----
 2 files changed, 7 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/e1104e4b/src/kudu/tablet/tablet_replica-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/tablet_replica-test.cc b/src/kudu/tablet/tablet_replica-test.cc
index c48da6a..1a02b61 100644
--- a/src/kudu/tablet/tablet_replica-test.cc
+++ b/src/kudu/tablet/tablet_replica-test.cc
@@ -547,10 +547,10 @@ TEST_F(TabletReplicaTest, TestFlushOpsPerfImprovements) {
   ASSERT_GT(stats.perf_improvement(), 0.01);
   stats.Clear();
 
-  // Way over the threshold, number is much higher than 1.
+  // Over the threshold, we expect improvement equal to the excess MB.
   stats.set_ram_anchored(128 * 1024 * 1024);
   FlushOpPerfImprovementPolicy::SetPerfImprovementForFlush(&stats, 1);
-  ASSERT_LT(1.0, stats.perf_improvement());
+  ASSERT_NEAR(stats.perf_improvement(), 64, 0.01);
   stats.Clear();
 
   // Below the threshold but have been there a long time, closing in to 1.0.

http://git-wip-us.apache.org/repos/asf/kudu/blob/e1104e4b/src/kudu/tablet/tablet_replica_mm_ops.cc
----------------------------------------------------------------------
diff --git a/src/kudu/tablet/tablet_replica_mm_ops.cc b/src/kudu/tablet/tablet_replica_mm_ops.cc
index 85650ff..f2d16e2 100644
--- a/src/kudu/tablet/tablet_replica_mm_ops.cc
+++ b/src/kudu/tablet/tablet_replica_mm_ops.cc
@@ -65,15 +65,16 @@ const double kFlushUpperBoundMs = 60 * 60 * 1000;
 
 void FlushOpPerfImprovementPolicy::SetPerfImprovementForFlush(MaintenanceOpStats* stats,
                                                               double elapsed_ms) {
-  if (stats->ram_anchored() > FLAGS_flush_threshold_mb * 1024 * 1024) {
+  double anchored_mb = static_cast<double>(stats->ram_anchored()) / (1024 * 1024);
+  if (anchored_mb > FLAGS_flush_threshold_mb) {
     // If we're over the user-specified flush threshold, then consider the perf
     // improvement to be 1 for every extra MB.  This produces perf_improvement results
-    // which are much higher than any compaction would produce, and means that, when
+    // which are much higher than most compactions would produce, and means that, when
     // there is an MRS over threshold, a flush will almost always be selected instead of
     // a compaction.  That's not necessarily a good thing, but in the absence of better
     // heuristics, it will do for now.
-    double extra_mb =
-        static_cast<double>(FLAGS_flush_threshold_mb - (stats->ram_anchored()) / (1024 * 1024));
+    double extra_mb = anchored_mb - static_cast<double>(FLAGS_flush_threshold_mb);
+    DCHECK_GE(extra_mb, 0);
     stats->set_perf_improvement(extra_mb);
   } else if (elapsed_ms > FLAGS_flush_threshold_secs * 1000) {
     // Even if we aren't over the threshold, consider flushing if we haven't flushed


[3/5] kudu git commit: KUDU-1998. Re-enable tcmalloc on OSX

Posted by da...@apache.org.
KUDU-1998. Re-enable tcmalloc on OSX

We switched back away from using malloc hooks for memory tracking, so
this shouldn't be an issue anymore.

Change-Id: I9bb3329aa4c12595dedf27d022558671e90206fb
Reviewed-on: http://gerrit.cloudera.org:8080/6917
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/352c0e3b
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/352c0e3b
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/352c0e3b

Branch: refs/heads/master
Commit: 352c0e3bdaf36e44f5de6f5be2b9fbe9def1d6b2
Parents: e1104e4
Author: Todd Lipcon <to...@apache.org>
Authored: Wed May 17 21:28:32 2017 -0700
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu May 18 05:04:19 2017 +0000

----------------------------------------------------------------------
 CMakeLists.txt | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/352c0e3b/CMakeLists.txt
----------------------------------------------------------------------
diff --git a/CMakeLists.txt b/CMakeLists.txt
index c0bca83..c103802 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -889,10 +889,8 @@ ADD_THIRDPARTY_LIB(krb5
 ##
 ## Disabled with TSAN/ASAN as well as with gold+dynamic linking (see comment
 ## near definition of KUDU_USING_GOLD).
-## As a workaround for KUDU-1998, tcmalloc is disabled in OS X builds.
 find_package(GPerf REQUIRED)
-if (NOT APPLE AND
-    NOT "${KUDU_USE_ASAN}" AND
+if (NOT "${KUDU_USE_ASAN}" AND
     NOT "${KUDU_USE_TSAN}" AND
     NOT ("${KUDU_USING_GOLD}" AND "${KUDU_LINK}" STREQUAL "d"))
   ADD_THIRDPARTY_LIB(tcmalloc