You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@kudu.apache.org by to...@apache.org on 2016/10/05 21:29:42 UTC

[1/4] kudu git commit: KUDU-1674: Fix SubProcess:Call SEGV when trying to capture stderr alone

Repository: kudu
Updated Branches:
  refs/heads/master 3830fc972 -> 64f9ab34f


KUDU-1674: Fix SubProcess:Call SEGV when trying to capture stderr alone

ReadFdsFully() captures either stdout or stderr or both depending on how
caller has registered. Once the helpers are done with capturing, it needs
take into account that there can be less than 2 fds registered from the
caller and hence can not do std::move(vector[1]) operation on it.

Also added a test code to exercise combination of:
SubProcess::Call({argv}, nullptr, &stderr);
SubProcess::Call({argv}, &stdout, nullptr);
SubProcess::Call({argv}, nullptr, nullptr);

Change-Id: I67bd462098526a9ba032669b621b8139b56ee5b8
Reviewed-on: http://gerrit.cloudera.org:8080/4594
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/0f99d403
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/0f99d403
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/0f99d403

Branch: refs/heads/master
Commit: 0f99d403da61a5304d1e2b01516d949ecd311334
Parents: 3830fc9
Author: Dinesh Bhat <di...@cloudera.com>
Authored: Mon Oct 3 00:34:36 2016 -0700
Committer: Alexey Serbin <as...@cloudera.com>
Committed: Wed Oct 5 19:22:58 2016 +0000

----------------------------------------------------------------------
 src/kudu/util/subprocess-test.cc | 20 ++++++++++++++++++++
 src/kudu/util/subprocess.cc      | 12 ++++++++----
 2 files changed, 28 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/0f99d403/src/kudu/util/subprocess-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/subprocess-test.cc b/src/kudu/util/subprocess-test.cc
index 31b47fa..a1da506 100644
--- a/src/kudu/util/subprocess-test.cc
+++ b/src/kudu/util/subprocess-test.cc
@@ -24,6 +24,7 @@
 
 #include <gtest/gtest.h>
 
+#include "kudu/gutil/strings/substitute.h"
 #include "kudu/util/test_util.h"
 
 using std::string;
@@ -127,4 +128,23 @@ TEST_F(SubprocessTest, TestReadFromStdoutAndStderr) {
   }, &stdout, &stderr));
 }
 
+// Test KUDU-1674: '/bin/bash -c "echo"' command below is expected to
+// capture a string on stderr. This test validates that passing
+// stderr alone doesn't result in SIGSEGV as reported in the bug and
+// also check for sanity of stderr in the output.
+TEST_F(SubprocessTest, TestReadSingleFD) {
+  string stderr;
+  const string str = "ApacheKudu";
+  const string cmd_str = strings::Substitute("/bin/echo -n $0 1>&2", str);
+  ASSERT_OK(Subprocess::Call({"/bin/sh", "-c", cmd_str}, nullptr, &stderr));
+  ASSERT_EQ(stderr, str);
+
+  // Also sanity check other combinations.
+  string stdout;
+  ASSERT_OK(Subprocess::Call({"/bin/ls", "/dev/null"}, &stdout, nullptr));
+  ASSERT_STR_CONTAINS(stdout, "/dev/null");
+
+  ASSERT_OK(Subprocess::Call({"/bin/ls", "/dev/zero"}, nullptr, nullptr));
+}
+
 } // namespace kudu

http://git-wip-us.apache.org/repos/asf/kudu/blob/0f99d403/src/kudu/util/subprocess.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/subprocess.cc b/src/kudu/util/subprocess.cc
index fb296d4..db8315f 100644
--- a/src/kudu/util/subprocess.cc
+++ b/src/kudu/util/subprocess.cc
@@ -493,14 +493,18 @@ Status Subprocess::Call(const vector<string>& argv,
   if (stderr_out) {
     fds.push_back(p.from_child_stderr_fd());
   }
-  vector<string> out;
-  RETURN_NOT_OK(ReadFdsFully(argv[0], fds, &out));
+  vector<string> outv;
+  RETURN_NOT_OK(ReadFdsFully(argv[0], fds, &outv));
 
+  // Given that ReadFdsFully captures the strings in the order in which we
+  // had installed 'fds' above, it can be assured that we can receive
+  // as many strings as there were 'fds' in the vector and in that order.
+  CHECK_EQ(outv.size(), fds.size());
   if (stdout_out) {
-    *stdout_out = std::move(out[0]);
+    *stdout_out = std::move(outv.front());
   }
   if (stderr_out) {
-    *stderr_out = std::move(out[1]);
+    *stderr_out = std::move(outv.back());
   }
 
   int retcode;

[4/4] kudu git commit: mem_tracker: fix race between FindTracker() and destructor

Posted by to...@apache.org.

mem_tracker: fix race between FindTracker() and destructor

In very rare cases, it's possible for an interleaving between these two
functions to lead to a recursive lock acquisition of child_trackers_lock_
in the destructor.

For example:
1. The tracker hierarchy contains one parent (P) and one child (C1).
2. Thread 1 creates a second child (C2) parented to P. It has the sole ref
   to C2.
3. Thread 2 calls FindTracker() looking for C1.
4. Thread 2 runs as far as the loop in FindTrackerUnlocked(), getting
   descheduled just as it has locked a ref to C2. It also holds P's
   child_trackers_lock_.
5. Thread 1 is rescheduled and drops its ref to C2.
6. Thread 2 is rescheduled. It also drops its ref to C2, which was the last
   ref, so it runs C2's destructor. This acquires P's child_trackers_lock_,
   which it already owns. Boom.

The fix is simple: when looking up a tracker, make a local copy of the list
of children while holding child_trackers_lock_, then iterate without it.
It gets a little complicated due to the unusual requirements of
FindOrCreateTracker(); I ended up introducing a special static lock to
handle that case. The rest of the changes are basically code motion.

Unfortunately, the new test does not fail reliably without the fix. In my
testing, it failed maybe once every few hundred runs. But it doesn't fail at
all with the fix in place.

Change-Id: I69610f782c7a00d161bfb48d2487c29c0309c985
Reviewed-on: http://gerrit.cloudera.org:8080/4614
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/64f9ab34
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/64f9ab34
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/64f9ab34

Branch: refs/heads/master
Commit: 64f9ab34ffbc249cf72fc187308a1402888ba994
Parents: 0558aaa
Author: Adar Dembo <ad...@cloudera.com>
Authored: Mon Oct 3 17:21:54 2016 -0700
Committer: Todd Lipcon <to...@apache.org>
Committed: Wed Oct 5 21:29:23 2016 +0000

----------------------------------------------------------------------
 src/kudu/consensus/log_cache.cc   |  4 +--
 src/kudu/util/cache.cc            |  2 +-
 src/kudu/util/mem_tracker-test.cc | 36 +++++++++++++++++++--
 src/kudu/util/mem_tracker.cc      | 58 +++++++++++++++++++---------------
 src/kudu/util/mem_tracker.h       | 34 ++++++--------------
 5 files changed, 78 insertions(+), 56 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/64f9ab34/src/kudu/consensus/log_cache.cc
----------------------------------------------------------------------
diff --git a/src/kudu/consensus/log_cache.cc b/src/kudu/consensus/log_cache.cc
index dcb4ba2..8ac6578 100644
--- a/src/kudu/consensus/log_cache.cc
+++ b/src/kudu/consensus/log_cache.cc
@@ -85,8 +85,8 @@ LogCache::LogCache(const scoped_refptr<MetricEntity>& metric_entity,
 
   // Set up (or reuse) a tracker with the global limit. It is parented directly
   // to the root tracker so that it's always global.
-  parent_tracker_ = MemTracker::FindOrCreateTracker(global_max_ops_size_bytes,
-                                                    kParentMemTrackerId);
+  parent_tracker_ = MemTracker::FindOrCreateGlobalTracker(global_max_ops_size_bytes,
+                                                          kParentMemTrackerId);
 
   // And create a child tracker with the per-tablet limit.
   tracker_ = MemTracker::CreateTracker(

http://git-wip-us.apache.org/repos/asf/kudu/blob/64f9ab34/src/kudu/util/cache.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/cache.cc b/src/kudu/util/cache.cc
index 1ef16a0..9d46121 100644
--- a/src/kudu/util/cache.cc
+++ b/src/kudu/util/cache.cc
@@ -407,7 +407,7 @@ class ShardedLRUCache : public Cache {
     // A cache is often a singleton, so:
     // 1. We reuse its MemTracker if one already exists, and
     // 2. It is directly parented to the root MemTracker.
-    mem_tracker_ = MemTracker::FindOrCreateTracker(
+    mem_tracker_ = MemTracker::FindOrCreateGlobalTracker(
         -1, strings::Substitute("$0-sharded_lru_cache", id));
 
     int num_shards = 1 << shard_bits_;

http://git-wip-us.apache.org/repos/asf/kudu/blob/64f9ab34/src/kudu/util/mem_tracker-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/mem_tracker-test.cc b/src/kudu/util/mem_tracker-test.cc
index 22a0bcd..11bf2a2 100644
--- a/src/kudu/util/mem_tracker-test.cc
+++ b/src/kudu/util/mem_tracker-test.cc
@@ -27,6 +27,7 @@
 #include <boost/bind.hpp>
 #include <gperftools/malloc_extension.h>
 
+#include "kudu/gutil/strings/substitute.h"
 #include "kudu/util/test_util.h"
 
 DECLARE_int32(memory_limit_soft_percentage);
@@ -40,6 +41,7 @@ using std::shared_ptr;
 using std::string;
 using std::unordered_map;
 using std::vector;
+using strings::Substitute;
 
 TEST(MemTrackerTest, SingleTrackerNoLimit) {
   shared_ptr<MemTracker> t = MemTracker::CreateTracker(-1, "t");
@@ -218,7 +220,7 @@ TEST(MemTrackerTest, FindFunctionsTakeOwnership) {
 
   {
     shared_ptr<MemTracker> m = MemTracker::CreateTracker(-1, "test");
-    ref = MemTracker::FindOrCreateTracker(-1, m->id());
+    ref = MemTracker::FindOrCreateGlobalTracker(-1, m->id());
   }
   LOG(INFO) << ref->ToString();
   ref.reset();
@@ -337,7 +339,7 @@ TEST(MemTrackerTest, CollisionDetection) {
     MemTracker::FindTracker("parent", &found);
   }, kDeathMsg);
   EXPECT_DEATH({
-    MemTracker::FindOrCreateTracker(-1, "parent");
+    MemTracker::FindOrCreateGlobalTracker(-1, "parent");
   }, kDeathMsg);
 #endif
 }
@@ -348,7 +350,8 @@ TEST(MemTrackerTest, TestMultiThreadedRegisterAndDestroy) {
   for (int i = 0; i < 10; i++) {
     threads.emplace_back([&done]{
         while (!done.load()) {
-          shared_ptr<MemTracker> t = MemTracker::FindOrCreateTracker(1000, "foo");
+          shared_ptr<MemTracker> t = MemTracker::FindOrCreateGlobalTracker(
+              1000, "foo");
         }
       });
   }
@@ -360,4 +363,31 @@ TEST(MemTrackerTest, TestMultiThreadedRegisterAndDestroy) {
   }
 }
 
+TEST(MemTrackerTest, TestMultiThreadedCreateFind) {
+  shared_ptr<MemTracker> p = MemTracker::CreateTracker(-1, "p");
+  shared_ptr<MemTracker> c1 = MemTracker::CreateTracker(-1, "c1", p);
+  std::atomic<bool> done(false);
+  vector<std::thread> threads;
+  threads.emplace_back([&]{
+    while (!done.load()) {
+      shared_ptr<MemTracker> c1_copy;
+      CHECK(MemTracker::FindTracker(c1->id(), &c1_copy, p));
+    }
+  });
+  for (int i = 0; i < 5; i++) {
+    threads.emplace_back([&, i]{
+      while (!done.load()) {
+        shared_ptr<MemTracker> c2 =
+            MemTracker::CreateTracker(-1, Substitute("ci-$0", i), p);
+      }
+    });
+  }
+
+  SleepFor(MonoDelta::FromMilliseconds(500));
+  done.store(true);
+  for (auto& t : threads) {
+    t.join();
+  }
+}
+
 } // namespace kudu

http://git-wip-us.apache.org/repos/asf/kudu/blob/64f9ab34/src/kudu/util/mem_tracker.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/mem_tracker.cc b/src/kudu/util/mem_tracker.cc
index 51ee211..b247cb4 100644
--- a/src/kudu/util/mem_tracker.cc
+++ b/src/kudu/util/mem_tracker.cc
@@ -148,16 +148,9 @@ shared_ptr<MemTracker> MemTracker::CreateTracker(int64_t byte_limit,
                                                  const string& id,
                                                  const shared_ptr<MemTracker>& parent) {
   shared_ptr<MemTracker> real_parent = parent ? parent : GetRootTracker();
-  MutexLock l(real_parent->child_trackers_lock_);
-  return CreateTrackerUnlocked(byte_limit, id, real_parent);
-}
-
-shared_ptr<MemTracker> MemTracker::CreateTrackerUnlocked(int64_t byte_limit,
-                                                         const string& id,
-                                                         const shared_ptr<MemTracker>& parent) {
-  DCHECK(parent);
-  shared_ptr<MemTracker> tracker(new MemTracker(ConsumptionFunction(), byte_limit, id, parent));
-  parent->AddChildTrackerUnlocked(tracker);
+  shared_ptr<MemTracker> tracker(
+      new MemTracker(ConsumptionFunction(), byte_limit, id, real_parent));
+  real_parent->AddChildTracker(tracker);
   tracker->Init();
 
   return tracker;
@@ -213,18 +206,27 @@ string MemTracker::ToString() const {
 bool MemTracker::FindTracker(const string& id,
                              shared_ptr<MemTracker>* tracker,
                              const shared_ptr<MemTracker>& parent) {
-  shared_ptr<MemTracker> real_parent = parent ? parent : GetRootTracker();
-  MutexLock l(real_parent->child_trackers_lock_);
-  return FindTrackerUnlocked(id, tracker, real_parent);
+  return FindTrackerInternal(id, tracker, parent ? parent : GetRootTracker());
 }
 
-bool MemTracker::FindTrackerUnlocked(const string& id,
+bool MemTracker::FindTrackerInternal(const string& id,
                                      shared_ptr<MemTracker>* tracker,
                                      const shared_ptr<MemTracker>& parent) {
   DCHECK(parent != NULL);
-  parent->child_trackers_lock_.AssertAcquired();
+
+  list<weak_ptr<MemTracker>> children;
+  {
+    MutexLock l(parent->child_trackers_lock_);
+    children = parent->child_trackers_;
+  }
+
+  // Search for the matching child without holding the parent's lock.
+  //
+  // If the lock were held while searching, it'd be possible for 'child' to be
+  // the last live ref to a tracker, which would lead to a recursive
+  // acquisition of the parent lock during the 'child' destructor call.
   vector<shared_ptr<MemTracker>> found;
-  for (const auto& child_weak : parent->child_trackers_) {
+  for (const auto& child_weak : children) {
     shared_ptr<MemTracker> child = child_weak.lock();
     if (child && child->id() == id) {
       found.emplace_back(std::move(child));
@@ -243,16 +245,22 @@ bool MemTracker::FindTrackerUnlocked(const string& id,
   return false;
 }
 
-shared_ptr<MemTracker> MemTracker::FindOrCreateTracker(int64_t byte_limit,
-                                                       const string& id,
-                                                       const shared_ptr<MemTracker>& parent) {
-  shared_ptr<MemTracker> real_parent = parent ? parent : GetRootTracker();
-  MutexLock l(real_parent->child_trackers_lock_);
+shared_ptr<MemTracker> MemTracker::FindOrCreateGlobalTracker(
+    int64_t byte_limit,
+    const string& id) {
+  // The calls below comprise a critical section, but we can't use the root
+  // tracker's child_trackers_lock_ to synchronize it as the lock must be
+  // released during FindTrackerInternal(). Since this function creates
+  // globally-visible MemTrackers which are the exception rather than the rule,
+  // it's reasonable to synchronize their creation on a singleton lock.
+  static Mutex find_or_create_lock;
+  MutexLock l(find_or_create_lock);
+
   shared_ptr<MemTracker> found;
-  if (FindTrackerUnlocked(id, &found, real_parent)) {
+  if (FindTrackerInternal(id, &found, GetRootTracker())) {
     return found;
   }
-  return CreateTrackerUnlocked(byte_limit, id, real_parent);
+  return CreateTracker(byte_limit, id, GetRootTracker());
 }
 
 void MemTracker::ListTrackers(vector<shared_ptr<MemTracker>>* trackers) {
@@ -555,8 +563,8 @@ void MemTracker::Init() {
   DCHECK_EQ(all_trackers_[0], this);
 }
 
-void MemTracker::AddChildTrackerUnlocked(const shared_ptr<MemTracker>& tracker) {
-  child_trackers_lock_.AssertAcquired();
+void MemTracker::AddChildTracker(const shared_ptr<MemTracker>& tracker) {
+  MutexLock l(child_trackers_lock_);
   tracker->child_tracker_it_ = child_trackers_.insert(child_trackers_.end(), tracker);
 }
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/64f9ab34/src/kudu/util/mem_tracker.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/mem_tracker.h b/src/kudu/util/mem_tracker.h
index ae6de74..cd00e22 100644
--- a/src/kudu/util/mem_tracker.h
+++ b/src/kudu/util/mem_tracker.h
@@ -118,18 +118,14 @@ class MemTracker : public std::enable_shared_from_this<MemTracker> {
       std::shared_ptr<MemTracker>* tracker,
       const std::shared_ptr<MemTracker>& parent = std::shared_ptr<MemTracker>());
 
-  // If a tracker with the specified 'id' and 'parent' exists in the tree,
-  // returns a shared_ptr to that instance. Otherwise, creates a new
-  // MemTracker with the specified byte_limit, id, and parent.
-  //
-  // Use the two argument form if there is no parent.
+  // If a global tracker with the specified 'id' exists in the tree, returns a
+  // shared_ptr to that instance. Otherwise, creates a new MemTracker with the
+  // specified byte_limit and id, parented to the root MemTracker.
   //
   // Note: this function will enforce that 'id' is unique amongst the children
-  // of 'parent'.
-  static std::shared_ptr<MemTracker> FindOrCreateTracker(
-      int64_t byte_limit,
-      const std::string& id,
-      const std::shared_ptr<MemTracker>& parent = std::shared_ptr<MemTracker>());
+  // of the root MemTracker.
+  static std::shared_ptr<MemTracker> FindOrCreateGlobalTracker(
+      int64_t byte_limit, const std::string& id);
 
   // Returns a list of all the valid trackers.
   static void ListTrackers(std::vector<std::shared_ptr<MemTracker> >* trackers);
@@ -258,9 +254,7 @@ class MemTracker : public std::enable_shared_from_this<MemTracker> {
   void Init();
 
   // Adds tracker to child_trackers_.
-  //
-  // child_trackers_lock_ must be held.
-  void AddChildTrackerUnlocked(const std::shared_ptr<MemTracker>& tracker);
+  void AddChildTracker(const std::shared_ptr<MemTracker>& tracker);
 
   // Logs the stack of the current consume/release. Used for debugging only.
   void LogUpdate(bool is_consume, int64_t bytes) const;
@@ -268,18 +262,8 @@ class MemTracker : public std::enable_shared_from_this<MemTracker> {
   static std::string LogUsage(const std::string& prefix,
                               const std::list<std::weak_ptr<MemTracker>>& trackers);
 
-  // Variant of CreateTracker() that:
-  // 1. Must be called with a non-NULL parent, and
-  // 2. Must be called with parent->child_trackers_lock_ held.
-  static std::shared_ptr<MemTracker> CreateTrackerUnlocked(
-      int64_t byte_limit,
-      const std::string& id,
-      const std::shared_ptr<MemTracker>& parent);
-
-  // Variant of FindTracker() that:
-  // 1. Must be called with a non-NULL parent, and
-  // 2. Must be called with parent->child_trackers_lock_ held.
-  static bool FindTrackerUnlocked(
+  // Variant of FindTracker() that must be called with a non-NULL parent.
+  static bool FindTrackerInternal(
       const std::string& id,
       std::shared_ptr<MemTracker>* tracker,
       const std::shared_ptr<MemTracker>& parent);

[3/4] kudu git commit: kernel_stack_watchdog: avoid blocking threads starting

Posted by to...@apache.org.

kernel_stack_watchdog: avoid blocking threads starting

I've noticed recently that threads start particularly slowly in TSAN.
One culprit which seems to exacerbate this issue is the following:

- TSAN defers signal-handling in many cases, which causes the stack
  watchdog to be slow at collecting stacks.
- The stack watchdog was holding a lock while collecting stacks from
  stuck threads.
- This lock blocked other threads from starting, since every new thread
  needs to register itself with the watchdog.

The fix here is to make the synchronization more fine-grained: we only
hold this lock long enough to make a copy of the current map of
registered threads. However, it's still important to prevent these
threads from _exiting_ while we are looking at their TLS. So, this patch
adds a new 'unregister_lock_' which is used to prevent such exits.

Since 'lock_' is now held for only short periods of time, I switched it
out for a spinlock instead of a mutex.

Additionally, the lock protecting the log collector was also separated
out.

No new tests are included, but the watchdog is already covered and runs
as part of nearly every test.

Change-Id: I7af85ade6ec9050843ec5b146d26c2549c503d8f
Reviewed-on: http://gerrit.cloudera.org:8080/4626
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Todd Lipcon <to...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/0558aaa4
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/0558aaa4
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/0558aaa4

Branch: refs/heads/master
Commit: 0558aaa48ed9bbfe1e680b2be9f9f3f60f7dd6c0
Parents: 98f42cd
Author: Todd Lipcon <to...@apache.org>
Authored: Tue Oct 4 19:09:11 2016 -0700
Committer: Todd Lipcon <to...@apache.org>
Committed: Wed Oct 5 20:58:33 2016 +0000

----------------------------------------------------------------------
 src/kudu/util/kernel_stack_watchdog.cc | 77 +++++++++++++++++------------
 src/kudu/util/kernel_stack_watchdog.h  | 23 +++++++--
 2 files changed, 64 insertions(+), 36 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/0558aaa4/src/kudu/util/kernel_stack_watchdog.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/kernel_stack_watchdog.cc b/src/kudu/util/kernel_stack_watchdog.cc
index cbab345..2f7517a 100644
--- a/src/kudu/util/kernel_stack_watchdog.cc
+++ b/src/kudu/util/kernel_stack_watchdog.cc
@@ -34,6 +34,7 @@ DEFINE_int32(hung_task_check_interval_ms, 200,
              "Number of milliseconds in between checks for hung threads");
 TAG_FLAG(hung_task_check_interval_ms, hidden);
 
+using std::lock_guard;
 using strings::Substitute;
 
 namespace kudu {
@@ -61,7 +62,7 @@ KernelStackWatchdog::~KernelStackWatchdog() {
 }
 
 void KernelStackWatchdog::SaveLogsForTests(bool save_logs) {
-  MutexLock l(lock_);
+  lock_guard<simple_spinlock> l(log_lock_);
   if (save_logs) {
     log_collector_.reset(new vector<string>());
   } else {
@@ -70,20 +71,21 @@ void KernelStackWatchdog::SaveLogsForTests(bool save_logs) {
 }
 
 vector<string> KernelStackWatchdog::LoggedMessagesForTests() const {
-  MutexLock l(lock_);
+  lock_guard<simple_spinlock> l(log_lock_);
   CHECK(log_collector_) << "Must call SaveLogsForTests(true) first";
   return *log_collector_;
 }
 
 void KernelStackWatchdog::Register(TLS* tls) {
   int64_t tid = Thread::CurrentThreadId();
-  MutexLock l(lock_);
+  lock_guard<simple_spinlock> l(tls_lock_);
   InsertOrDie(&tls_by_tid_, tid, tls);
 }
 
-void KernelStackWatchdog::Unregister(TLS* tls) {
+void KernelStackWatchdog::Unregister() {
   int64_t tid = Thread::CurrentThreadId();
-  MutexLock l(lock_);
+  MutexLock l(unregister_lock_);
+  lock_guard<simple_spinlock> l2(tls_lock_);
   CHECK(tls_by_tid_.erase(tid));
 }
 
@@ -102,37 +104,48 @@ void KernelStackWatchdog::RunThread() {
       break;
     }
 
+    // Prevent threads from unregistering between the snapshot loop and the sending of
+    // signals. This makes it safe for us to access their TLS. We might delay the thread
+    // exit a bit, but it would be unusual for any code to block on a thread exit, whereas
+    // it's relatively important for threads to _start_ quickly.
+    MutexLock l(unregister_lock_);
+
+    // Take the snapshot of the thread information under a short lock.
+    //
+    // 'lock_' prevents new threads from starting, so we don't want to do any lengthy work
+    // (such as gathering stack traces) under this lock.
+    TLSMap tls_map_copy;
     {
-      MutexLock l(lock_);
-      MicrosecondsInt64 now = GetMonoTimeMicros();
-
-      for (const TLSMap::value_type& map_entry : tls_by_tid_) {
-        pid_t p = map_entry.first;
-        const TLS::Data* tls = &map_entry.second->data_;
-
-        TLS::Data tls_copy;
-        tls->SnapshotCopy(&tls_copy);
-
-        for (int i = 0; i < tls_copy.depth_; i++) {
-          TLS::Frame* frame = &tls_copy.frames_[i];
-
-          int paused_ms = (now - frame->start_time_) / 1000;
-          if (paused_ms > frame->threshold_ms_) {
-            string kernel_stack;
-            Status s = GetKernelStack(p, &kernel_stack);
-            if (!s.ok()) {
-              // Can't read the kernel stack of the pid -- it's possible that the thread exited
-              // while we were iterating, so just ignore it.
-              kernel_stack = "(could not read kernel stack)";
-            }
-
-            string user_stack = DumpThreadStack(p);
-            LOG_STRING(WARNING, log_collector_.get())
+      lock_guard<simple_spinlock> l(tls_lock_);
+      tls_map_copy = tls_by_tid_;
+    }
+
+    MicrosecondsInt64 now = GetMonoTimeMicros();
+    for (const auto& entry : tls_map_copy) {
+      pid_t p = entry.first;
+      TLS::Data* tls = &entry.second->data_;
+      TLS::Data tls_copy;
+      tls->SnapshotCopy(&tls_copy);
+      for (int i = 0; i < tls_copy.depth_; i++) {
+        const TLS::Frame* frame = &tls_copy.frames_[i];
+
+        int paused_ms = (now - frame->start_time_) / 1000;
+        if (paused_ms > frame->threshold_ms_) {
+          string kernel_stack;
+          Status s = GetKernelStack(p, &kernel_stack);
+          if (!s.ok()) {
+            // Can't read the kernel stack of the pid, just ignore it.
+            kernel_stack = "(could not read kernel stack)";
+          }
+
+          string user_stack = DumpThreadStack(p);
+
+          lock_guard<simple_spinlock> l(log_lock_);
+          LOG_STRING(WARNING, log_collector_.get())
               << "Thread " << p << " stuck at " << frame->status_
               << " for " << paused_ms << "ms" << ":\n"
               << "Kernel stack:\n" << kernel_stack << "\n"
               << "User stack:\n" << user_stack;
-          }
         }
       }
     }
@@ -150,7 +163,7 @@ KernelStackWatchdog::TLS::TLS() {
 }
 
 KernelStackWatchdog::TLS::~TLS() {
-  KernelStackWatchdog::GetInstance()->Unregister(this);
+  KernelStackWatchdog::GetInstance()->Unregister();
 }
 
 // Optimistic concurrency control approach to snapshot the value of another

http://git-wip-us.apache.org/repos/asf/kudu/blob/0558aaa4/src/kudu/util/kernel_stack_watchdog.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/kernel_stack_watchdog.h b/src/kudu/util/kernel_stack_watchdog.h
index 914479f..79b6087 100644
--- a/src/kudu/util/kernel_stack_watchdog.h
+++ b/src/kudu/util/kernel_stack_watchdog.h
@@ -63,8 +63,9 @@
 #include "kudu/gutil/singleton.h"
 #include "kudu/gutil/walltime.h"
 #include "kudu/util/countdown_latch.h"
-#include "kudu/util/mutex.h"
+#include "kudu/util/locks.h"
 #include "kudu/util/monotime.h"
+#include "kudu/util/mutex.h"
 #include "kudu/util/threadlocal.h"
 
 #define SCOPED_WATCH_STACK(threshold_ms) \
@@ -167,7 +168,7 @@ class KernelStackWatchdog {
   void Register(TLS* tls);
 
   // Called when a thread's TLS is destructed (i.e. when the thread exits).
-  void Unregister(TLS* tls);
+  void Unregister();
 
   // The actual watchdog loop that the watchdog thread runs.
   void RunThread();
@@ -181,8 +182,22 @@ class KernelStackWatchdog {
   // Used by tests.
   gscoped_ptr<std::vector<std::string> > log_collector_;
 
-  // Lock protecting tls_by_tid_ and log_collector_.
-  mutable Mutex lock_;
+  // Lock protecting log_collector_.
+  mutable simple_spinlock log_lock_;
+
+  // Lock protecting tls_by_tid_.
+  mutable simple_spinlock tls_lock_;
+
+  // Lock which prevents threads from unregistering while the watchdog
+  // sends signals.
+  //
+  // This is used to prevent the watchdog from sending a signal to a pid just
+  // after the pid has actually exited and been reused. Sending a signal to
+  // a non-Kudu thread could have unintended consequences.
+  //
+  // When this lock is held concurrently with 'tls_lock_' or 'log_lock_',
+  // this lock must be acquired first.
+  Mutex unregister_lock_;
 
   // The watchdog thread itself.
   scoped_refptr<Thread> thread_;

[2/4] kudu git commit: delete_table-test: fix flakiness with table creation timeout

Posted by to...@apache.org.

delete_table-test: fix flakiness with table creation timeout

This test was timing out frequently when trying to create a
replication-2 table on a cluster with 3 tservers, one of which was
recently shut down. The master could try to place a replica on the
non-running server, which would then take some time to time out and try
a new placement.

The workaround here is to restart the master so it no longer sees the
crashed server as a valid placement option.

Change-Id: Ic61ad384e1b247f83bfc709528c4c7bda586c9d2
Reviewed-on: http://gerrit.cloudera.org:8080/4632
Reviewed-by: David Ribeiro Alves <dr...@apache.org>
Reviewed-by: Dinesh Bhat <di...@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/98f42cdd
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/98f42cdd
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/98f42cdd

Branch: refs/heads/master
Commit: 98f42cdd878caa429377625a2288d22ed0d114f2
Parents: 0f99d40
Author: Todd Lipcon <to...@apache.org>
Authored: Wed Oct 5 10:52:29 2016 -0700
Committer: David Ribeiro Alves <dr...@apache.org>
Committed: Wed Oct 5 20:26:40 2016 +0000

----------------------------------------------------------------------
 src/kudu/integration-tests/delete_table-test.cc | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/98f42cdd/src/kudu/integration-tests/delete_table-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/integration-tests/delete_table-test.cc b/src/kudu/integration-tests/delete_table-test.cc
index 6a0de2f..d331d43 100644
--- a/src/kudu/integration-tests/delete_table-test.cc
+++ b/src/kudu/integration-tests/delete_table-test.cc
@@ -432,7 +432,7 @@ TEST_F(DeleteTableTest, TestAutoTombstoneAfterCrashDuringTabletCopy) {
   ASSERT_OK(cluster_->master()->Restart());
   ASSERT_OK(cluster_->WaitForTabletServerCount(1, MonoDelta::FromSeconds(30)));
 
-  // Set up a table which has a table only on TS 0. This will be used to test for
+  // Set up a table which has a tablet only on TS 0. This will be used to test for
   // "collateral damage" bugs where incorrect handling of the main test tablet
   // accidentally removes blocks from another tablet.
   // We use a sequential workload so that we just flush and don't compact.
@@ -467,7 +467,15 @@ TEST_F(DeleteTableTest, TestAutoTombstoneAfterCrashDuringTabletCopy) {
   ASSERT_OK(cluster_->tablet_server(2)->Restart());
   cluster_->tablet_server(kTsIndex)->Shutdown();
 
-  // Create a new tablet which is replicated on the other two servers.
+  // Restart the master to be sure that it only sees the live servers.
+  // Otherwise it may try to create a tablet with a replica on the down server.
+  // The table creation would eventually succeed after picking a different set of
+  // replicas, but not before causing a timeout.
+  cluster_->master()->Shutdown();
+  ASSERT_OK(cluster_->master()->Restart());
+  ASSERT_OK(cluster_->WaitForTabletServerCount(2, MonoDelta::FromSeconds(30)));
+
+  // Create a new table with a single tablet replicated on the other two servers.
   // We use the same sequential workload. This produces block ID sequences
   // that look like:
   //   TS 0: |---- blocks from 'other-table' ---]