You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@kudu.apache.org by to...@apache.org on 2016/02/18 03:15:57 UTC

[1/4] incubator-kudu git commit: Add design-docs directory

Repository: incubator-kudu
Updated Branches:
  refs/heads/master 7b1ea74ee -> 3e6c30337


Add design-docs directory

Change-Id: I9caf1a1a1c9868e9bdf1774920b9eda326b4efa5
Reviewed-on: http://gerrit.cloudera.org:8080/2201
Reviewed-by: Misty Stanley-Jones <mi...@apache.org>
Tested-by: Misty Stanley-Jones <mi...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/44400774
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/44400774
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/44400774

Branch: refs/heads/master
Commit: 444007746895122687fb9c8ea5274fb15974fe37
Parents: 7b1ea74
Author: Dan Burkert <da...@cloudera.com>
Authored: Wed Feb 17 10:14:25 2016 -0800
Committer: Misty Stanley-Jones <mi...@apache.org>
Committed: Wed Feb 17 18:25:30 2016 +0000

----------------------------------------------------------------------
 docs/design-docs/README.md | 10 ++++++++++
 1 file changed, 10 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/44400774/docs/design-docs/README.md
----------------------------------------------------------------------
diff --git a/docs/design-docs/README.md b/docs/design-docs/README.md
new file mode 100644
index 0000000..da0963b
--- /dev/null
+++ b/docs/design-docs/README.md
@@ -0,0 +1,10 @@
+# Design Docs
+
+This directory holds Kudu design documents. These documents are typically
+written from a point-of-time view, and do not necessarily represent the current
+state of the system. They are useful for learning why design decisions were
+made.
+
+
+| Document | Component(s) | Discussion |
+| -------- | ------------ | ---------- |


[4/4] incubator-kudu git commit: build: fix ignoring of clock sync errors in flaky test list

Posted by to...@apache.org.
build: fix ignoring of clock sync errors in flaky test list

Previously, I tried to add this same functionality, but used 'grep' instead of
'zgrep'. In the production test slaves, the test logs are gzipped, so we need
to 'zgrep'. It turns out that zgrep is also smart enough to grep non-compressed
files, so it's safe to use it regardless of the file extension.

Change-Id: Ic8645f6f046700a28d3306e0db1be9dbfb06a433
Reviewed-on: http://gerrit.cloudera.org:8080/2218
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/3e6c3033
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/3e6c3033
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/3e6c3033

Branch: refs/heads/master
Commit: 3e6c303375eb82e59d86bcd4c6bdf999336342ff
Parents: 9ca4127
Author: Todd Lipcon <to...@apache.org>
Authored: Wed Feb 17 17:32:39 2016 -0800
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu Feb 18 02:14:55 2016 +0000

----------------------------------------------------------------------
 build-support/report-test.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/3e6c3033/build-support/report-test.sh
----------------------------------------------------------------------
diff --git a/build-support/report-test.sh b/build-support/report-test.sh
index f5298dc..e375c90 100755
--- a/build-support/report-test.sh
+++ b/build-support/report-test.sh
@@ -72,7 +72,7 @@ fi
 
 # We sometimes have flaky infrastructure where NTP is broken. In that case
 # do not report it as a failed test.
-if grep -q 'Clock considered unsynchronized' $LOGFILE ; then
+if zgrep -q 'Clock considered unsynchronized' $LOGFILE ; then
   echo Not reporting test that failed due to NTP issues.
   exit 1
 fi


[2/4] incubator-kudu git commit: mem_tracker: fix a potential bad_weak_ptr

Posted by to...@apache.org.
mem_tracker: fix a potential bad_weak_ptr

This fixes the following race:

T1 is calling MemTracker::FindOrCreateTracker for a tracker that does
not yet exist:
- T1 registers the new MemTracker as a child in the root tracker's child list.
  This was a raw pointer, prior to this patch.
- T1 constructs a shared_ptr, incrementing the ref count to 1.
- T1 returns the shared_ptr to the user.
- The user drops the reference to the shared_ptr back to 0, such that it is
  being deleted. The destructor is called, and the thread is switched out
  before the raw pointer can be removed from the root tracker's child list.

T2 now runs, where a caller is trying to FindOrCreateTracker() with the
same identifier:
- T2 looks in the child tracker and sees a matching MemTracker instance.
- T2 tries to call tracker->shared_from_this(), but the refcount is 0,
  so it throws bad_weak_ptr.

The fix is to use proper std::weak_ptrs for the child list instead of
raw pointers. A new unit test reproduces the bug consistently.

Change-Id: I1e1c225d6670025a9b4729013747652a30dfe608
Reviewed-on: http://gerrit.cloudera.org:8080/2212
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/e284a93c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/e284a93c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/e284a93c

Branch: refs/heads/master
Commit: e284a93c40740ad8bd3b5b95ee31e98c24aa4a6f
Parents: 4440077
Author: Todd Lipcon <to...@apache.org>
Authored: Wed Feb 17 16:13:34 2016 -0800
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu Feb 18 02:12:04 2016 +0000

----------------------------------------------------------------------
 src/kudu/util/mem_tracker-test.cc | 20 ++++++++++++++++++++
 src/kudu/util/mem_tracker.cc      | 28 ++++++++++++++++++----------
 src/kudu/util/mem_tracker.h       |  8 ++++----
 3 files changed, 42 insertions(+), 14 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/e284a93c/src/kudu/util/mem_tracker-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/mem_tracker-test.cc b/src/kudu/util/mem_tracker-test.cc
index 2a10bed..4c1b244 100644
--- a/src/kudu/util/mem_tracker-test.cc
+++ b/src/kudu/util/mem_tracker-test.cc
@@ -17,7 +17,9 @@
 
 #include "kudu/util/mem_tracker.h"
 
+#include <atomic>
 #include <string>
+#include <thread>
 #include <unordered_map>
 #include <utility>
 #include <vector>
@@ -337,4 +339,22 @@ TEST(MemTrackerTest, UnregisterFromParent) {
   c->UnregisterFromParent();
 }
 
+TEST(MemTrackerTest, TestMultiThreadedRegisterAndDestroy) {
+  std::atomic<bool> done(false);
+  vector<std::thread> threads;
+  for (int i = 0; i < 10; i++) {
+    threads.emplace_back([&done]{
+        while (!done.load()) {
+          shared_ptr<MemTracker> t = MemTracker::FindOrCreateTracker(1000, "foo");
+        }
+      });
+  }
+
+  SleepFor(MonoDelta::FromSeconds(AllowSlowTests() ? 5 : 1));
+  done.store(true);
+  for (auto& t : threads) {
+    t.join();
+  }
+}
+
 } // namespace kudu

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/e284a93c/src/kudu/util/mem_tracker.cc
----------------------------------------------------------------------
diff --git a/src/kudu/util/mem_tracker.cc b/src/kudu/util/mem_tracker.cc
index 1f13347..1a73284 100644
--- a/src/kudu/util/mem_tracker.cc
+++ b/src/kudu/util/mem_tracker.cc
@@ -74,6 +74,7 @@ using std::string;
 using std::stringstream;
 using std::shared_ptr;
 using std::vector;
+using std::weak_ptr;
 
 using strings::Substitute;
 
@@ -153,7 +154,7 @@ shared_ptr<MemTracker> MemTracker::CreateTrackerUnlocked(int64_t byte_limit,
                                                          const shared_ptr<MemTracker>& parent) {
   DCHECK(parent);
   shared_ptr<MemTracker> tracker(new MemTracker(ConsumptionFunction(), byte_limit, id, parent));
-  parent->AddChildTrackerUnlocked(tracker.get());
+  parent->AddChildTrackerUnlocked(tracker);
   tracker->Init();
 
   return tracker;
@@ -223,9 +224,10 @@ bool MemTracker::FindTrackerUnlocked(const string& id,
                                      const shared_ptr<MemTracker>& parent) {
   DCHECK(parent != NULL);
   parent->child_trackers_lock_.AssertAcquired();
-  for (MemTracker* child : parent->child_trackers_) {
-    if (child->id() == id) {
-      *tracker = child->shared_from_this();
+  for (const auto& child_weak : parent->child_trackers_) {
+    shared_ptr<MemTracker> child = child_weak.lock();
+    if (child && child->id() == id) {
+      *tracker = std::move(child);
       return true;
     }
   }
@@ -255,8 +257,11 @@ void MemTracker::ListTrackers(vector<shared_ptr<MemTracker>>* trackers) {
     trackers->push_back(t);
     {
       MutexLock l(t->child_trackers_lock_);
-      for (MemTracker* child : t->child_trackers_) {
-        to_process.push_back(child->shared_from_this());
+      for (const auto& child_weak : t->child_trackers_) {
+        shared_ptr<MemTracker> child = child_weak.lock();
+        if (child) {
+          to_process.emplace_back(std::move(child));
+        }
       }
     }
   }
@@ -541,7 +546,7 @@ void MemTracker::Init() {
   DCHECK_EQ(all_trackers_[0], this);
 }
 
-void MemTracker::AddChildTrackerUnlocked(MemTracker* tracker) {
+void MemTracker::AddChildTrackerUnlocked(const shared_ptr<MemTracker>& tracker) {
   child_trackers_lock_.AssertAcquired();
 #ifndef NDEBUG
   shared_ptr<MemTracker> found;
@@ -563,10 +568,13 @@ void MemTracker::LogUpdate(bool is_consume, int64_t bytes) const {
 }
 
 string MemTracker::LogUsage(const string& prefix,
-                            const list<MemTracker*>& trackers) {
+                            const list<weak_ptr<MemTracker>>& trackers) {
   vector<string> usage_strings;
-  for (const MemTracker* child : trackers) {
-    usage_strings.push_back(child->LogUsage(prefix));
+  for (const auto& child_weak : trackers) {
+    shared_ptr<MemTracker> child = child_weak.lock();
+    if (child) {
+      usage_strings.push_back(child->LogUsage(prefix));
+    }
   }
   return JoinStrings(usage_strings, "\n");
 }

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/e284a93c/src/kudu/util/mem_tracker.h
----------------------------------------------------------------------
diff --git a/src/kudu/util/mem_tracker.h b/src/kudu/util/mem_tracker.h
index 645f6e5..6cd980f 100644
--- a/src/kudu/util/mem_tracker.h
+++ b/src/kudu/util/mem_tracker.h
@@ -267,13 +267,13 @@ class MemTracker : public std::enable_shared_from_this<MemTracker> {
   // Adds tracker to child_trackers_.
   //
   // child_trackers_lock_ must be held.
-  void AddChildTrackerUnlocked(MemTracker* tracker);
+  void AddChildTrackerUnlocked(const std::shared_ptr<MemTracker>& tracker);
 
   // Logs the stack of the current consume/release. Used for debugging only.
   void LogUpdate(bool is_consume, int64_t bytes) const;
 
   static std::string LogUsage(const std::string& prefix,
-      const std::list<MemTracker*>& trackers);
+                              const std::list<std::weak_ptr<MemTracker>>& trackers);
 
   // Variant of CreateTracker() that:
   // 1. Must be called with a non-NULL parent, and
@@ -322,11 +322,11 @@ class MemTracker : public std::enable_shared_from_this<MemTracker> {
   // listing only (i.e. updating the consumption of a parent tracker does not
   // update that of its children).
   mutable Mutex child_trackers_lock_;
-  std::list<MemTracker*> child_trackers_;
+  std::list<std::weak_ptr<MemTracker>> child_trackers_;
 
   // Iterator into parent_->child_trackers_ for this object. Stored to have O(1)
   // remove.
-  std::list<MemTracker*>::iterator child_tracker_it_;
+  std::list<std::weak_ptr<MemTracker>>::iterator child_tracker_it_;
 
   // Functions to call after the limit is reached to free memory.
   std::vector<GcFunction> gc_functions_;


[3/4] incubator-kudu git commit: catalog_manager: fix a use-after-free on TabletInfo

Posted by to...@apache.org.
catalog_manager: fix a use-after-free on TabletInfo

This fixes an issue seen in a new stress test that Adar is working on:

T1:
- HandleAssignCreatingTablet calls CreateTabletInfo, which returns a raw
  pointer 'replacement'
- we added that raw pointer to the table's tablet map (which also stores it as raw)

T2:
- GetTableLocations() constructs a scoped_refptr to the TabletInfo object,
  acting as its first referer.
- GetTableLocations() finishes and drops the scoped_refptr, deleting the TabletInfo

T1:
- we try to dereference 'replacement', and crash, because T2 freed the pointer.

This patch fixes the issue by using a scoped_refptr to hold the 'replacement'
tablet before exposing it to other threads.

No new unit test here, because it was easily reproduced by the stress test that
Adar's working on, which will be committed soon.

Change-Id: I67259d54f03bded7c7bbbea677093f39e0efa528
Reviewed-on: http://gerrit.cloudera.org:8080/2213
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/9ca41277
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/9ca41277
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/9ca41277

Branch: refs/heads/master
Commit: 9ca41277167240084f0731a97255ccfce0bf3e80
Parents: e284a93
Author: Todd Lipcon <to...@apache.org>
Authored: Wed Feb 17 16:14:00 2016 -0800
Committer: Todd Lipcon <to...@apache.org>
Committed: Thu Feb 18 02:14:51 2016 +0000

----------------------------------------------------------------------
 src/kudu/master/catalog_manager.cc | 28 +++++++++++++++-------------
 src/kudu/master/catalog_manager.h  |  4 ++--
 2 files changed, 17 insertions(+), 15 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/9ca41277/src/kudu/master/catalog_manager.cc
----------------------------------------------------------------------
diff --git a/src/kudu/master/catalog_manager.cc b/src/kudu/master/catalog_manager.cc
index b3c0468..e5958c2 100644
--- a/src/kudu/master/catalog_manager.cc
+++ b/src/kudu/master/catalog_manager.cc
@@ -785,15 +785,17 @@ Status CatalogManager::CreateTable(const CreateTableRequestPB* orig_req,
     for (const Partition& partition : partitions) {
       PartitionPB partition_pb;
       partition.ToPB(&partition_pb);
-      tablets.push_back(CreateTabletInfo(table.get(), partition_pb));
+      scoped_refptr<TabletInfo> t = CreateTabletInfo(table.get(), partition_pb);
+      tablets.push_back(t.get());
+
+      // Add the new tablet to the catalog-manager-wide map by tablet ID.
+      InsertOrDie(&tablet_map_, t->tablet_id(), std::move(t));
     }
 
-    // Add the table/tablets to the in-memory map for the assignment.
     resp->set_table_id(table->id());
+
+    // Add the tablets to the table's map.
     table->AddTablets(tablets);
-    for (TabletInfo* tablet : tablets) {
-      InsertOrDie(&tablet_map_, tablet->tablet_id(), tablet);
-    }
   }
   TRACE("Inserted new table and tablet info into CatalogManager maps");
 
@@ -892,9 +894,9 @@ TableInfo *CatalogManager::CreateTableInfo(const CreateTableRequestPB& req,
   return table;
 }
 
-TabletInfo* CatalogManager::CreateTabletInfo(TableInfo* table,
-                                             const PartitionPB& partition) {
-  TabletInfo* tablet = new TabletInfo(table, GenerateId());
+scoped_refptr<TabletInfo> CatalogManager::CreateTabletInfo(TableInfo* table,
+                                                           const PartitionPB& partition) {
+  scoped_refptr<TabletInfo> tablet(new TabletInfo(table, GenerateId()));
   tablet->mutable_metadata()->StartMutation();
   SysTabletsEntryPB *metadata = &tablet->mutable_metadata()->mutable_dirty()->pb;
   metadata->set_state(SysTabletsEntryPB::PREPARING);
@@ -2556,13 +2558,13 @@ void CatalogManager::HandleAssignCreatingTablet(TabletInfo* tablet,
 
   // The "tablet creation" was already sent, but we didn't receive an answer
   // within the timeout. So the tablet will be replaced by a new one.
-  TabletInfo *replacement = CreateTabletInfo(tablet->table().get(),
+  scoped_refptr<TabletInfo> replacement = CreateTabletInfo(tablet->table().get(),
                                              old_info.pb.partition());
   LOG(WARNING) << "Tablet " << tablet->ToString() << " was not created within "
                << "the allowed timeout. Replacing with a new tablet "
                << replacement->tablet_id();
 
-  tablet->table()->AddTablet(replacement);
+  tablet->table()->AddTablet(replacement.get());
   {
     boost::lock_guard<LockType> l_maps(lock_);
     tablet_map_[replacement->tablet_id()] = replacement;
@@ -2580,13 +2582,13 @@ void CatalogManager::HandleAssignCreatingTablet(TabletInfo* tablet,
     Substitute("Replacement for $0", tablet->tablet_id()));
 
   deferred->tablets_to_update.push_back(tablet);
-  deferred->tablets_to_add.push_back(replacement);
-  deferred->needs_create_rpc.push_back(replacement);
+  deferred->tablets_to_add.push_back(replacement.get());
+  deferred->needs_create_rpc.push_back(replacement.get());
   VLOG(1) << "Replaced tablet " << tablet->tablet_id()
           << " with " << replacement->tablet_id()
           << " (Table " << tablet->table()->ToString() << ")";
 
-  new_tablets->push_back(replacement);
+  new_tablets->emplace_back(std::move(replacement));
 }
 
 // TODO: we could batch the IO onto a background thread.

http://git-wip-us.apache.org/repos/asf/incubator-kudu/blob/9ca41277/src/kudu/master/catalog_manager.h
----------------------------------------------------------------------
diff --git a/src/kudu/master/catalog_manager.h b/src/kudu/master/catalog_manager.h
index b06cb5a..49e39f8 100644
--- a/src/kudu/master/catalog_manager.h
+++ b/src/kudu/master/catalog_manager.h
@@ -473,8 +473,8 @@ class CatalogManager : public tserver::TabletPeerLookupIf {
   // Helper for creating the initial TabletInfo state.
   // Leaves the tablet "write locked" with the new info in the
   // "dirty" state field.
-  TabletInfo *CreateTabletInfo(TableInfo* table,
-                               const PartitionPB& partition);
+  scoped_refptr<TabletInfo> CreateTabletInfo(TableInfo* table,
+                                             const PartitionPB& partition);
 
   // Builds the TabletLocationsPB for a tablet based on the provided TabletInfo.
   // Populates locs_pb and returns true on success.