You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@kudu.apache.org by ad...@apache.org on 2017/09/08 22:54:52 UTC

[2/2] kudu git commit: KUDU-2137: protect against concurrent schema version change and tablet drop

KUDU-2137: protect against concurrent schema version change and tablet drop

Try as I might, I could not reproduce the failure in the bug report. I
looped the failed test several thousand times. I also looped the entire
alter_table-test suite a thousand times. Finally, I wrote a unit test that
hammers one table with concurrent add column, add partition, and drop
partition operations. Nothing worked.

So, here's my best guess at what's going on: if a tablet is dropped
while the master is processing its report, it's conceivable that we could
wind up in TabletInfo::set_reported_schema_version() with the table spinlock
held just after TableInfo::AddRemoveTablets() dropped the tablet. This would
cause us to decrement the tablet's "old" schema version from the table's
count map twice: once when dropping the tablet and a second time in
set_reported_schema_version().

The fix is straight-forward: after acquiring both spinlocks, double check
that the tablet is still a member of the table. We need to take the tablet
metadata lock to do this, but the saving grace is that tablet partition keys
are immutable, so this lock acquisition needn't overlap with the spinlocks.

Change-Id: I371fc310a97ae94ec2ebf04405db99c5f2937e1a
Reviewed-on: http://gerrit.cloudera.org:8080/7996
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <da...@apache.org>
Reviewed-by: Alexey Serbin <as...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/9f064944
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/9f064944
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/9f064944

Branch: refs/heads/master
Commit: 9f064944816f154af079713adc79a0dff8270742
Parents: c0e18fd
Author: Adar Dembo <ad...@cloudera.com>
Authored: Thu Sep 7 02:06:23 2017 -0700
Committer: Adar Dembo <ad...@cloudera.com>
Committed: Fri Sep 8 22:54:06 2017 +0000

----------------------------------------------------------------------
 src/kudu/master/catalog_manager.cc | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/9f064944/src/kudu/master/catalog_manager.cc
----------------------------------------------------------------------
diff --git a/src/kudu/master/catalog_manager.cc b/src/kudu/master/catalog_manager.cc
index 509f3c5..4619129 100644
--- a/src/kudu/master/catalog_manager.cc
+++ b/src/kudu/master/catalog_manager.cc
@@ -4326,8 +4326,17 @@ void TabletInfo::set_reported_schema_version(int64_t version) {
 
   // Slow path: we have a schema version change.
   //
-  // We need to hold both the table and tablet locks to make the change. By
-  // convention, the table lock is always acquired first.
+  // We need to hold both the table and tablet spinlocks to make the change. By
+  // convention, table locks are always acquired first.
+  //
+  // We also need to hold the tablet metadata lock in order to read the partition
+  // key, but it's OK to make a local copy of it (and release the lock) because
+  // the key is immutable.
+  string key_start;
+  {
+    TabletMetadataLock l(this, TabletMetadataLock::READ);
+    key_start = l.data().pb.partition().partition_key_start();
+  }
   std::lock_guard<rw_spinlock> table_l(table_->lock_);
   std::lock_guard<simple_spinlock> tablet_l(lock_);
 
@@ -4337,6 +4346,16 @@ void TabletInfo::set_reported_schema_version(int64_t version) {
     return;
   }
 
+  // Check that we weren't dropped from the table before acquiring the table lock.
+  //
+  // We also have to compare the returned object to 'this' in case our entry in
+  // the map was replaced with a new tablet (i.e. DROP RANGE PARTITION followed
+  // by ADD RANGE PARTITION).
+  auto* t = FindPtrOrNull(table_->tablet_map_, key_start);
+  if (!t || t != this) {
+    return;
+  }
+
   // Perform the changes.
   VLOG(3) << Substitute("$0: schema version changed from $1 to $2",
                         ToString(), old_version, version);