Posted to commits@kudu.apache.org by to...@apache.org on 2016/10/05 21:29:43 UTC
[2/4] kudu git commit: delete_table-test: fix flakiness with table
creation timeout
delete_table-test: fix flakiness with table creation timeout

This test was timing out frequently when trying to create a
replication-2 table on a cluster with 3 tservers, one of which was
recently shut down. The master could try to place a replica on the
non-running server, which would then take some time to time out and try
a new placement.

The workaround here is to restart the master so it no longer sees the
crashed server as a valid placement option.
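
The placement problem described above can be illustrated with a toy model. This is only a sketch and is not Kudu code: ToyMaster, PlaceReplicas, Restart, AllAlive and the server names are all hypothetical, and it assumes the master's list of registered tservers is rebuilt from scratch after a restart, so that only running servers reappear.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model only: none of these types or functions exist in Kudu.
struct ToyMaster {
  // Servers the master currently believes are registered. This list can be
  // stale and still contain a server that has been shut down.
  std::vector<std::string> registered;

  // Pick the first `n` registered servers as replica locations.
  std::vector<std::string> PlaceReplicas(std::size_t n) const {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < registered.size() && out.size() < n; ++i) {
      out.push_back(registered[i]);
    }
    return out;
  }

  // Model a restart: forget all registrations, then let only the servers
  // that are actually running register again.
  void Restart(const std::map<std::string, bool>& alive) {
    registered.clear();
    for (const auto& kv : alive) {
      if (kv.second) registered.push_back(kv.first);
    }
  }
};

// Returns true if every chosen replica location is a running server.
bool AllAlive(const std::vector<std::string>& placement,
              const std::map<std::string, bool>& alive) {
  for (const auto& ts : placement) {
    if (!alive.at(ts)) return false;
  }
  return true;
}
```

With three registered servers of which one is down, PlaceReplicas(2) on the stale list can return the down server; after Restart the list holds only the two live servers, so the same call can only pick those.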
Change-Id: Ic61ad384e1b247f83bfc709528c4c7bda586c9d2
Reviewed-on: http://gerrit.cloudera.org:8080/4632
Reviewed-by: David Ribeiro Alves <dr...@apache.org>
Reviewed-by: Dinesh Bhat <di...@cloudera.com>
Tested-by: Kudu Jenkins
Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/98f42cdd
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/98f42cdd
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/98f42cdd
Branch: refs/heads/master
Commit: 98f42cdd878caa429377625a2288d22ed0d114f2
Parents: 0f99d40
Author: Todd Lipcon <to...@apache.org>
Authored: Wed Oct 5 10:52:29 2016 -0700
Committer: David Ribeiro Alves <dr...@apache.org>
Committed: Wed Oct 5 20:26:40 2016 +0000
----------------------------------------------------------------------
src/kudu/integration-tests/delete_table-test.cc | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/kudu/blob/98f42cdd/src/kudu/integration-tests/delete_table-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/integration-tests/delete_table-test.cc b/src/kudu/integration-tests/delete_table-test.cc
index 6a0de2f..d331d43 100644
--- a/src/kudu/integration-tests/delete_table-test.cc
+++ b/src/kudu/integration-tests/delete_table-test.cc
@@ -432,7 +432,7 @@ TEST_F(DeleteTableTest, TestAutoTombstoneAfterCrashDuringTabletCopy) {
ASSERT_OK(cluster_->master()->Restart());
ASSERT_OK(cluster_->WaitForTabletServerCount(1, MonoDelta::FromSeconds(30)));
- // Set up a table which has a table only on TS 0. This will be used to test for
+ // Set up a table which has a tablet only on TS 0. This will be used to test for
// "collateral damage" bugs where incorrect handling of the main test tablet
// accidentally removes blocks from another tablet.
// We use a sequential workload so that we just flush and don't compact.
@@ -467,7 +467,15 @@ TEST_F(DeleteTableTest, TestAutoTombstoneAfterCrashDuringTabletCopy) {
ASSERT_OK(cluster_->tablet_server(2)->Restart());
cluster_->tablet_server(kTsIndex)->Shutdown();
- // Create a new tablet which is replicated on the other two servers.
+ // Restart the master to be sure that it only sees the live servers.
+ // Otherwise it may try to create a tablet with a replica on the down server.
+ // The table creation would eventually succeed after picking a different set of
+ // replicas, but not before causing a timeout.
+ cluster_->master()->Shutdown();
+ ASSERT_OK(cluster_->master()->Restart());
+ ASSERT_OK(cluster_->WaitForTabletServerCount(2, MonoDelta::FromSeconds(30)));
+
+ // Create a new table with a single tablet replicated on the other two servers.
// We use the same sequential workload. This produces block ID sequences
// that look like:
// TS 0: |---- blocks from 'other-table' ---]