Posted to reviews@kudu.apache.org by "Mike Percy (Code Review)" <ge...@cloudera.org> on 2018/02/08 08:48:51 UTC

[kudu-CR] KUDU-2274. Shut down tombstoned replica when replacing it

Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9246

to look at the new patch set (#2).

Change subject: KUDU-2274. Shut down tombstoned replica when replacing it
......................................................................

KUDU-2274. Shut down tombstoned replica when replacing it

Failing to shut down a tombstoned replica after copying it can lead to
unfortunate interleavings resulting in the replica ending up in an
inconsistent state. This actually occurred in a test environment,
although it proved very hard to reproduce.

This patch includes several changes in addition to shutting down
tombstoned replicas before replacing them:

* Remove the thread safety properties of the ConsensusMetadata class

  ConsensusMetadata doesn't need to be thread-safe, even though it is
  ref-counted, because it is required to be externally synchronized.
  This patch replaces the mutex with a DFAKE_MUTEX from the thread
  collision warner utility class in order to easily detect concurrent
  access due to buggy external synchronization.

* Also improve destructor state checks in TabletReplica.

* Fix another case of unlocked cmeta access by TSTabletManager.
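
For readers unfamiliar with the collision-warner idea mentioned above, here
is a minimal standalone sketch. It is not the code in this patch and does not
use Kudu's DFAKE_MUTEX macros: CollisionDetector, FakeCmeta, and DemoOverlap
are hypothetical names invented for illustration, and the sketch counts
collisions instead of failing a debug check. The point it tries to show is
that the detector never blocks; it only notices when a second caller enters
a guarded section before the first one has left.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a thread collision warner. Unlike a real
// mutex it never blocks: it only records that two guarded sections
// overlapped, which signals broken external synchronization.
class CollisionDetector {
 public:
  class Scope {
   public:
    explicit Scope(CollisionDetector* d) : d_(d) {
      bool expected = false;
      // Succeeds only if no other guarded section is currently active.
      owned_ = d_->busy_.compare_exchange_strong(expected, true);
      if (!owned_) d_->collisions_.fetch_add(1);
    }
    ~Scope() {
      // Only the scope that claimed the flag may release it.
      if (owned_) d_->busy_.store(false);
    }
    Scope(const Scope&) = delete;
    Scope& operator=(const Scope&) = delete;

   private:
    CollisionDetector* d_;
    bool owned_;
  };

  int collisions() const { return collisions_.load(); }

 private:
  std::atomic<bool> busy_{false};
  std::atomic<int> collisions_{0};
};

// Hypothetical metadata class that assumes callers hold their own lock.
// Every accessor is guarded by the detector rather than by a mutex.
class FakeCmeta {
 public:
  void set_term(long term) {
    CollisionDetector::Scope guard(&detector_);
    term_ = term;
  }
  long term() const {
    CollisionDetector::Scope guard(&detector_);
    return term_;
  }
  int collisions() const { return detector_.collisions(); }

 private:
  mutable CollisionDetector detector_;
  long term_ = 0;
};

// Simulates an overlap with nested scopes so the result is deterministic;
// in practice the overlap would come from two unsynchronized threads.
inline int DemoOverlap() {
  CollisionDetector d;
  { CollisionDetector::Scope first(&d); }
  { CollisionDetector::Scope second(&d); }
  assert(d.collisions() == 0);  // back-to-back sections do not collide
  {
    CollisionDetector::Scope outer(&d);
    CollisionDetector::Scope inner(&d);  // enters while outer is active
  }
  return d.collisions();
}
```

In this sketch, correctly serialized callers pay only for an atomic
compare-and-swap, while an overlapping entry is recorded immediately rather
than silently corrupting state. A debug-only checker would typically turn
the recorded collision into a fatal check failure.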

These fixes were verified by running tombstoned_voting-stress-test with
4 CPU stress threads on the dist-test cluster after applying only the
ConsensusMetadata thread-safety portion of this patch, and then again
with the unlocked access fix and shutdown portions of this patch.

After removing the cmeta mutex only (186/200 failed):
http://dist-test.cloudera.org/job?job_id=mpercy.1518077234.135005

This full patch (200/200 succeeded):
http://dist-test.cloudera.org/job?job_id=mpercy.1518078690.66599

Change-Id: Ia8d086c3fba52826ebe0d3a44842d53ecb6a9265
---
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tserver/ts_tablet_manager.cc
6 files changed, 87 insertions(+), 155 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/9246/2
-- 
To view, visit http://gerrit.cloudera.org:8080/9246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia8d086c3fba52826ebe0d3a44842d53ecb6a9265
Gerrit-Change-Number: 9246
Gerrit-PatchSet: 2
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>