You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Mike Percy (Code Review)" <ge...@cloudera.org> on 2017/05/22 22:01:45 UTC

[kudu-CR] KUDU-871 (part 3). Support tombstoned voting

Hello David Ribeiro Alves, Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/6960

to review the following change.

Change subject: KUDU-871 (part 3). Support tombstoned voting
......................................................................

KUDU-871 (part 3). Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

* Add ConsensusMetadata flush control locking mechanism and plumb it
  into TabletReplica.
* Make ConsensusMetadata "immortal" to prevent flush races /
  interleavings between TabletCopy, TabletBootstrap, and RaftConsensus
  tombstoned voting.
* Add new test TombstonedVotingITest
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.

TODO:

* Add negative test for tombstoned voting.
* Add a targeted concurrency test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/consensus.h
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_meta.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/consensus/raft_consensus_state.cc
M src/kudu/consensus/raft_consensus_state.h
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/delete_table-itest.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
20 files changed, 443 insertions(+), 102 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 6:

(36 comments)

http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

PS6, Line 1485: && state_ == kRunning
> why is this condition necesary? if we've just been tombstoned it still make
Fair point. This isn't required so I removed it.


PS6, Line 1509: CheckSafeToVoteUnlocked
> the duplicate logic here with CheckSafeToVoteUnlocked is a little unnerving
good idea. done


PS6, Line 1514: DCHECK(tombstone_last_logged_opid) << LogPrefixUnlocked() << "expected last-logged opid";
              :     local_last_logged_opid = *tombstone_last_logged_opid;
> if this DCHECK fails, would we crash anyway trying to dereference the boost
Done


PS6, Line 1519:     LOG_WITH_PREFIX_UNLOCKED(INFO) << "voting while tombstoned based on last-logged opid "
              :                                    << local_last_logged_opid;
> move this to the else block above and thus avoid the extra if?
Done


PS6, Line 2538: const optional<OpId>& tombstone_last_logged_opid
> whats the point of passing an opid here if you're only checking for its pre
I guess the phrasing was a bit indirect; I meant "CheckSafeToVoteUnlocked() guarantees that tombstoned_last_logged_opid is set". However I took Todd's advice and inlined this with some other logic in RequestVote().


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

PS6, Line 124: static Status Create(ConsensusOptions options,
             :                        RaftPeerPB local_peer_pb,
             :                        scoped_refptr<ConsensusMetadataManager> cmeta_manager,
             :                        ThreadPool* raft_pool,
             :                        std::shared_ptr<RaftConsensus>* consensus_out);
> didn't you delete this a while back?
Yeah, that was part of a bunch of refactoring that paid off some tech debt that had accumulated while adding features since the early days of Consensus. While I was refactoring I thought that for tombstoned voting we would now need to handle the case where we couldn't load cmeta but I think we can handle that at a higher level (TabletReplica or TSTabletManager) so putting this factory method back as a thin wrapper around { new; Init() } to avoid having to handle a non-ininitialized cmeta in RaftConsensus is reasonable, I think.


Line 130:   RaftConsensus(ConsensusOptions options,
> can the ctor be made private? (may need ALLOW_MAKE_SHARED)
Done


PS6, Line 271: term
> I think CurrentTerm or Term would be better since this takes lock_ and that
Done


PS6, Line 292: Tombstoned voting
> agreed. Maybe better here to just say that it synchronously transitions to 
I filled in some details in RequestVote, SetStateUnlocked(), and here.


Line 297:   void Shutdown();
> same
Done


Line 339:   enum State {
> it would be nice to have a state transition diagram of sorts here. Or alter
Done


PS6, Line 340:     // RaftConsensus has been freshly constructed.
             :     kNew,
> is this state ever externally exposed, considering we now have the factory 
Done


PS6, Line 343:     // RaftConsensus has been initialized.
             :     kInitialized,
> in terms of behavior, how is this state different than Stopped? eg let's sa
kInitialized is very similar to kStopped. The difference is whether a tablet was tombstoned when it first came up (kInitialized) or whether it was tombstoned after having been running already (kStopped).

I didn't document that, though, I only documented that voting is allowed in kInitialized and kStopped states in the RequestVote() doc comment.


PS6, Line 354: RaftConsensus has been stopped.
> maybe explain what this means in the context of the other states? something
Done


PS6, Line 391:   // Initializes the RaftConsensus object. This should be called before
             :   // publishing this object to any thread other than the thread that invoked
             :   // the constructor.
             :   Status Init();
> since the constructor is only invoked by the factory method, maybe not nece
Done


PS6, Line 647:   // existence of a known last logged opid.
             :   Status CheckSafeToVoteUnlocked(const boost::optional<OpId>& tombstone_last_logged_opid)
             :  
> similar to David's comment on the implementation: I think it would make mor
inlined this based on a comment in the .cc file


Line 783:   AtomicBool stopped_;
> it seems it was already like this, but what's the distinction between this 
Added a comment. Also, this should actually be shutdown_, not stopped_, so reverted the name change.


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

Line 48: 
> Replicated masters can have tombstoned replicas too, is that right? Does th
Hmm. This should be very rare but could be possible. I'll look into it.


PS6, Line 80: call Stop() on
> you mean 'manually tombstone' right? I was confused by reading this comment
Ah, right.


PS6, Line 195:   // Ask TS1 for a vote that should be denied (old last-logged opid).
             :   s = itest::RequestVote(ts1_ets, tablet_id, "B", current_term, old_opid,
             :                          /*ignore_live_leader=*/ true, /*is_pre_election=*/ false, kTimeout);
> can you add a test for same candidate same term but with an opid that is gr
Maybe I'm misunderstanding. It sounds like you're asking for a test of re-voting for (A, current_term, last_logged_opid + 1) but in that case we will return an automatic "yes" because we already voted for "A" this term due to this code path in raft_consensus.cc:

  // We already voted this term.
  if (request->candidate_term() == CurrentTermUnlocked() &&
      HasVotedCurrentTermUnlocked()) {

    // Already voted for the same candidate in the current term.
    if (GetVotedForCurrentTermUnlocked() == request->candidate_uuid()) {
      return RequestVoteRespondVoteAlreadyGranted(request, response);
    }

Also note that that a bunch of election scenarios are tested in leader_election-test.cc and this test isn't really supposed to exhaustively test all possible voting scenarios.


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

PS6, Line 146: LOG(WARNING)
> Why warning?  Is it something non-regular or non-expected?  Would INFO be m
I just liked that it highlights the lines so it's easier to follow along. But no worries, I changed it to INFO.


PS6, Line 148: LOG(FATAL)
> nit: would FAIL() be more idiomatic in this case (given it's a test)?
Can't do that; this runs on a separate thread.


PS6, Line 159: WARNING
> INFO?  But I might be missing something here.
Done


PS6, Line 197: 2
> nit: maybe cluster_->num_tablet_servers() ?
Done


PS6, Line 228: voter_thread
> What happens to the thread if the test fails in one of the ASSERT_OK() belo
good catch; done


PS6, Line 232: SleepFor(MonoDelta::FromMilliseconds(500));
> Could you add some comments explaining why this delay is necessary?  What w
I wanted to give us some time in this state. Since this is supposed to be a stress test and run in slow mode only it seemed appropriate to give a reasonable amount of time for looping to take place.

If we only gave it 50ms I don't think there would be a lot of activity during that time in the thread running the votes because we fsync votes to disk.

Added a comment.


PS6, Line 240: SleepFor(MonoDelta::FromMilliseconds(500));
> ditto
added a comment


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS6, Line 142:   consensus_ = std::move(consensus);
             :   set_state(INITIALIZED);
             :  
> i'm surprised these lines don't need to take lock_? is this method always c
yes, it's always called before publishing


Line 315:     case STOPPED:
> hrm, no case for STOPPING?
I didn't think it was really important but I'll add one. Done


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS6, Line 81:   // Initializes RaftConsensus.
            :   Status Init(ThreadPool* raft_pool);
> left a similar question in the implementation: is it expected that this be 
Yes, it should be called before publishing. I'll add a comment.


PS6, Line 96: tombstoned voting
> I think it's better to generalize this a bit. Is this _only_ used in the ca
Yeah I suppose calling out tombstoned voting here is overly specific. I changed this to note that RaftConsensus also gets transitioned to a stopped state. RaftConsensus has its own documentation about what that means.


Line 102:   // Shut down this tablet replica permanently.
> does this require that Stop be called first? again would be nice to have so
Done


PS6, Line 272:  // Wait until the TabletReplica is fully in STOPPED or SHUTDOWN state.
> does this have some requirements (does the replica need to be in "stopping"
It doesn't have any such requirements for correctness, but if the replica isn't being stopped then this call will hang.


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

PS6, Line 979: replica->tablet_metadata()->tombstone_last_logged_opid
> maybe cleaner for it to just return an optional<OpId> itself rather than th
Done, in a separate patch.


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS6, Line 525: tombtone
> not from this patch, but typo
Done


Line 663:   Status s = DeleteTabletData(replica->tablet_metadata(), cmeta_manager_, delete_type,
> per discussion on Slack, we may need to modify this so that, if there was n
Done, in a separate patch.


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 13: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#5).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Protect TabletMetadata::tombstone_last_logged_opid_ with a lock.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
27 files changed, 1,149 insertions(+), 242 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 10:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 1856:     CHECK_EQ(kStopping, state_) << State_Name(state_);
this is already CHECKed in SetStateUnlocked, no?


Line 1863:       cmeta_->set_leader_uuid("");
ClearLeaderUnlocked() ?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/delete_tablet-itest.cc
File src/kudu/integration-tests/delete_tablet-itest.cc:

Line 95:   // Tablet bootstrap failure should result in a SHUTDOWN tablet with an error
per discussion on slack, let's revert this part of this patch since it isn't directly related and has external effects


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

PS10, Line 263: error (nee failed
per other comment, back out this change


PS10, Line 294: Shut each TS
down


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

PS10, Line 57:       : num_workers_(1),
this test seems to be implemented for multiple workers but only has one worker. Should this be >1? or should the code be simplified?


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#10).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.
* Get rid of tablet::FAILED state, instead go to SHUTDOWN with an error.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/delete_tablet-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
33 files changed, 1,439 insertions(+), 340 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Persist ConsensusMetadata before returning from
  TabletCopyClient::Start() because we need cmeta to Init()
  RaftConsensus, which happens when registering the replica in
  TSTabletManager.
* TSTabletManager::DeleteTablet() should not consider FAILED == deleted,
  since we no longer destroy RaftConsensus when tombstoning a replica.
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Reviewed-on: http://gerrit.cloudera.org:8080/6960
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
27 files changed, 1,439 insertions(+), 296 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Alexey Serbin: Looks good to me, but someone else must approve
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#8).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.
* Get rid of tablet::FAILED state, instead go to SHUTDOWN with an error.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/delete_tablet-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
33 files changed, 1,393 insertions(+), 341 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

PS10, Line 1494: local_last_logged_opid = queue_->GetLastOpIdInLog();
What if tombstone_last_logged_opid was provided as well?  Is it some some of programming error?  Or it's OK?  If the former, consider returning IllegalArgument or adding DCHECK()


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

PS10, Line 372: // Raft is in the process of stopping and will not accept writes.
              :     kStopping,
nit: is it possible to request a vote if in this state?  It seems it is, but it would be nice to explicitly specify that.


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6960/7/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

Line 115: using kudu::consensus::MinimumOpId;
> warning: using decl 'MinimumOpId' is unused [misc-unused-using-decls]
Done


Line 117: using kudu::consensus::OpIdBiggerThan;
> warning: using decl 'OpIdBiggerThan' is unused [misc-unused-using-decls]
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 13: Code-Review+1

lgtm. Did you do any last test loop of the relevant tests after the latest round of changes?

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

PS6, Line 195:   // Ask TS1 for a vote that should be denied (old last-logged opid).
             :   s = itest::RequestVote(ts1_ets, tablet_id, "B", current_term, old_opid,
             :                          /*ignore_live_leader=*/ true, /*is_pre_election=*/ false, kTimeout);
> Maybe I'm misunderstanding. It sounds like you're asking for a test of re-v
my request was to have a directed test where the last logged op id of the requestor is greater than the tombstoned replica's and not just precisely its last logged opid, but that should still be granted, assuming this will be the common case since the replica is tombstoned and doesn't actually increase it's last logged op id.

moreover I was also wondering if there would be a problem if the last logged op id of the tombstoned replica turns back due to its log being deleted (and it being restarted, for instance)


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#12).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Persist ConsensusMetadata before returning from
  TabletCopyClient::Start() because we need cmeta to Init()
  RaftConsensus, which happens when registering the replica in
  TSTabletManager.
* TSTabletManager::DeleteTablet() should not consider FAILED == deleted,
  since we no longer destroy RaftConsensus when tombstoning a replica.
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
27 files changed, 1,439 insertions(+), 296 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/12
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 13: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 13:

> lgtm. Did you do any last test loop of the relevant tests after the
 > latest round of changes?

Thanks for the review. I'll double-check that now and will post results. I did see some unrelated failures on some unrelated tests, but we've have a lot of code go in this week.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 6:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

PS6, Line 1485: && state_ == kRunning
why is this condition necesary? if we've just been tombstoned it still makes sense that we should withhold votes for a period of time, right?


PS6, Line 1509: CheckSafeToVoteUnlocked
the duplicate logic here with CheckSafeToVoteUnlocked is a little unnerving. What do you think about essentially inlining its logic here and merging with this if condition? eg something of this nature:

switch (state_) {
  case kShutdown:
    return IllegalState...
  case kRunning:
    local_last_logged = GetLatestFromLog()
    break;
  default:
    if (FLAGS_allow_tombstone_voting && tombstone_last_logged_opid) {
      local_last_logged = tombstone_last_logged;
    } else {
      return IllegalState
    }
    break;
}

I think that might be easier to follow and net less lines of code


PS6, Line 1514: DCHECK(tombstone_last_logged_opid) << LogPrefixUnlocked() << "expected last-logged opid";
              :     local_last_logged_opid = *tombstone_last_logged_opid;
if this DCHECK fails, would we crash anyway trying to dereference the boost::none? if so, I think we should either use CHECK (so we get a nice crash log) or we should try to handle this case by just denying the vote (if we aren't 100% sure that this would always be the case, we can always vote conservatively)


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

Line 130:   RaftConsensus(ConsensusOptions options,
can the ctor be made private? (may need ALLOW_MAKE_SHARED)

I don't see any external usage of it but maybe my grepping was bad.


PS6, Line 271: term
> nit: current_term() would be more explicit
I think CurrentTerm or Term would be better since this takes lock_ and that lock is often held for long periods of time (thus the call may be quite slow compared to your typical getter)


PS6, Line 292: Tombstoned voting
> not super clear what "tombstoned voting" is maybe explain it somewhere?
agreed. Maybe better here to just say that it synchronously transitions to kStopped state, and refer to the kStopped definition in the state enum for what behavior that entails?


Line 297:   void Shutdown();
same


Line 339:   enum State {
it would be nice to have a state transition diagram of sorts here. Or alternatively, for each state, list out the potential subsequent states and what triggers the transition? eg I can't remember now whether we can transition back from kStopped to kRunning or not.


PS6, Line 340:     // RaftConsensus has been freshly constructed.
             :     kNew,
is this state ever externally exposed, considering we now have the factory method that returns a post-init object? if not, worth noting as much


PS6, Line 343:     // RaftConsensus has been initialized.
             :     kInitialized,
in terms of behavior, how is this state different than Stopped? eg let's say I start up a server which has some tablets in tombstoned state, do their RaftConsensus instances end up in kInitialized state or kStopped state? Is it important to make a distinction?


PS6, Line 391:   // Initializes the RaftConsensus object. This should be called before
             :   // publishing this object to any thread other than the thread that invoked
             :   // the constructor.
             :   Status Init();
since the constructor is only invoked by the factory method, maybe not necessary to be so explicit?


PS6, Line 647:   // existence of a known last logged opid.
             :   Status CheckSafeToVoteUnlocked(const boost::optional<OpId>& tombstone_last_logged_opid)
             :  
similar to David's comment on the implementation: I think it would make more sense to pass a 'bool tombstoned_op_id_known' or somesuch, and in the implementation add a comment explaining the scenario(s) in which we would _not_ know the tombstoned op id and why it's not safe to vote in those scenarios.

Use of a bool instead of the opid makes it more clear that we just care about whether we know a valid opid, and aren't deciding here based on its value.


Line 783:   AtomicBool stopped_;
it seems it was already like this, but what's the distinction between this 'stopped_' member and just checking 'state_ == kStopped'? mind adding a comment explaining and perhaps a TODO to get rid of it if possible? (not going to make you get rid of it in this patch since it's pre-existing and I dont think you made it worse)


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

PS6, Line 80: call Stop() on
you mean 'manually tombstone' right? I was confused by reading this comment into thinking you were SIGSTOPping TS1 and then expecting it to vote or something, which didn't make much sense.


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

Line 60:  protected:
this seems unused?


Line 182:   // We want to control leader election manually and we only want 2 replicas.
would be nice to have another more stressful test that doesn't attempt to verify any kind of correctness or coordinate the lifecycle changes with the vote requests. I think the coordination here has some effect that reduces the probability of races


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS6, Line 142:   consensus_ = std::move(consensus);
             :   set_state(INITIALIZED);
             :  
i'm surprised these lines don't need to take lock_? is this method always called before publishing a TabletReplica instance?


Line 315:     case STOPPED:
hrm, no case for STOPPING?


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS6, Line 81:   // Initializes RaftConsensus.
            :   Status Init(ThreadPool* raft_pool);
left a similar question in the implementation: is it expected that this be called before publishing an instance? might also be nice to add a note that, even if this fails, it leaves the instance in a self-consistent state (which is in contrast to some other Init methods) so that it can be left registered in the replica map, etc?


PS6, Line 96: tombstoned voting
I think it's better to generalize this a bit. Is this _only_ used in the case of tombstoning? or can all stopped replicas vote? is it legal to call this without first putting the tablet into tombstoned state?


Line 102:   // Shut down this tablet replica permanently.
does this require that Stop be called first? again would be nice to have some diagram of the state transitions to refer to for these lifecycle methods, perhaps in the class doc


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

Line 947:   // from the TabletMetadata into RaftConsensus::RequestVote().
> Yeah, it's racy, but it's safe because we're checking after we have a refer
I guess my concern is less with the safety of the race but more with whether the state checks are even proper, given they are race.

I did some hacking on the stress test locally to inject some sleeps after grabbing the state and managed to catch the case where the tablet transitioned from READY to TOMBSTONED. This resulted in a call into RaftConsensus::RequestVote() with no last_logged_opid, so it returned:

> Illegal state: RaftConsensus is not running: State = Stopped

given we already need to handle this racy check within the RaftConsensus vote code itself, I'm not sure what extra value this is bringing to check it additionally here?


BTW, it would be really nice to include some class-level lifecycle comments somewhere. I keep forgetting what happens to the TabletReplica, Tablet, TabletMetadata, and RaftConsensus instances in the case of a tombstone and a re-copy -- which ones are reused, how we ensure that the old ones aren't used anymore after the new ones are created, etc. I remember drawing such a diagram with you a couple weeks ago, and would be nice to have it in the code for reference.


PS4, Line 966:   LOG(INFO) << "Received RequestConsensusVote() RPC: " << Secu
> > when would this equal MinimumOpId? if it's never been tombstoned?
> Additionally, just using max() seems wrong in case there was a log truncation involved. We want to use our latest opid state.

If there is truncation then the truncating OpId would be 'higher' than the other one, because it has a higher term, right? i.e comparison is lexicographic comparison of <term, index> tuples


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

PS6, Line 979: replica->tablet_metadata()->tombstone_last_logged_opid
maybe cleaner for it to just return an optional<OpId> itself rather than the special uninitialized value?


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 663:   Status s = DeleteTabletData(replica->tablet_metadata(), cmeta_manager_, delete_type,
per discussion on Slack, we may need to modify this so that, if there was no valid log prior to the deletion, we explicitly _clear_ the last_logged_opid, or set to some sentinel value or something to indicate that it's not safe to vote due to potentially lost log state


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 10:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/cluster_itest_util.h
File src/kudu/integration-tests/cluster_itest_util.h:

PS10, Line 236: // Request the given replica to vote.
nit: could you document at least first 5 parameters of this method?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

Line 18: #include "kudu/consensus/metadata.pb.h"
nit: from IWYU

src/kudu/integration-tests/tombstoned_voting-itest.cc should add these
 lines:
#include <gflags/gflags_declare.h>
#include <glog/logging.h>
#include <gtest/gtest.h>
#include <boost/optional/optional.hpp>
#include <cstdint>
#include <memory>
#include <ostream>
#include <string>
#include <unordered_map>
#include <vector>
#include "kudu/consensus/consensus.pb.h"
#include "kudu/consensus/opid.pb.h"
#include "kudu/consensus/opid_util.h"
#include "kudu/fs/fs_manager.h"
#include "kudu/gutil/ref_counted.h"
#include "kudu/integration-tests/internal_mini_cluster.h"
#include "kudu/tablet/metadata.pb.h"
#include "kudu/tserver/tserver.pb.h"
#include "kudu/util/env.h"
#include "kudu/util/monotime.h"
#include "kudu/util/status.h"
#include "kudu/util/test_macros.h"
#include "kudu/util/test_util.h"


PS10, Line 167: TS1
nit: the tablet?  Or it's possible to tombstone a tablet server now?


PS10, Line 172: TServerDetails* ts1_ets
nit: consider moving this closer to the call site where it's used.


Line 173:   scoped_refptr<TabletReplica> replica;
nit: does it make sense to verify the scenario below works without errors even after restarting the tablet server TS1?


PS10, Line 181: "A"
Could you add a comment explaining that the actual candidate UUID is not crucial in this test?


PS10, Line 249: ts0_replica->consensus()->StepDown(&resp)
nit: should it be some selective error handling based on the code set in the 'resp'?  What if it failed due to some other reason -- would the test handle that properly?


PS10, Line 253: ts0
nit: 'TS0' since it has been referred as that already.


PS10, Line 256: FOLLOWER
Could it change to something like 'LEARNER' in the middle and bring some flakiness?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

Line 18: #include <atomic>
nit: from IWYU report:

src/kudu/integration-tests/tombstoned_voting-stress-test.cc should add these lines:
#include <glog/logging.h>
#include <gtest/gtest.h>
#include <boost/optional/optional.hpp>
#include <cstdint>
#include <memory>
#include <ostream>
#include <string>
#include <unordered_map>
#include <vector>
#include "kudu/common/wire_protocol.pb.h"
#include "kudu/consensus/consensus.pb.h"
#include "kudu/consensus/opid.pb.h"
#include "kudu/gutil/macros.h"
#include "kudu/integration-tests/external_mini_cluster.h"
#include "kudu/integration-tests/external_mini_cluster_fs_inspector.h"
#include "kudu/tablet/metadata.pb.h"
#include "kudu/util/monotime.h"
#include "kudu/util/net/net_util.h"
#include "kudu/util/status.h"
#include "kudu/util/test_macros.h"
#include "kudu/util/test_util.h"

src/kudu/integration-tests/tombstoned_voting-stress-test.cc should remove these lines:
- #include "kudu/consensus/metadata.pb.h"  // lines 25-25
- #include "kudu/consensus/raft_consensus.h"  // lines 26-26
- #include "kudu/util/random.h"  // lines 32-32


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 2489:     LOG(WARNING) << "Tablet " << tablet->ToString() << " has failed on TS "
wouldn't it be natural to have tablet::SHUTDOWN check here instead?  Or it's not guaranteed to be SHUTDOWN at this point?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 304: LOG(DFATAL)
The rest of the CHECKs would work for non-debug builds as well.  What is the reason of having LOG(DFATAL), not LOG(FATAL) here?  If there is any, please add a comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 6:

(7 comments)

Glanced through one of the tests.  Will take a closer look today.

http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

PS6, Line 146: LOG(WARNING)
Why warning?  Is it something non-regular or non-expected?  Would INFO be more appropriate here?

Also, is it crucial to have these INFO logs output from the test?


PS6, Line 148: LOG(FATAL)
nit: would FAIL() be more idiomatic in this case (given it's a test)?


PS6, Line 159: WARNING
INFO?  But I might be missing something here.


PS6, Line 197: 2
nit: maybe cluster_->num_tablet_servers() ?


PS6, Line 228: voter_thread
What happens to the thread if the test fails in one of the ASSERT_OK() below?  Consider joining the thread in that case as well (it might be SIGSEGV or something like that otherwise).


PS6, Line 232: SleepFor(MonoDelta::FromMilliseconds(500));
Could you add some comments explaining why this delay is necessary?  What would happen if this delay were 10 times less?


PS6, Line 240: SleepFor(MonoDelta::FromMilliseconds(500));
ditto


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] WIP: KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#2).

Change subject: WIP: KUDU-871. Support tombstoned voting
......................................................................

WIP: KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add new test TombstonedVotingITest
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.

TODO:

* Add a negative test for tombstoned voting.
* Add a targeted concurrency test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/delete_table-itest.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver-path-handlers.cc
13 files changed, 248 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 4:

(21 comments)

did a first pass, probably will have more comments after a second pass but figured I'd send these along

http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 188: Status RaftConsensus::Init() {
I think the DCHECK on state here was useful, even if it needs to be broadened a bit now


PS4, Line 1493: tombstone_last_logged_opid ? *tombstone_last_logged_opid :
              :                                                                     GetLatestOpIdFromLog();
could we simplify by always using 'max()' of the entry from the metadata and the entry from our log (if we have a log)


Line 1497:     LOG_WITH_PREFIX_UNLOCKED(INFO) << "voting while tombstoned";
any way to improve this with a bit more info?


Line 2004:       state_ = new_state;
given this is in every case, move it down below the switch?


PS4, Line 2515: (!tombstone_last_logged_opid
not 100% following why this portion of the condition is necessary


PS4, Line 2520: state_ == kShutdown
should we instead be checking the expected state (ie kStopped)? aren't there some other states we might be in where we don't want to vote?


Line 2634:   return kMinimumTerm;
under what circumstances do we hit this case? It seems like cmeta_ should be set in Init() and we shouldn't ever be calling functions on RaftConsensus until after a successful Init() (see other comment about changing to a factory function so it's more clear that we never expose a constructed-by-not-initialized instance)


Line 2685:   if (cmeta_) {
same


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/integration-tests/cluster_itest_util.h
File src/kudu/integration-tests/cluster_itest_util.h:

PS4, Line 239:                    boost::optional<bool> ignore_live_leader,
             :                    boost::optional<bool> is_pre_election,
what's the point of making these optional? don't they have well-defined defaults (false) anyway?


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

Line 135: // Tablet states are sent in TabletReports and kept in TabletReplica.
what effect will these changes have if there is a rolling upgrade with an old master and new tablet servers? do we need to force an incompatibility so that people upgrade masters first? or is it safe to have these show up as 'UNKNOWN' on an old-version master?


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

Line 133:   set_state(INITIALIZED);
maybe moot based on the suggestion of a factory method, but if a method like this remains, the state change should wait until the method is successful


Line 272:     // Release mem tracker resources.
is this comment stale?


PS4, Line 407:   LOG_WITH_PREFIX(INFO) << "Fetched status for addr " << &state_
             :                         << ": " << SecureDebugString(*status_pb_out);
leftover debug print?


Line 427:   CHECK_EQ(INITIALIZED, state_);
redundant with the DCHECK in set_state


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS4, Line 75:   TabletReplica(scoped_refptr<TabletMetadata> meta,
            :                 scoped_refptr<consensus::ConsensusMetadataManager> cmeta_manager,
            :                 consensus::RaftPeerPB local_peer_pb,
            :                 ThreadPool* apply_pool,
            :                 Callback<void(const std::string& reason)> mark_dirty_clbk);
            : 
            :   // Initializes RaftConsensus.
            :   Status Init(ThreadPool* raft_pool);
instead of having this split "construct and then init" how about a factory method which would have a similar signature to the original constructor?

eg:

static Status Create(scoped_refptr<TabletMetadata> meta,
                     ...,
                     scoped_refptr<TabletReplica>* replica);

which does the construction and Init()?

That way there is less externally-visible lifecycle for callers to think about (and maybe could reduce one of the enum values too)?


PS4, Line 96: tombstoned 
voting in general, right? ie what would happen if you called Stop() but the data wasn't tombstoned, and you asked to vote?


PS4, Line 169:   // If any peers in the consensus configuration lack permanent uuids, get them via an
             :   // RPC call and update.
             :   // TODO: move this to raft_consensus.h.
             :   Status UpdatePermanentUuids();
can you remove this while you're here? seems to be dead


Line 277:   // Wait until the TabletReplica is fully in STOPPED state.
or SHUTDOWN


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

Line 947:   tablet::TabletDataState data_state = replica->tablet_metadata()->tablet_data_state();
aren't all these state checks racy? after you grab this here, there's nothing preventing the tablet from entering DELETED or COPYING state? It seems like at this layer it should just be conditioned on whether we can find a Consensus object, and then the actual RaftConsensus::RequestVote() call should be responsible for ensuring it's in an appropriate state?


PS4, Line 966:     if (OpIdBiggerThan(tmp_last_logged_opid, MinimumOpId())) {
when would this equal MinimumOpId? if it's never been tombstoned?

Would it be safe to just pass this in always as a non-optional, and then in RequestVote always just take max(log_->GetLastLoggedOpId(), tombstone_last_logged_opid())?


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tserver/tserver-path-handlers.cc
File src/kudu/tserver/tserver-path-handlers.cc:

Line 460:   if (!consensus || !consensus->IsRunning()) {
instead of this, can you make Consensus::DumpStatusHtml dump something reasonable in the case that it is STOPPED?

Otherwise it seems like this is a racy check since you have nothing ensuring that the state doesn't change between here and L464


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 13:

The new tests still look good. 100% reliable after 100 loops w/ TSAN + 8x stress:

tombstoned_voting-itest: http://dist-test.cloudera.org/job?job_id=mpercy.1503540656.2157
tombstoned_voting-stress-test: http://dist-test.cloudera.org/job?job_id=mpercy.1503540557.1392

If other tests get flaky, they'll show up on the flaky test dashboard and I'll fix them.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS6, Line 272:  // Wait until the TabletReplica is fully in STOPPED or SHUTDOWN state.
does this have some requirements (does the replica need to be in "stopping" state or somesuch?)


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS6, Line 525: tombtone
not from this patch, but typo


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] WIP: KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#3).

Change subject: WIP: KUDU-871. Support tombstoned voting
......................................................................

WIP: KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add new test TombstonedVotingITest
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.

TODO:

* Add a negative test for tombstoned voting.
* Add a targeted concurrency test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/delete_table-itest.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver-path-handlers.cc
13 files changed, 248 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 10:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

PS10, Line 1452: boost::
nit: "using boost::optional" is already specified


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

PS10, Line 351:    |                    ^
              :   //                    |                    |
              :   //                    +--------------------+
nit: prefer the slimmer version at L424
Or have this in one place and have SetStateUnlocked() refer here?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

PS10, Line 3044: back come 
nit: come back


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

PS10, Line 46: using std::string;
             : using std::vector;
nit: include these?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

PS10, Line 44: using std::atomic;
             : using std::string;
             : using std::thread;
             : using std::unique_lock;
             : using std::vector;
nit: includes?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/master/sys_catalog.cc
File src/kudu/master/sys_catalog.cc:

PS10, Line 347: // TODO(mpercy): Set to failed if Init() fails?
Does this still apply?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

PS10, Line 357: Has no meaning for non-tombstoned tablets
Is this to say that this will be boost::none for non-tombstoned tablets?


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS10, Line 199: SetError
nit: sort of brought this up on Slack, I think the naming for this could be a bit clearer. SetError() makes it seem like a simple setter, but it's not. Maybe SetErrorAndShutdown()? Or SetFailed() and keep the FAILED state as we discussed.

The latter has the added benefit of being an external indicator of failure, rather than having each external accessor (e.g. web UI, ksck, etc.) get the error_ member.


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

PS10, Line 298: RETURN_NOT_OK(WriteConsensusMetadata());
This change isn't noted in the commit message. Mind documenting it, or at least adding a comment explaining?


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#13).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Persist ConsensusMetadata before returning from
  TabletCopyClient::Start() because we need cmeta to Init()
  RaftConsensus, which happens when registering the replica in
  TSTabletManager.
* TSTabletManager::DeleteTablet() should not consider FAILED == deleted,
  since we no longer destroy RaftConsensus when tombstoning a replica.
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
27 files changed, 1,439 insertions(+), 296 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Todd Lipcon, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#7).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.
* Get rid of tablet::FAILED state, instead go to SHUTDOWN with an error.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/delete_tablet-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
33 files changed, 1,397 insertions(+), 345 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 13:

I did an analysis of failed raft_consensus-itest failures before and after this patch. There was no change in the trend, except that RaftConsensusITest.TestAutoCreateReplica stopped being flaky. :)

I looked into it, and I think it was just coincidence. That test doesn't even have failure detection enabled, plus it doesn't tombstone any tablets.

I think this patch is ready to ship.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 10:

I think the failure was caused by already-flaky java tests but I am investigating. I'd like to figure out how to run the java tests on dist-test to do a quick comparison on failure rate w/ master.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Edward Fancher (Code Review)" <ge...@cloudera.org>.
Edward Fancher has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

Line 48: 
Replicated masters can have tombstoned replicas too, is that right? Does the tombstone voting apply to them? If so, would it be worthwhile to also have a test that includes tombstoned voting for a master tablet replica?


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#11).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Persist ConsensusMetadata before returning from
  TabletCopyClient::Start() because we need cmeta to Init()
  RaftConsensus, which happens when registering the replica in
  TSTabletManager.
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
27 files changed, 1,437 insertions(+), 294 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#6).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Protect TabletMetadata::tombstone_last_logged_opid_ with a lock.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
27 files changed, 1,145 insertions(+), 239 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 13:

Well, actually, let me hit raft_consensus-itest a little harder. It's possible that I made it flaky and hadn't previously noticed.

-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#9).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.
* Get rid of tablet::FAILED state, instead go to SHUTDOWN with an error.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/delete_tablet-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
M src/kudu/util/make_shared.h
33 files changed, 1,438 insertions(+), 341 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 6:

(7 comments)

leaving for luch, more comments forthcoming

http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

PS6, Line 1519:     LOG_WITH_PREFIX_UNLOCKED(INFO) << "voting while tombstoned based on last-logged opid "
              :                                    << local_last_logged_opid;
move this to the else block above and thus avoid the extra if?


PS6, Line 2538: const optional<OpId>& tombstone_last_logged_opid
whats the point of passing an opid here if you're only checking for its presence?
also you mention on LOC 1508 "in which case we are guaranteed that
  // 'tombstone_last_logged_opid' is set by CheckSafeToVoteUnlocked()"

but that is not happening or am I missing something?


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

PS6, Line 124: static Status Create(ConsensusOptions options,
             :                        RaftPeerPB local_peer_pb,
             :                        scoped_refptr<ConsensusMetadataManager> cmeta_manager,
             :                        ThreadPool* raft_pool,
             :                        std::shared_ptr<RaftConsensus>* consensus_out);
didn't you delete this a while back?


PS6, Line 271: term
nit: current_term() would be more explicit


PS6, Line 292: Tombstoned voting
not super clear what "tombstoned voting" is maybe explain it somewhere?


PS6, Line 354: RaftConsensus has been stopped.
maybe explain what this means in the context of the other states? something like: "RaftConsensus has been stopped. No more transactions or commits are accepted but will still reply to election requests if tombstoned."


http://gerrit.cloudera.org:8080/#/c/6960/6/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

PS6, Line 195:   // Ask TS1 for a vote that should be denied (old last-logged opid).
             :   s = itest::RequestVote(ts1_ets, tablet_id, "B", current_term, old_opid,
             :                          /*ignore_live_leader=*/ true, /*is_pre_election=*/ false, kTimeout);
can you add a test for same candidate same term but with an opid that is greater that the tombstoned replica's op_id?


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 4:

(32 comments)

http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 188: Status RaftConsensus::Init() {
> I think the DCHECK on state here was useful, even if it needs to be broaden
Done


PS4, Line 1493: tombstone_last_logged_opid ? *tombstone_last_logged_opid :
              :                                                                     GetLatestOpIdFromLog();
> could we simplify by always using 'max()' of the entry from the metadata an
I don't think so; see my response to the same question in tablet_service.cc

However, I do think we should use the latest opid in the log if Consensus is currently running, since it is guaranteed to be correct in that case.


Line 1497:     LOG_WITH_PREFIX_UNLOCKED(INFO) << "voting while tombstoned";
> any way to improve this with a bit more info?
I'll add the tombstone_last_logged_opid here but I can't think of a reason to log more than that because we already log the details whenever we vote.


Line 2004:       state_ = new_state;
> given this is in every case, move it down below the switch?
Done


Line 2513: Status RaftConsensus::CheckSafeToVoteUnlocked(optional<OpId> tombstone_last_logged_opid) const {
> warning: the parameter 'tombstone_last_logged_opid' is copied for each invo
Done


PS4, Line 2515: (!tombstone_last_logged_opid
> not 100% following why this portion of the condition is necessary
if we are voting while tombstoned then we don't have to be running


PS4, Line 2520: state_ == kShutdown
> should we instead be checking the expected state (ie kStopped)? aren't ther
We can tombstoned vote in kInit and kStopped state, even in kRunning actually due to a race between the check in TabletService and execution in RaftConsensus. It's not possible to expose RaftConsensus in kNew state due to code in TabletReplica however it would be better to encapsulate that guarantee within this class so I'll add a Create() factory method.


Line 2634:   return kMinimumTerm;
> under what circumstances do we hit this case? It seems like cmeta_ should b
Good catch; this was left over from a previous attempt on this patch and is not possible with the current rev. However I agree with you that a factory method for RaftConsensus would be preferable here.


Line 2685:   if (cmeta_) {
> same
removed


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/integration-tests/cluster_itest_util.h
File src/kudu/integration-tests/cluster_itest_util.h:

Line 237:                    const int64_t candidate_term,
> warning: parameter 'candidate_term' is const-qualified in the function decl
Done


PS4, Line 239:                    boost::optional<bool> ignore_live_leader,
             :                    boost::optional<bool> is_pre_election,
> what's the point of making these optional? don't they have well-defined def
I thought it was nice to pass them as optional so that if you don't specify them you get the default behavior as specified in the proto file.


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

Line 38: using kudu::consensus::MakeOpId;
> warning: using decl 'MakeOpId' is unused [misc-unused-using-decls]
Done


Line 39: using kudu::consensus::LeaderStepDownResponsePB;
> warning: using decl 'LeaderStepDownResponsePB' is unused [misc-unused-using
Done


Line 41: using kudu::consensus::RaftConsensus;
> warning: using decl 'RaftConsensus' is unused [misc-unused-using-decls]
Done


Line 42: using kudu::consensus::RaftPeerPB;
> warning: using decl 'RaftPeerPB' is unused [misc-unused-using-decls]
Done


Line 47: using kudu::tablet::TabletStatePB;
> warning: using decl 'TabletStatePB' is unused [misc-unused-using-decls]
Done


Line 48: using kudu::tserver::TabletServerErrorPB;
> warning: using decl 'TabletServerErrorPB' is unused [misc-unused-using-decl
Done


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

Line 135: // Tablet states are sent in TabletReports and kept in TabletReplica.
> what effect will these changes have if there is a rolling upgrade with an o
Based on a grep through the code, the master only knows about tablet::RUNNING and tablet::FAILED so it shouldn't even notice.


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

Line 216:   void SetPreFlushCallback(StatusClosure callback) { pre_flush_callback_ = callback; }
> warning: parameter 'callback' is passed by value and only copied once; cons
Done


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

Line 133:   set_state(INITIALIZED);
> maybe moot based on the suggestion of a factory method, but if a method lik
Done


Line 247:   // TODO: KUDU-183: Keep track of the pending tasks and send an "abort" message.
> warning: missing username/bug in TODO [google-readability-todo]
Done


Line 272:     // Release mem tracker resources.
> is this comment stale?
Maybe stale? Either way it seems pretty irrelevant. I'll remove it.


PS4, Line 407:   LOG_WITH_PREFIX(INFO) << "Fetched status for addr " << &state_
             :                         << ": " << SecureDebugString(*status_pb_out);
> leftover debug print?
yep. removed


Line 427:   CHECK_EQ(INITIALIZED, state_);
> redundant with the DCHECK in set_state
Done


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS4, Line 75:   TabletReplica(scoped_refptr<TabletMetadata> meta,
            :                 scoped_refptr<consensus::ConsensusMetadataManager> cmeta_manager,
            :                 consensus::RaftPeerPB local_peer_pb,
            :                 ThreadPool* apply_pool,
            :                 Callback<void(const std::string& reason)> mark_dirty_clbk);
            : 
            :   // Initializes RaftConsensus.
            :   Status Init(ThreadPool* raft_pool);
> instead of having this split "construct and then init" how about a factory 
The problem with a monolithic construction and initialization of a TabletReplica is that it prevents replicas in certain failure states from being visible to the master or the web UI.

For example, replicas that fail to Init() at startup time due to a missing or corrupted cmeta should still be reported on the web UI and to the master -- at least, I'm pretty sure that's what we want. I suppose an alternative would be to automatically tombstone a tablet that failed to start up but that seems awfully aggressive, especially in the case where there is only 1 replica per tablet.


PS4, Line 96: tombstoned 
> voting in general, right? ie what would happen if you called Stop() but the
The log would be closed so we wouldn't be able to vote on a stopped-but-not-tombstoned replica.


PS4, Line 169:   // If any peers in the consensus configuration lack permanent uuids, get them via an
             :   // RPC call and update.
             :   // TODO: move this to raft_consensus.h.
             :   Status UpdatePermanentUuids();
> can you remove this while you're here? seems to be dead
Done


Line 277:   // Wait until the TabletReplica is fully in STOPPED state.
> or SHUTDOWN
Done


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

Line 136: using kudu::tablet::TabletStatusPB;
> warning: using decl 'TabletStatusPB' is unused [misc-unused-using-decls]
Done


Line 947:   tablet::TabletDataState data_state = replica->tablet_metadata()->tablet_data_state();
> aren't all these state checks racy? after you grab this here, there's nothi
Yeah, it's racy, but it's safe because we're checking after we have a reference to the replica, and the replica can only have one consensus object, which will be shut down if the tablet starts copying again, for example.

I'll add a block comment here to explain; LMK what you think.

I plan to move the last-logged opid from TabletMetadata to ConsensusMetadata in a follow-up change.


PS4, Line 966:     if (OpIdBiggerThan(tmp_last_logged_opid, MinimumOpId())) {
> when would this equal MinimumOpId? if it's never been tombstoned?
> when would this equal MinimumOpId? if it's never been tombstoned?

That's right.

> Would it be safe to just pass this in always as a non-optional, and then in RequestVote always just take max(log_->GetLastLoggedOpId(), tombstone_last_logged_opid())?

I don't think so. The case I am worried about is if a tablet fails to start up, so it doesn't have a log running, and then it gets tombstoned by the master. In that case we will not currently store the last-logged opid at DeleteTablet() time. So if we don't know what our last-logged opid was (even though we did actually have one) then I don't think it's safe for us to simply start voting based on a last-logged opid of (0,0).

Additionally, just using max() seems wrong in case there was a log truncation involved. We want to use our latest opid state.


http://gerrit.cloudera.org:8080/#/c/6960/4/src/kudu/tserver/tserver-path-handlers.cc
File src/kudu/tserver/tserver-path-handlers.cc:

Line 460:   if (!consensus || !consensus->IsRunning()) {
> instead of this, can you make Consensus::DumpStatusHtml dump something reas
good catch


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6960

to look at the new patch set (#4).

Change subject: KUDU-871. Support tombstoned voting
......................................................................

KUDU-871. Support tombstoned voting

This patch makes it possible for tombstoned tablet replicas to vote in
Raft elections.

Changes:

* Add Stop() method to TabletReplica + Consensus lifecycle.
  * Includes new STOPPED state.
  * Tombstoning a replica should call Stop().
  * Deleting a replica should call Shutdown().
* Add positive and negative tests for tombstoned voting.
* Add a stress test that induces lots of tombstoned voting
  while running TabletCopy, TabletBootstrap, and DeleteTablet.
* Protect TabletMetadata::tombstone_last_logged_opid_ with a lock.
* Fix DeleteTableITest.TestMergeConsensusMetadata after tombstoned
  voting changed its assumption that tombstoned tablets would not vote.
* Fix several tests that expected tombstoned tablets to be SHUTDOWN when
  now they are STOPPED.

Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/delete_table-itest.cc
M src/kudu/integration-tests/external_mini_cluster-itest-base.cc
A src/kudu/integration-tests/tombstoned_voting-itest.cc
A src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-ts-cli-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver-path-handlers.cc
27 files changed, 1,091 insertions(+), 216 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/60/6960/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 10:

(29 comments)

http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

PS10, Line 1452: boost::
> nit: "using boost::optional" is already specified
Done


PS10, Line 1494: local_last_logged_opid = queue_->GetLastOpIdInLog();
> What if tombstone_last_logged_opid was provided as well?  Is it some some o
It's OK. I added a comment to explain.


Line 1856:     CHECK_EQ(kStopping, state_) << State_Name(state_);
> this is already CHECKed in SetStateUnlocked, no?
True, removed


Line 1863:       cmeta_->set_leader_uuid("");
> ClearLeaderUnlocked() ?
Done


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/consensus/raft_consensus.h
File src/kudu/consensus/raft_consensus.h:

PS10, Line 351:    |                    ^
              :   //                    |                    |
              :   //                    +--------------------+
> nit: prefer the slimmer version at L424
Done


PS10, Line 372: // Raft is in the process of stopping and will not accept writes.
              :     kStopping,
> nit: is it possible to request a vote if in this state?  It seems it is, bu
yes; done


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/cluster_itest_util.h
File src/kudu/integration-tests/cluster_itest_util.h:

PS10, Line 236: // Request the given replica to vote.
> nit: could you document at least first 5 parameters of this method?
It's just a thin wrapper around an RPC call. I added a comment referring to its definition.


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/delete_tablet-itest.cc
File src/kudu/integration-tests/delete_tablet-itest.cc:

Line 95:   // Tablet bootstrap failure should result in a SHUTDOWN tablet with an error
> per discussion on slack, let's revert this part of this patch since it isn'
Done


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

PS10, Line 3044: back come 
> nit: come back
removed this change per convo w/ Todd on slack and here in the comments


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

Line 18: #include "kudu/consensus/metadata.pb.h"
> nit: from IWYU
Done, thank you


PS10, Line 46: using std::string;
             : using std::vector;
> nit: include these?
Done


PS10, Line 167: TS1
> nit: the tablet?  Or it's possible to tombstone a tablet server now?
added a comment to explain


PS10, Line 172: TServerDetails* ts1_ets
> nit: consider moving this closer to the call site where it's used.
Done


Line 173:   scoped_refptr<TabletReplica> replica;
> nit: does it make sense to verify the scenario below works without errors e
Makes sense. Done.


PS10, Line 181: "A"
> Could you add a comment explaining that the actual candidate UUID is not cr
Done


PS10, Line 249: ts0_replica->consensus()->StepDown(&resp)
> nit: should it be some selective error handling based on the code set in th
I'm not sure why else it would fail. This is a pretty controlled environment. If a disk fails, all the tests will fail on that Jenkins run so we don't really care about that.


PS10, Line 253: ts0
> nit: 'TS0' since it has been referred as that already.
Is your comment parser case-sensitive? :) Done


PS10, Line 256: FOLLOWER
> Could it change to something like 'LEARNER' in the middle and bring some fl
Nope.


PS10, Line 263: error (nee failed
> per other comment, back out this change
Done


PS10, Line 294: Shut each TS
> down
Done


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

Line 18: #include <atomic>
> nit: from IWYU report:
thanks!


PS10, Line 44: using std::atomic;
             : using std::string;
             : using std::thread;
             : using std::unique_lock;
             : using std::vector;
> nit: includes?
Done


PS10, Line 57:       : num_workers_(1),
> this test seems to be implemented for multiple workers but only has one wor
I initially wanted >1 worker but had trouble making it deterministic enough to assert on things I wanted to assert on, so I dropped it to 1 worker. When I tried to simplify the code for a single worker it didn't really seem better, so I kept the more general implementation.


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 2489:     LOG(WARNING) << "Tablet " << tablet->ToString() << " has failed on TS "
> wouldn't it be natural to have tablet::SHUTDOWN check here instead?  Or it'
I thought it was a little unwise to have a crash based on remote state reporting in, but I reverted this change per other comments related to FAILED, and since it's only a DCHECK and not a CHECK it's generally not a big deal to leave it in so I have left it like it was before.


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/master/sys_catalog.cc
File src/kudu/master/sys_catalog.cc:

PS10, Line 347: // TODO(mpercy): Set to failed if Init() fails?
> Does this still apply?
Without automatic re-replication of the master the utility of doing much of anything is limited. Anyway, thanks for the catch, I'll do that.


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

PS10, Line 357: Has no meaning for non-tombstoned tablets
> Is this to say that this will be boost::none for non-tombstoned tablets?
It will be boost::none for TABLET_DATA_READY tablets.


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 304: LOG(DFATAL)
> The rest of the CHECKs would work for non-debug builds as well.  What is th
changed to FATAL


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

PS10, Line 199: SetError
> nit: sort of brought this up on Slack, I think the naming for this could be
reverted per other convo


http://gerrit.cloudera.org:8080/#/c/6960/10/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

PS10, Line 298: RETURN_NOT_OK(WriteConsensusMetadata());
> This change isn't noted in the commit message. Mind documenting it, or at l
Added a comment here and a bullet point to the commit message.


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-871. Support tombstoned voting

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-871. Support tombstoned voting
......................................................................


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6960/12/src/kudu/integration-tests/tombstoned_voting-itest.cc
File src/kudu/integration-tests/tombstoned_voting-itest.cc:

Line 45: #include "kudu/tserver/tserver.pb.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


http://gerrit.cloudera.org:8080/#/c/6960/12/src/kudu/integration-tests/tombstoned_voting-stress-test.cc
File src/kudu/integration-tests/tombstoned_voting-stress-test.cc:

Line 39: #include "kudu/integration-tests/external_mini_cluster.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia19d75b185299443b27f41e468bbae20065e7570
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Edward Fancher <ef...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes