Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2018/10/01 18:34:18 UTC

[kudu-CR] KUDU-2245 Graceful leadership transfer

Hello Fengling Wang, Tidy Bot, Mike Percy, Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11251

to look at the new patch set (#13).

Change subject: KUDU-2245 Graceful leadership transfer
......................................................................

KUDU-2245 Graceful leadership transfer

This patch implements graceful leadership transfer, as described in the
original Raft thesis. It has the following steps:

1. An admin client sends a request to the tablet leader for it to
   transfer leadership. The client can indicate a specific voter that it
   wants to become the leader, or it can allow the current leader to
   choose its successor.
2. The leader receives the request and begins a leader transfer period.
   During a leader transfer period, the leader does not accept writes or
   config change requests. This allows followers to catch up to the
   leader. A background timer expires the transfer period after one
   election timeout, since clients should be able to ride over
   interruptions in service lasting at least that long.
3. During the transfer period, the leader continues to update peers.
   When it receives a response from a peer, it checks whether that peer
   is a voter and is fully caught up to the leader's log. If so, and if
   the peer is the designated successor (when one was specified), the
   leader signals the peer to start an election, which it should win. If
   no eligible successor appears, the transfer period expires and the
   leader resumes normal operation.
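
The leader-side checks in steps 2 and 3 can be sketched roughly as
follows. This is a simplified illustration and not the code in this
patch: LeaderTransferSketch, PeerStatus, and all member names are
hypothetical, time is passed in explicitly instead of using a
background timer, and an empty successor string stands in for "let the
leader choose".

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical, simplified view of what the leader knows about a peer.
struct PeerStatus {
  std::string uuid;
  bool is_voter;
  int64_t last_received_index;  // last log index the peer has acknowledged
};

// Hypothetical model of the leader side of a transfer period.
class LeaderTransferSketch {
 public:
  LeaderTransferSketch(int64_t leader_last_index, int64_t election_timeout_ms)
      : leader_last_index_(leader_last_index),
        election_timeout_ms_(election_timeout_ms) {}

  // Step 2: start a transfer period lasting one election timeout.
  // An empty 'successor' means any caught-up voter is acceptable.
  void BeginTransfer(int64_t now_ms, std::string successor) {
    in_transfer_ = true;
    deadline_ms_ = now_ms + election_timeout_ms_;
    successor_ = std::move(successor);
  }

  // Writes and config changes are rejected while a transfer is active.
  bool AcceptsWrites(int64_t now_ms) {
    ExpireIfDue(now_ms);
    return !in_transfer_;
  }

  // Step 3: called when a peer responds. Returns true if the leader
  // should signal this peer to start an election.
  bool ShouldSignalElection(const PeerStatus& peer, int64_t now_ms) {
    ExpireIfDue(now_ms);
    if (!in_transfer_) return false;
    if (!peer.is_voter) return false;
    if (peer.last_received_index < leader_last_index_) return false;
    if (!successor_.empty() && successor_ != peer.uuid) return false;
    return true;
  }

 private:
  // Stand-in for the background timer: end the period once it is overdue.
  void ExpireIfDue(int64_t now_ms) {
    if (in_transfer_ && now_ms >= deadline_ms_) in_transfer_ = false;
  }

  int64_t leader_last_index_;
  int64_t election_timeout_ms_;
  bool in_transfer_ = false;
  int64_t deadline_ms_ = 0;
  std::string successor_;
};
```

In this sketch, a leader built with an election timeout of 1500 ms that
calls BeginTransfer(0, "B") rejects writes until time 1500, and
ShouldSignalElection() returns true only for a voter "B" whose
last_received_index has reached the leader's last index.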

This is an improvement over the current leader step down method, which
causes the leader to simply relinquish leadership and snooze its
election timer for an extra-long period, so that another voter will
likely become leader. Leadership transfer should usually be much faster,
and it allows the client to select the new leader from among the current
voters. However, note that it does not provide strictly better
guarantees: it is still possible that leadership will not be
transferred.

I ran TestRepeatLeaderStepDown and TestGracefulLeaderStepDown 1000 times
and 200 times each, in debug and TSAN modes, with 4 stress threads, and
saw no failures.

Still WIP because I want to:
* Run some dist-test loops of the rebalancer tests, which now use
  graceful leadership transfer.
* Add a test or two for bad cases where the leadership transfer period
  should expire.
* Quantify how much faster leadership transfer is than abrupt stepdown,
  at least in a lab environment.

Change-Id: Ic97343af9eb349556424c999799ed5e2941f0083
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/consensus.proto
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_peers.h
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/peer_manager.cc
M src/kudu/consensus/peer_manager.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_tablet.cc
M src/kudu/tools/tool_replica_util.cc
M src/kudu/tools/tool_replica_util.h
M src/kudu/tserver/tablet_service.cc
17 files changed, 816 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/11251/13
-- 
To view, visit http://gerrit.cloudera.org:8080/11251
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic97343af9eb349556424c999799ed5e2941f0083
Gerrit-Change-Number: 11251
Gerrit-PatchSet: 13
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Fengling Wang <fw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>