Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2018/07/18 23:37:42 UTC
[kudu-CR](branch-1.5.x) KUDU-2149: avoid election stacking by restoring failure monitor semantics
Adar Dembo has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10987 )
Change subject: KUDU-2149: avoid election stacking by restoring failure monitor semantics
......................................................................
KUDU-2149: avoid election stacking by restoring failure monitor semantics
Prior to commit 21b0f3d, the dedicated failure monitor thread invoked
RaftConsensus::StartElection() synchronously, thus preventing it from
surfacing additional failures during that time. This patch attempts to
restore these semantics by short-circuiting and ignoring any failures
detected while a Raft thread is in StartElection().
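To illustrate the short-circuit described above, here is a minimal, self-contained sketch. It is not the code from this patch: the class ElectionGuardSketch, the method ReportFailureDetected(), and the counters are hypothetical names made up for illustration, and the real change in raft_consensus.cc may be structured quite differently. The sketch assumes a single atomic flag is enough to tell whether some thread is currently inside the (possibly slow) election path.

```cpp
#include <atomic>
#include <functional>

// Hypothetical stand-in for a consensus instance; NOT Kudu's RaftConsensus.
class ElectionGuardSketch {
 public:
  // Called whenever the failure detector fires. Returns true if this call
  // ran the election callback, false if the failure was ignored because
  // another call is still inside the election path.
  bool ReportFailureDetected(const std::function<void()>& start_election) {
    bool expected = false;
    // Only one caller at a time may flip the flag from false to true.
    if (!election_in_progress_.compare_exchange_strong(expected, true)) {
      ++ignored_failures_;  // Short-circuit: drop this failure.
      return false;
    }
    // Clear the flag on every exit path, including exceptions.
    struct Reset {
      std::atomic<bool>& flag;
      ~Reset() { flag.store(false); }
    } reset{election_in_progress_};

    ++elections_started_;
    start_election();  // May be slow, e.g. while flushing metadata to disk.
    return true;
  }

  int elections_started() const { return elections_started_.load(); }
  int ignored_failures() const { return ignored_failures_.load(); }

 private:
  std::atomic<bool> election_in_progress_{false};
  std::atomic<int> elections_started_{0};
  std::atomic<int> ignored_failures_{0};
};
```

In this sketch, a failure reported while the callback is still running is counted and dropped rather than queued, so a slow election cannot cause further elections to pile up behind it. Once the callback returns, the next detected failure starts an election as usual.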
This is a super targeted fix geared towards a point release; a more correct
fix would be to completely disable failure detection while an election is
running, but that'll require more work.
Originally I had written a test that injects latency into
ConsensusMetadata::Flush(), toggles the fix, and compares the number of vote
request RPCs. I couldn't get it to be totally robust, and the "feature flag"
used in the toggle is likely to become obsolete quickly. So in the end I
decided to drop the test from the patch.
Change-Id: Ifeaf99ce57f7d5cd01a6c786c178567a98438ced
Reviewed-on: http://gerrit.cloudera.org:8080/8107
Reviewed-by: Mike Percy <mp...@apache.org>
Tested-by: Kudu Jenkins
(cherry picked from commit edd41cb40fbad206e2c356983baba8fbc57199b5)
---
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
2 files changed, 23 insertions(+), 3 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/10987/1
--
To view, visit http://gerrit.cloudera.org:8080/10987
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifeaf99ce57f7d5cd01a6c786c178567a98438ced
Gerrit-Change-Number: 10987
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
[kudu-CR](branch-1.5.x) KUDU-2149: avoid election stacking by restoring failure monitor semantics
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/10987 )
Change subject: KUDU-2149: avoid election stacking by restoring failure monitor semantics
......................................................................
Removed reviewer Kudu Jenkins with the following votes:
* Verified-1 by Kudu Jenkins (120)
--
Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Ifeaf99ce57f7d5cd01a6c786c178567a98438ced
Gerrit-Change-Number: 10987
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
[kudu-CR](branch-1.5.x) KUDU-2149: avoid election stacking by restoring failure monitor semantics
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10987 )
Change subject: KUDU-2149: avoid election stacking by restoring failure monitor semantics
......................................................................
KUDU-2149: avoid election stacking by restoring failure monitor semantics
Prior to commit 21b0f3d, the dedicated failure monitor thread invoked
RaftConsensus::StartElection() synchronously, thus preventing it from
surfacing additional failures during that time. This patch attempts to
restore these semantics by short-circuiting and ignoring any failures
detected while a Raft thread is in StartElection().
This is a super targeted fix geared towards a point release; a more correct
fix would be to completely disable failure detection while an election is
running, but that'll require more work.
Originally I had written a test that injects latency into
ConsensusMetadata::Flush(), toggles the fix, and compares the number of vote
request RPCs. I couldn't get it to be totally robust, and the "feature flag"
used in the toggle is likely to become obsolete quickly. So in the end I
decided to drop the test from the patch.
Change-Id: Ifeaf99ce57f7d5cd01a6c786c178567a98438ced
Reviewed-on: http://gerrit.cloudera.org:8080/8107
Reviewed-by: Mike Percy <mp...@apache.org>
Tested-by: Kudu Jenkins
(cherry picked from commit edd41cb40fbad206e2c356983baba8fbc57199b5)
Reviewed-on: http://gerrit.cloudera.org:8080/10987
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
2 files changed, 23 insertions(+), 3 deletions(-)
Approvals:
Adar Dembo: Looks good to me, approved; Verified
--
Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifeaf99ce57f7d5cd01a6c786c178567a98438ced
Gerrit-Change-Number: 10987
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
[kudu-CR](branch-1.5.x) KUDU-2149: avoid election stacking by restoring failure monitor semantics
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/10987 )
Change subject: KUDU-2149: avoid election stacking by restoring failure monitor semantics
......................................................................
Patch Set 1: Verified+1 Code-Review+2
Overriding Jenkins: the Python build failed due to a versioning issue, but the C++ tests all passed.
--
Gerrit-Project: kudu
Gerrit-Branch: branch-1.5.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifeaf99ce57f7d5cd01a6c786c178567a98438ced
Gerrit-Change-Number: 10987
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Thu, 19 Jul 2018 00:28:45 +0000
Gerrit-HasComments: No