You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2019/01/09 21:07:54 UTC

[kudu-CR] Reduce election-related logging

Will Berkeley has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12200


Change subject: Reduce election-related logging
......................................................................

Reduce election-related logging

Frequent simultaneous pre-elections on lots of tablets are a common
symptom of underlying problems on Kudu clusters. This results in a ton
of log spam because a [pre-]election results in multiple log messages.

This patch makes a few conservative changes to reduce the amount of
election-related logging, especially on candidates.

1. No log message per successful VoteRequest response. The candidate
   previously logged a message for every granted or denied vote from a
   voter, e.g.

     I1221 20:16:10.479441  1852 leader_election.cc:387] T cdd0e560bc5c451c978e38c0d7620e52 P dd658299c5194f4b88ae408cd3130103 [CANDIDATE]: Term 1 pre-election: Vote granted by peer 1a228ec03dbc4bb79ef66cd036a99bca (127.1.92.67:38265)

   This seems like overkill. Instead, these messages have been moved to
   VLOG(1) and the candidate will print a more detailed summary of the
   election once it is decided:

      I0109 12:39:42.203627 158707712 leader_election.cc:304] T 00000000000000000000000000000000 P 99b33389ebdc47b1ae18947dea49a5b7 [CANDIDATE]: Term 2 pre-election: Election decided. Result: candidate won. Election summary: received 2 responses out of 3 voters: 2 yes votes; 0 no votes. yes voters: 465a338e21d84ad1a4e03867eb608623, 99b33389ebdc47b1ae18947dea49a5b7; no voters:

   Compared to the current state:

     I1221 20:16:10.479472  1852 leader_election.cc:278] T cdd0e560bc5c451c978e38c0d7620e52 P dd658299c5194f4b88ae408cd3130103 [CANDIDATE]: Term 1 pre-election: Election decided. Result: candidate won.

Note that it is still possible to find a record of each vote in each
voter's INFO logs, and error condition logging has not been changed.

2. No log message per vote request. The candidate previously logged a
   message for each voter it was requesting a vote from. e.g.

     I1221 20:16:10.474704  1993 leader_election.cc:251] T cdd0e560bc5c451c978e38c0d7620e52 P 1a228ec03dbc4bb79ef66cd036a99bca [CANDIDATE]: Term 1 pre-election: Requesting pre-vote from peer dd658299c5194f4b88ae408cd3130103 (127.1.92.66:39093)
     I1221 20:16:10.475266  1993 leader_election.cc:251] T cdd0e560bc5c451c978e38c0d7620e52 P 1a228ec03dbc4bb79ef66cd036a99bca [CANDIDATE]: Term 1 pre-election: Requesting pre-vote from peer c624ece2977e486fbab958afb000f122 (127.1.92.65:45651)

   Instead, the candidate will summarize all the voters it sent requests
   to:

     I0109 12:39:42.206408 156561408 leader_election.cc:291] T 00000000000000000000000000000000 P 99b33389ebdc47b1ae18947dea49a5b7 [CANDIDATE]: Term 2 election: Requested vote from peers 465a338e21d84ad1a4e03867eb608623 (127.0.0.1:7053), 08b8de9c7a7c4ec5b72c9eb275d29cfd (127.0.0.1:7052)

3. No logging of failure detector sleeps. It seems like it might be
   useful to log when the FD sleeps, and why, e.g.

     I1221 20:16:12.399258  2803 raft_consensus.cc:2834] T cdd0e560bc5c451c978e38c0d7620e52 P c624ece2977e486fbab958afb000f122: Snoozing failure detection for 0.224s (vote granted)

   But frankly in practice it's never been relevant or useful. Since a
   message like this is produced multiple times on the candidate and
   possibly on the voters during an election, I moved it to VLOG(1)
   level to reduce the total amount of logging from elections.

Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
---
M src/kudu/consensus/leader_election.cc
M src/kudu/consensus/leader_election.h
M src/kudu/consensus/raft_consensus.cc
3 files changed, 46 insertions(+), 12 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/12200/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
Gerrit-Change-Number: 12200
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wd...@gmail.com>

[kudu-CR] Reduce election-related logging

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/12200 )

Change subject: Reduce election-related logging
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12200/1/src/kudu/consensus/leader_election.cc
File src/kudu/consensus/leader_election.cc:

http://gerrit.cloudera.org:8080/#/c/12200/1/src/kudu/consensus/leader_election.cc@306
PS1, Line 306: "Election decided. Resul
> nit: pull this out into a local variable and use it below as well?
Done


http://gerrit.cloudera.org:8080/#/c/12200/1/src/kudu/consensus/leader_election.cc@415
PS1, Line 415: DCHECK(state.response.vote_granted())
> VLOG_WITH_PREFIX + LogPrefix() will double up on prefixes
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
Gerrit-Change-Number: 12200
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:07:09 +0000
Gerrit-HasComments: Yes

[kudu-CR] Reduce election-related logging

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12200 )

Change subject: Reduce election-related logging
......................................................................

Reduce election-related logging

Frequent simultaneous pre-elections on lots of tablets are a common
symptom of underlying problems on Kudu clusters. This results in a ton
of log spam because a [pre-]election results in multiple log messages.

This patch makes a few conservative changes to reduce the amount of
election-related logging, especially on candidates.

1. No log message per successful VoteRequest response. The candidate
   previously logged a message for every granted or denied vote from a
   voter, e.g.

     I1221 20:16:10.479441  1852 leader_election.cc:387] T cdd0e560bc5c451c978e38c0d7620e52 P dd658299c5194f4b88ae408cd3130103 [CANDIDATE]: Term 1 pre-election: Vote granted by peer 1a228ec03dbc4bb79ef66cd036a99bca (127.1.92.67:38265)

   This seems like overkill. Instead, these messages have been moved to
   VLOG(1) and the candidate will print a more detailed summary of the
   election once it is decided:

      I0109 12:39:42.203627 158707712 leader_election.cc:304] T 00000000000000000000000000000000 P 99b33389ebdc47b1ae18947dea49a5b7 [CANDIDATE]: Term 2 pre-election: Election decided. Result: candidate won. Election summary: received 2 responses out of 3 voters: 2 yes votes; 0 no votes. yes voters: 465a338e21d84ad1a4e03867eb608623, 99b33389ebdc47b1ae18947dea49a5b7; no voters:

   Compared to the current state:

     I1221 20:16:10.479472  1852 leader_election.cc:278] T cdd0e560bc5c451c978e38c0d7620e52 P dd658299c5194f4b88ae408cd3130103 [CANDIDATE]: Term 1 pre-election: Election decided. Result: candidate won.

Note that it is still possible to find a record of each vote in each
voter's INFO logs, and error condition logging has not been changed.

2. No log message per vote request. The candidate previously logged a
   message for each voter it was requesting a vote from. e.g.

     I1221 20:16:10.474704  1993 leader_election.cc:251] T cdd0e560bc5c451c978e38c0d7620e52 P 1a228ec03dbc4bb79ef66cd036a99bca [CANDIDATE]: Term 1 pre-election: Requesting pre-vote from peer dd658299c5194f4b88ae408cd3130103 (127.1.92.66:39093)
     I1221 20:16:10.475266  1993 leader_election.cc:251] T cdd0e560bc5c451c978e38c0d7620e52 P 1a228ec03dbc4bb79ef66cd036a99bca [CANDIDATE]: Term 1 pre-election: Requesting pre-vote from peer c624ece2977e486fbab958afb000f122 (127.1.92.65:45651)

   Instead, the candidate will summarize all the voters it sent requests
   to:

     I0109 12:39:42.206408 156561408 leader_election.cc:291] T 00000000000000000000000000000000 P 99b33389ebdc47b1ae18947dea49a5b7 [CANDIDATE]: Term 2 election: Requested vote from peers 465a338e21d84ad1a4e03867eb608623 (127.0.0.1:7053), 08b8de9c7a7c4ec5b72c9eb275d29cfd (127.0.0.1:7052)

3. No logging of failure detector sleeps. It seems like it might be
   useful to log when the FD sleeps, and why, e.g.

     I1221 20:16:12.399258  2803 raft_consensus.cc:2834] T cdd0e560bc5c451c978e38c0d7620e52 P c624ece2977e486fbab958afb000f122: Snoozing failure detection for 0.224s (vote granted)

   But frankly in practice it's never been relevant or useful. Since a
   message like this is produced multiple times on the candidate and
   possibly on the voters during an election, I moved it to VLOG(1)
   level to reduce the total amount of logging from elections.

Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
Reviewed-on: http://gerrit.cloudera.org:8080/12200
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/consensus/leader_election.cc
M src/kudu/consensus/leader_election.h
M src/kudu/consensus/raft_consensus.cc
3 files changed, 48 insertions(+), 13 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/12200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
Gerrit-Change-Number: 12200
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] Reduce election-related logging

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Andrew Wong, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12200

to look at the new patch set (#2).

Change subject: Reduce election-related logging
......................................................................

Reduce election-related logging

Frequent simultaneous pre-elections on lots of tablets are a common
symptom of underlying problems on Kudu clusters. This results in a ton
of log spam because a [pre-]election results in multiple log messages.

This patch makes a few conservative changes to reduce the amount of
election-related logging, especially on candidates.

1. No log message per successful VoteRequest response. The candidate
   previously logged a message for every granted or denied vote from a
   voter, e.g.

     I1221 20:16:10.479441  1852 leader_election.cc:387] T cdd0e560bc5c451c978e38c0d7620e52 P dd658299c5194f4b88ae408cd3130103 [CANDIDATE]: Term 1 pre-election: Vote granted by peer 1a228ec03dbc4bb79ef66cd036a99bca (127.1.92.67:38265)

   This seems like overkill. Instead, these messages have been moved to
   VLOG(1) and the candidate will print a more detailed summary of the
   election once it is decided:

      I0109 12:39:42.203627 158707712 leader_election.cc:304] T 00000000000000000000000000000000 P 99b33389ebdc47b1ae18947dea49a5b7 [CANDIDATE]: Term 2 pre-election: Election decided. Result: candidate won. Election summary: received 2 responses out of 3 voters: 2 yes votes; 0 no votes. yes voters: 465a338e21d84ad1a4e03867eb608623, 99b33389ebdc47b1ae18947dea49a5b7; no voters:

   Compared to the current state:

     I1221 20:16:10.479472  1852 leader_election.cc:278] T cdd0e560bc5c451c978e38c0d7620e52 P dd658299c5194f4b88ae408cd3130103 [CANDIDATE]: Term 1 pre-election: Election decided. Result: candidate won.

Note that it is still possible to find a record of each vote in each
voter's INFO logs, and error condition logging has not been changed.

2. No log message per vote request. The candidate previously logged a
   message for each voter it was requesting a vote from. e.g.

     I1221 20:16:10.474704  1993 leader_election.cc:251] T cdd0e560bc5c451c978e38c0d7620e52 P 1a228ec03dbc4bb79ef66cd036a99bca [CANDIDATE]: Term 1 pre-election: Requesting pre-vote from peer dd658299c5194f4b88ae408cd3130103 (127.1.92.66:39093)
     I1221 20:16:10.475266  1993 leader_election.cc:251] T cdd0e560bc5c451c978e38c0d7620e52 P 1a228ec03dbc4bb79ef66cd036a99bca [CANDIDATE]: Term 1 pre-election: Requesting pre-vote from peer c624ece2977e486fbab958afb000f122 (127.1.92.65:45651)

   Instead, the candidate will summarize all the voters it sent requests
   to:

     I0109 12:39:42.206408 156561408 leader_election.cc:291] T 00000000000000000000000000000000 P 99b33389ebdc47b1ae18947dea49a5b7 [CANDIDATE]: Term 2 election: Requested vote from peers 465a338e21d84ad1a4e03867eb608623 (127.0.0.1:7053), 08b8de9c7a7c4ec5b72c9eb275d29cfd (127.0.0.1:7052)

3. No logging of failure detector sleeps. It seems like it might be
   useful to log when the FD sleeps, and why, e.g.

     I1221 20:16:12.399258  2803 raft_consensus.cc:2834] T cdd0e560bc5c451c978e38c0d7620e52 P c624ece2977e486fbab958afb000f122: Snoozing failure detection for 0.224s (vote granted)

   But frankly in practice it's never been relevant or useful. Since a
   message like this is produced multiple times on the candidate and
   possibly on the voters during an election, I moved it to VLOG(1)
   level to reduce the total amount of logging from elections.

Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
---
M src/kudu/consensus/leader_election.cc
M src/kudu/consensus/leader_election.h
M src/kudu/consensus/raft_consensus.cc
3 files changed, 48 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/12200/2
-- 
To view, visit http://gerrit.cloudera.org:8080/12200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
Gerrit-Change-Number: 12200
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] Reduce election-related logging

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12200 )

Change subject: Reduce election-related logging
......................................................................


Patch Set 2: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/12200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
Gerrit-Change-Number: 12200
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:52:08 +0000
Gerrit-HasComments: No

[kudu-CR] Reduce election-related logging

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12200 )

Change subject: Reduce election-related logging
......................................................................


Patch Set 1: Code-Review+1

(1 comment)

Nice log cleanup patch, thanks for doing this Will.

http://gerrit.cloudera.org:8080/#/c/12200/1/src/kudu/consensus/leader_election.cc
File src/kudu/consensus/leader_election.cc:

http://gerrit.cloudera.org:8080/#/c/12200/1/src/kudu/consensus/leader_election.cc@415
PS1, Line 415: VLOG_WITH_PREFIX(1) << Substitute("$0
VLOG_WITH_PREFIX + LogPrefix() will double up on prefixes



-- 
To view, visit http://gerrit.cloudera.org:8080/12200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
Gerrit-Change-Number: 12200
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:02:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] Reduce election-related logging

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12200 )

Change subject: Reduce election-related logging
......................................................................


Patch Set 1: Code-Review+2

(1 comment)

Just a small nit, feel free to ignore.

http://gerrit.cloudera.org:8080/#/c/12200/1/src/kudu/consensus/leader_election.cc
File src/kudu/consensus/leader_election.cc:

http://gerrit.cloudera.org:8080/#/c/12200/1/src/kudu/consensus/leader_election.cc@306
PS1, Line 306: decision == VOTE_GRANTED
nit: pull this out into a local variable and use it below as well?



-- 
To view, visit http://gerrit.cloudera.org:8080/12200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0ab1a37c2b2e9a3f37baf74772b694b907c6adf
Gerrit-Change-Number: 12200
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:02:00 +0000
Gerrit-HasComments: Yes