Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2020/10/27 05:32:45 UTC
[kudu-CR] [tserver] validator for --scanner_max_wait_ms
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16655 )
Change subject: [tserver] validator for --scanner_max_wait_ms
......................................................................
[tserver] validator for --scanner_max_wait_ms
This patch adds a group validator that checks the value of the
--scanner_max_wait_ms flag against --raft_heartbeat_interval_ms. As of
now, the validator outputs a warning if --scanner_max_wait_ms is set too
low compared with --raft_heartbeat_interval_ms. In addition,
--scanner_max_wait_ms is now tagged as 'runtime' to reflect its de facto
behavior.
I also did a minor clean-up of the surrounding code.
I didn't add any test, but I verified that the warning is output upon
kudu-tserver's startup as intended when --scanner_max_wait_ms is set too
low compared with the current setting for --raft_heartbeat_interval_ms.
Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
---
M src/kudu/consensus/time_manager.cc
M src/kudu/tserver/tablet_service.cc
2 files changed, 64 insertions(+), 27 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/55/16655/1
--
To view, visit http://gerrit.cloudera.org:8080/16655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
Gerrit-Change-Number: 16655
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
[kudu-CR] [tserver] validator for --scanner_max_wait_ms
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Andrew Wong,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/16655
to look at the new patch set (#2).
Change subject: [tserver] validator for --scanner_max_wait_ms
......................................................................
[tserver] validator for --scanner_max_wait_ms
This patch adds a group validator that checks the value of the
--scanner_max_wait_ms flag against --raft_heartbeat_interval_ms. As of
now, the validator outputs a warning if --scanner_max_wait_ms is set too
low compared with --raft_heartbeat_interval_ms. In addition,
--scanner_max_wait_ms is now tagged as 'runtime' to reflect its de facto
behavior.
I also did a minor clean-up of the related code.
I didn't add any test, but I verified that the warning is output upon
kudu-tserver's startup as intended when --scanner_max_wait_ms is set too
low compared with the current setting for --raft_heartbeat_interval_ms.
Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
---
M src/kudu/consensus/time_manager.cc
M src/kudu/tserver/tablet_service.cc
2 files changed, 64 insertions(+), 27 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/55/16655/2
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
Gerrit-Change-Number: 16655
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [tserver] validator for --scanner_max_wait_ms
Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16655 )
Change subject: [tserver] validator for --scanner_max_wait_ms
......................................................................
Patch Set 2:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/16655/2/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:
http://gerrit.cloudera.org:8080/#/c/16655/2/src/kudu/tserver/tablet_service.cc@210
PS2, Line 210: return true;
> Some background behind this: I was looking at one issue from the fields and
I see. Thanks for clarifying!
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
Gerrit-Change-Number: 16655
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 28 Oct 2020 05:29:04 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tserver] validator for --scanner_max_wait_ms
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16655 )
Change subject: [tserver] validator for --scanner_max_wait_ms
......................................................................
Patch Set 2:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/16655/2/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:
http://gerrit.cloudera.org:8080/#/c/16655/2/src/kudu/tserver/tablet_service.cc@204
PS2, Line 204: at least up to $2
> nit: "to at least $2", otherwise this may read as though the user should in
Done
http://gerrit.cloudera.org:8080/#/c/16655/2/src/kudu/tserver/tablet_service.cc@210
PS2, Line 210: return true;
> Is this to say that it's not actually that critical an issue? Will snapshot
Some background behind this: I was looking at an issue from the field, and I initially thought that the problem was related to a difference in local time between tablet servers. I had a theory about the distribution of tablet replicas in which the client was writing and then reading back from a follower replica with a lagging local clock.
However, after a deeper investigation I realized that was not likely the case. I found that the issue is most likely related to the fact that the follower replica which was the source of the timed-out snapshot scan had accumulated many operations and was slow to apply them.
Given these findings, I guess trying to enforce this relationship between --scanner_max_wait_ms and --raft_heartbeat_interval_ms doesn't make much sense unless we think it's common to have a difference of a few seconds in local clocks among different tablet servers. I guess the latter is very unlikely, and it should rather be considered an anomaly.
With that, I don't think we want to introduce this validator after all. So, I moved the rest of the changes into a separate changelist and posted it for review: https://gerrit.cloudera.org/#/c/16669/
Meanwhile, I'm abandoning this changelist.
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
Gerrit-Change-Number: 16655
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 28 Oct 2020 03:52:13 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tserver] validator for --scanner_max_wait_ms
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed a vote on this change.
Change subject: [tserver] validator for --scanner_max_wait_ms
......................................................................
Removed Verified-1 by Kudu Jenkins (120)
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
Gerrit-Change-Number: 16655
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [tserver] validator for --scanner_max_wait_ms
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has abandoned this change. ( http://gerrit.cloudera.org:8080/16655 )
Change subject: [tserver] validator for --scanner_max_wait_ms
......................................................................
Abandoned
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
Gerrit-Change-Number: 16655
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [tserver] validator for --scanner_max_wait_ms
Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16655 )
Change subject: [tserver] validator for --scanner_max_wait_ms
......................................................................
Patch Set 2: Code-Review+1
(2 comments)
http://gerrit.cloudera.org:8080/#/c/16655/2/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:
http://gerrit.cloudera.org:8080/#/c/16655/2/src/kudu/tserver/tablet_service.cc@204
PS2, Line 204: at least up to $2
nit: "to at least $2", otherwise this may read as though the user should increase by an additional heartbeat interval.
http://gerrit.cloudera.org:8080/#/c/16655/2/src/kudu/tserver/tablet_service.cc@210
PS2, Line 210: return true;
Is this to say that it's not actually that critical an issue? Will snapshot scans with the latest timestamp _mostly_ pass even if improperly set? My concern is that a warning is indeed not severe enough to spur action from the operator, though I do understand why a soft validation is desirable, assuming the service runs OK in a lot of cases even if improperly set.
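For readers following along without the patch open, the difference between the soft validation discussed here and a hard one can be sketched roughly as follows. This is only an illustration, not the code from the change: the FlagValues struct, both function names, the warning text, and the threshold (one heartbeat interval) are assumptions made for the example, and the flags are modeled as plain integers rather than gflags.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the two flags; in Kudu these are gflags,
// not members of a struct.
struct FlagValues {
  int32_t scanner_max_wait_ms;
  int32_t raft_heartbeat_interval_ms;
};

// Soft validation: the configuration is always accepted, and the caller
// only gets a warning message to log. The threshold used here (one
// heartbeat interval) is an assumption for this sketch.
bool ValidateScannerMaxWait(const FlagValues& flags, std::string* warning) {
  warning->clear();
  if (flags.scanner_max_wait_ms < flags.raft_heartbeat_interval_ms) {
    *warning = "--scanner_max_wait_ms is set to " +
               std::to_string(flags.scanner_max_wait_ms) +
               "; consider increasing it to at least " +
               std::to_string(flags.raft_heartbeat_interval_ms) +
               " (the value of --raft_heartbeat_interval_ms)";
  }
  return true;  // never reject: the server still starts
}

// Hard validation, for contrast: the same check, but a violation makes
// the validator fail, which would prevent the server from starting.
bool ValidateScannerMaxWaitStrict(const FlagValues& flags, std::string* error) {
  ValidateScannerMaxWait(flags, error);
  return error->empty();
}
```

With values such as scanner_max_wait_ms=100 and raft_heartbeat_interval_ms=500, the soft variant returns true and fills in the warning, while the strict variant returns false. That is the trade-off raised above: the soft form keeps a mostly working server running, at the cost of relying on the operator to notice a log line.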
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
Gerrit-Change-Number: 16655
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 27 Oct 2020 06:29:27 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tserver] validator for --scanner_max_wait_ms
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16655 )
Change subject: [tserver] validator for --scanner_max_wait_ms
......................................................................
Patch Set 2: Verified+1
unrelated test failure in ToolTest.TestHmsList
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dec4173a9ae50a4de34b909283c5a2ee4ef9166
Gerrit-Change-Number: 16655
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 27 Oct 2020 06:20:47 +0000
Gerrit-HasComments: No