Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2019/12/27 06:31:13 UTC
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14953 )
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
[tests] address flakiness in raft_consensus_election-itest
A few test scenarios of the raft_consensus_election-itest suite
involving churny elections were flaky when run in slow
mode with --stress_cpu_threads=16. The common root cause
was the writer test thread failing due to a timeout.
This patch addresses the issue by increasing the Raft heartbeat interval
from 1 to 2 milliseconds. With this change, the above-mentioned
tests become more stable and no longer fail due to the timeout error.
To confirm that, I ran raft_consensus_election-itest built in DEBUG mode
in multiple 1K-run batches.
Even with this patch, the above-mentioned test scenarios sometimes fail
due to the DCHECK_GE assert in PeerMessageQueue::CheckMonotonicTerms().
That issue will be addressed separately.
Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus_election-itest.cc
3 files changed, 9 insertions(+), 9 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/14953/1
--
To view, visit http://gerrit.cloudera.org:8080/14953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Adar Dembo,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/14953
to look at the new patch set (#2).
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus_election-itest.cc
3 files changed, 18 insertions(+), 18 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/14953/2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14953 )
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Patch Set 4: Code-Review+2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 06 Jan 2020 21:11:15 +0000
Gerrit-HasComments: No
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14953 )
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Reviewed-on: http://gerrit.cloudera.org:8080/14953
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus_election-itest.cc
3 files changed, 19 insertions(+), 20 deletions(-)
Approvals:
Kudu Jenkins: Verified
Adar Dembo: Looks good to me, approved
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Adar Dembo,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/14953
to look at the new patch set (#4).
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus_election-itest.cc
3 files changed, 19 insertions(+), 20 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/14953/4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14953 )
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Patch Set 1:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/14953/1/src/kudu/integration-tests/raft_consensus_election-itest.cc
File src/kudu/integration-tests/raft_consensus_election-itest.cc:
http://gerrit.cloudera.org:8080/#/c/14953/1/src/kudu/integration-tests/raft_consensus_election-itest.cc@129
PS1, Line 129: workload->set_write_timeout_millis((AllowSlowTests() ? 120 : 60) * 1000);
Hmm, why condition this on slow tests and not something like build type? AFAICT max_rows_to_insert is the only other factor that changes in slow tests; is that directly correlated with these timeouts?
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sun, 29 Dec 2019 18:20:25 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Adar Dembo,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/14953
to look at the new patch set (#3).
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus_election-itest.cc
3 files changed, 20 insertions(+), 20 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/53/14953/3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14953 )
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Patch Set 1:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/14953/1/src/kudu/integration-tests/raft_consensus_election-itest.cc
File src/kudu/integration-tests/raft_consensus_election-itest.cc:
http://gerrit.cloudera.org:8080/#/c/14953/1/src/kudu/integration-tests/raft_consensus_election-itest.cc@129
PS1, Line 129: workload->set_write_timeout_millis((AllowSlowTests() ? 120 : 60) * 1000);
> Hmm, why condition this on slow tests and not something like build type? AF
Yes, I agree -- this looks strange. Actually, this mirrors the logic of the call sites passing 'max_rows_to_insert', which depends on AllowSlowTests(). I think I'd better change the signature of the DoTestChurnyElections() function to make it more consistent.
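For illustration only, the idea of keeping every AllowSlowTests()-dependent choice in one place could be sketched as below. This is not the code from the patch: ChurnyElectionsOptions, MakeChurnyElectionsOptions(), the g_allow_slow_tests stand-in and the row counts are all hypothetical; only the 120/60 second write timeouts come from the line quoted above.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for Kudu's AllowSlowTests(); the real function
// reads the test environment, here it is just a settable flag.
static bool g_allow_slow_tests = false;
bool AllowSlowTests() { return g_allow_slow_tests; }

// Hypothetical bundle of the knobs that depend on slow-test mode, so that
// the row limit and the write timeout are derived together at the call site.
struct ChurnyElectionsOptions {
  int64_t max_rows_to_insert;
  std::chrono::milliseconds write_timeout;
};

ChurnyElectionsOptions MakeChurnyElectionsOptions() {
  const bool slow = AllowSlowTests();
  // Row counts are illustrative assumptions; the 120 s / 60 s timeouts
  // mirror the values in the reviewed line.
  return {slow ? 10000 : 1000, std::chrono::seconds(slow ? 120 : 60)};
}
```

A test body would then call MakeChurnyElectionsOptions() once and hand the result to a helper taking ChurnyElectionsOptions, instead of the helper consulting AllowSlowTests() itself.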
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 30 Dec 2019 08:17:24 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14953 )
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Patch Set 3:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/14953/3/src/kudu/integration-tests/raft_consensus_election-itest.cc
File src/kudu/integration-tests/raft_consensus_election-itest.cc:
http://gerrit.cloudera.org:8080/#/c/14953/3/src/kudu/integration-tests/raft_consensus_election-itest.cc@131
PS3, Line 131: //workload->set_write_timeout_millis((AllowSlowTests() ? 120 : 60) * 1000);
> Remove this commented out line?
Done
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 06 Jan 2020 18:47:28 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14953 )
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Patch Set 2:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/14953/2/src/kudu/integration-tests/raft_consensus_election-itest.cc
File src/kudu/integration-tests/raft_consensus_election-itest.cc:
http://gerrit.cloudera.org:8080/#/c/14953/2/src/kudu/integration-tests/raft_consensus_election-itest.cc@217
PS2, Line 217: TEST_F(RaftConsensusElectionITest, ChurnyElections_WithNotificationLatency) {
> warning: avoid using "_" in test name "ChurnyElections_WithNotificationLate
Done
http://gerrit.cloudera.org:8080/#/c/14953/2/src/kudu/integration-tests/raft_consensus_election-itest.cc@230
PS2, Line 230: TEST_F(RaftConsensusElectionITest, ChurnyElections_WithDuplicateKeys) {
> warning: avoid using "_" in test name "ChurnyElections_WithDuplicateKeys" a
Done
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 30 Dec 2019 16:52:57 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tests] address flakiness in raft_consensus_election-itest
Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14953 )
Change subject: [tests] address flakiness in raft_consensus_election-itest
......................................................................
Patch Set 3: Code-Review+2
(1 comment)
http://gerrit.cloudera.org:8080/#/c/14953/3/src/kudu/integration-tests/raft_consensus_election-itest.cc
File src/kudu/integration-tests/raft_consensus_election-itest.cc:
http://gerrit.cloudera.org:8080/#/c/14953/3/src/kudu/integration-tests/raft_consensus_election-itest.cc@131
PS3, Line 131: //workload->set_write_timeout_millis((AllowSlowTests() ? 120 : 60) * 1000);
Remove this commented out line?
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If6f54643c9c066b31a74e1082260225e60324e4e
Gerrit-Change-Number: 14953
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 30 Dec 2019 18:41:59 +0000
Gerrit-HasComments: Yes