Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2017/06/19 21:10:36 UTC

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7227

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................

KUDU-2042 fix flakiness in raft_consensus-itest

Fixed flakiness in the raft_consensus-itest test exposed by recent
change 86116e4c515a9c89e728dd699decaf20d097edac.

The source of the flakiness was the race between possible leader
election and calls of GetLeaderReplicaWithRetries() and
GetOnlyLiveFollowerReplicas() methods.

Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
---
M src/kudu/integration-tests/raft_consensus-itest.cc
1 file changed, 24 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/7227/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7227

to look at the new patch set (#4).

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................

KUDU-2042 fix flakiness in raft_consensus-itest

Fixed flakiness in the raft_consensus-itest test exposed by recent
change 86116e4c515a9c89e728dd699decaf20d097edac.

The source of the flakiness was the race between possible leader
election and calls of GetLeaderReplicaWithRetries() and
GetOnlyLiveFollowerReplicas() methods.

DEBUG build, run with --cpu-stress-threads=8 before the fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1498102286.18617
    13 out of 2048 failed (~0.6% failure rate)

DEBUG build, run with --cpu-stress-threads=8, after the fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1498103715.1697
    0  out of 2048 failed (~0.0% failure rate)

Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_itest-base.h
2 files changed, 75 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/7227/4

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7227

to look at the new patch set (#3).

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................

KUDU-2042 fix flakiness in raft_consensus-itest

Fixed flakiness in the raft_consensus-itest test exposed by recent
change 86116e4c515a9c89e728dd699decaf20d097edac.

The source of the flakiness was the race between possible leader
election and calls of GetLeaderReplicaWithRetries() and
GetOnlyLiveFollowerReplicas() methods.

DEBUG build, run with --cpu-stress-threads=8 before the fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1498102286.18617
    13 out of 2048 failed (~0.6% failure rate)

DEBUG build, run with --cpu-stress-threads=8, after the fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1498103715.1697
    0  out of 2048 failed (~0.0% failure rate)

Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_itest-base.h
2 files changed, 86 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/7227/3

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7227/4/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

Line 1083:   for (auto latch : latches) {
nit: use auto* here so it's more obviously a pointer
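
To illustrate the nit, here is a minimal sketch. The `Latch` struct and the `CountDownAll` helper are hypothetical stand-ins, not the code under review; the element type of `latches` in raft_consensus-itest.cc is not shown in this message and is assumed here to be a raw pointer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a countdown latch; not Kudu's class.
struct Latch {
  int count;
  void CountDown() {
    if (count > 0) --count;
  }
};

// With `auto*` the declaration itself shows that each element is a pointer,
// and the loop stops compiling if the container ever holds non-pointers.
inline void CountDownAll(const std::vector<Latch*>& latches) {
  for (auto* latch : latches) {
    assert(latch != nullptr);
    latch->CountDown();
  }
}
```

Plain `auto` would deduce the same pointer type here; the difference is only in how clearly the loop header reads.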



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................


KUDU-2042 fix flakiness in raft_consensus-itest

Fixed flakiness in the raft_consensus-itest test exposed by recent
change 86116e4c515a9c89e728dd699decaf20d097edac.

The source of the flakiness was the race between possible leader
election and calls of GetLeaderReplicaWithRetries() and
GetOnlyLiveFollowerReplicas() methods.

DEBUG build, run with --cpu-stress-threads=8 before the fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1498102286.18617
    13 out of 2048 failed (0.6% failure rate)

DEBUG build, run with --cpu-stress-threads=8, after the fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1498103715.1697
    0  out of 2048 failed (0.0% failure rate)

Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Reviewed-on: http://gerrit.cloudera.org:8080/7227
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_itest-base.h
2 files changed, 75 insertions(+), 30 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified




Gerrit-MessageType: merged
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7227/3/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

Line 294:       RETURN_NOT_OK(ets->Pause());
> I think these should either stay as CHECK or have unique PREPENDs so it's e
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7227

to look at the new patch set (#5).

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................

KUDU-2042 fix flakiness in raft_consensus-itest

Fixed flakiness in the raft_consensus-itest test exposed by recent
change 86116e4c515a9c89e728dd699decaf20d097edac.

The source of the flakiness was the race between possible leader
election and calls of GetLeaderReplicaWithRetries() and
GetOnlyLiveFollowerReplicas() methods.

DEBUG build, run with --cpu-stress-threads=8 before the fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1498102286.18617
    13 out of 2048 failed (0.6% failure rate)

DEBUG build, run with --cpu-stress-threads=8, after the fix:
  http://dist-test.cloudera.org//job?job_id=aserbin.1498103715.1697
    0  out of 2048 failed (0.0% failure rate)

Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_itest-base.h
2 files changed, 75 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/7227/5

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................


Patch Set 6: Code-Review+2


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7227/1/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

Line 304:     } while (true);
Instead of this loop, could we introduce a function like:

GetLeaderAndLiveFollowers(&leader, &followers);

which would get an "atomic" snapshot of both pieces of state? That would probably be easier to follow.
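
A rough sketch of what such a helper might look like follows. Everything in it is an assumption made for illustration: the `ReplicaInfo` and `LeaderAndFollowers` types, the `fetch_replicas` callback, and the "exactly one leader" consistency check are hypothetical, and the signature differs from the two-argument form suggested above. It is not the code that landed in ts_itest-base.h, and a real helper would presumably also wait between attempts.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical view of one replica as reported by a single query.
struct ReplicaInfo {
  std::string uuid;
  bool is_leader;
  bool is_alive;
};

// Hypothetical result: leader and live followers taken from the same view.
struct LeaderAndFollowers {
  std::string leader;
  std::vector<std::string> followers;
};

// Derives both pieces of state from one view of the replicas, so a leader
// election cannot slip in between two separate lookups. Retries with a fresh
// view until exactly one leader is reported or the attempts run out.
// Returns false if no consistent view was obtained.
inline bool GetLeaderAndLiveFollowers(
    const std::function<std::vector<ReplicaInfo>()>& fetch_replicas,
    int max_attempts,
    LeaderAndFollowers* out) {
  assert(out != nullptr);
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    const std::vector<ReplicaInfo> view = fetch_replicas();
    LeaderAndFollowers result;
    int num_leaders = 0;
    for (const auto& r : view) {
      if (r.is_leader) {
        ++num_leaders;
        result.leader = r.uuid;
      } else if (r.is_alive) {
        result.followers.push_back(r.uuid);
      }
    }
    if (num_leaders == 1) {
      *out = std::move(result);
      return true;
    }
    // An election may be in progress; take a fresh view and try again.
  }
  return false;
}
```

Keeping the retry inside the helper means callers see either a consistent leader/follower pair or a failure, never a leader from one moment paired with followers from another.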



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7227/3/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

Line 294:       RETURN_NOT_OK(ets->Pause());
I think these should either stay as CHECK or have unique PREPENDs so it's easier to diagnose when the test fails
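
As an illustration of the "unique PREPENDs" option, here is a self-contained sketch. The `Status` class, the `SKETCH_RETURN_NOT_OK_PREPEND` macro, `FakeServer`, and `PauseServer` are all simplified, hypothetical stand-ins written for this example; they are not Kudu's types or macros, and the message text is made up.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified stand-in for a status type; not Kudu's Status class.
class Status {
 public:
  static Status OK() { return Status(""); }
  static Status Error(std::string msg) {
    assert(!msg.empty());  // an empty message would read as OK in this sketch
    return Status(std::move(msg));
  }
  bool ok() const { return msg_.empty(); }
  const std::string& message() const { return msg_; }
  // Returns a copy whose message is prefixed with call-site context.
  Status Prepend(const std::string& prefix) const {
    return ok() ? *this : Status(prefix + ": " + msg_);
  }

 private:
  explicit Status(std::string msg) : msg_(std::move(msg)) {}
  std::string msg_;
};

// Illustrative macro: return early on failure, tagging the error with a
// message unique to this call site.
#define SKETCH_RETURN_NOT_OK_PREPEND(expr, prefix) \
  do {                                             \
    const Status s_ = (expr);                      \
    if (!s_.ok()) return s_.Prepend(prefix);       \
  } while (0)

// Hypothetical server handle whose Pause() may fail.
struct FakeServer {
  std::string uuid;
  bool pause_fails;
  Status Pause() const {
    return pause_fails ? Status::Error("process not running") : Status::OK();
  }
};

inline Status PauseServer(const FakeServer& ts) {
  SKETCH_RETURN_NOT_OK_PREPEND(ts.Pause(),
                               "failed to pause server " + ts.uuid);
  return Status::OK();
}
```

With a distinct prefix at each call site, a failing test reports which step went wrong instead of only the underlying error.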



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7227/1/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

Line 304:     } while (true);
> instead of this loop, could we introduce a function like:
Ha!

I had just thought about that while preparing this patch and looking at ts_itest-base.h, but decided to go with retries, since it seemed to me that the 'atomic' snapshot approach would not catch as many re-elections as the 'retry' approach.

OK, I'll do the 'atomic' approach, moving the retry into the function itself.



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7227/4/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

Line 1083:   for (auto latch : latches) {
> nit: use auto* here so it's more obviously a pointer
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2042 fix flakiness in raft_consensus-itest

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7227

to look at the new patch set (#2).

Change subject: KUDU-2042 fix flakiness in raft_consensus-itest
......................................................................

KUDU-2042 fix flakiness in raft_consensus-itest

Fixed flakiness in the raft_consensus-itest test exposed by recent
change 86116e4c515a9c89e728dd699decaf20d097edac.

The source of the flakiness was the race between possible leader
election and calls of GetLeaderReplicaWithRetries() and
GetOnlyLiveFollowerReplicas() methods.

Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
---
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_itest-base.h
2 files changed, 86 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/7227/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5fb2b82afb15b50659d068ef83d11b7cc291ca9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>