You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2017/06/16 16:25:07 UTC

[kudu-CR] [registration-test] fix flake in TestTabletReports

Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7209

Change subject: [registration-test] fix flake in TestTabletReports
......................................................................

[registration-test] fix flake in TestTabletReports

Prior to the fix, the failure rate attributed to this flake was about
0.3% in 1K run:
  http://dist-test.cloudera.org//job?job_id=aserbin.1497582474.17739

After the fix, no failures in 1K run:
  http://dist-test.cloudera.org//job?job_id=aserbin.1497629325.23427

Both runs were of DEBUG build run with -stress_cpu_threads=8 flag.

The crux of the fix is to use AssertEvetually instead of relying on a
hard-coded delay to capture non-zero system catalog's metric reading
updated upon processing tablet reports from TS.  In addition to that,
this patch includes a clean-up of the test code.

Change-Id: I906465ad220236538175c80972ae055193f9bb45
---
M src/kudu/integration-tests/registration-test.cc
1 file changed, 57 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/7209/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7209
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I906465ad220236538175c80972ae055193f9bb45
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [registration-test] fix flake in TestTabletReports

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged.

Change subject: [registration-test] fix flake in TestTabletReports
......................................................................


[registration-test] fix flake in TestTabletReports

Prior to the fix, the failure rate attributed to this flake was about
0.3% in 1K run:
  http://dist-test.cloudera.org//job?job_id=aserbin.1497582474.17739

After the fix, no failures in 1K run:
  http://dist-test.cloudera.org//job?job_id=aserbin.1497629325.23427

Both runs were of DEBUG build run with -stress_cpu_threads=8 flag.

The crux of the fix is to use AssertEvetually instead of relying on a
hard-coded delay to capture system catalog's metric.  That's because
the metrics of interest are updated upon processing tablet reports from
tservers which contain tablet consensus status information.  The latter
are sent with a tserver-->master heartbeat when the tserver determines
that it has become the leader for one of its tablets.

In addition to that, this patch includes a clean-up of the test code.

Change-Id: I906465ad220236538175c80972ae055193f9bb45
Reviewed-on: http://gerrit.cloudera.org:8080/7209
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/integration-tests/registration-test.cc
1 file changed, 58 insertions(+), 50 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7209
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I906465ad220236538175c80972ae055193f9bb45
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [registration-test] fix flake in TestTabletReports

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: [registration-test] fix flake in TestTabletReports
......................................................................


Patch Set 1:

(1 comment)

one nit, but no need to fix it, just for future reference.

http://gerrit.cloudera.org:8080/#/c/7209/1/src/kudu/integration-tests/registration-test.cc
File src/kudu/integration-tests/registration-test.cc:

PS1, Line 201:           locations->Swap(&loc);
nit: you can use *locations = std::move(loc) now that we have move-ctors enabled for PB


-- 
To view, visit http://gerrit.cloudera.org:8080/7209
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I906465ad220236538175c80972ae055193f9bb45
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [registration-test] fix flake in TestTabletReports

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7209

to look at the new patch set (#3).

Change subject: [registration-test] fix flake in TestTabletReports
......................................................................

[registration-test] fix flake in TestTabletReports

Prior to the fix, the failure rate attributed to this flake was about
0.3% in 1K run:
  http://dist-test.cloudera.org//job?job_id=aserbin.1497582474.17739

After the fix, no failures in 1K run:
  http://dist-test.cloudera.org//job?job_id=aserbin.1497629325.23427

Both runs were of DEBUG build run with -stress_cpu_threads=8 flag.

The crux of the fix is to use AssertEvetually instead of relying on a
hard-coded delay to capture system catalog's metric.  That's because
the metrics of interest are updated upon processing tablet reports from
tservers which contain tablet consensus status information.  The latter
are sent with a tserver-->master heartbeat when the tserver determines
that it has become the leader for one of its tablets.

In addition to that, this patch includes a clean-up of the test code.

Change-Id: I906465ad220236538175c80972ae055193f9bb45
---
M src/kudu/integration-tests/registration-test.cc
1 file changed, 58 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/7209/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7209
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I906465ad220236538175c80972ae055193f9bb45
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [registration-test] fix flake in TestTabletReports

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: [registration-test] fix flake in TestTabletReports
......................................................................


Patch Set 3: Code-Review+2

Carrying over Todd's +2 from the PS1.

-- 
To view, visit http://gerrit.cloudera.org:8080/7209
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I906465ad220236538175c80972ae055193f9bb45
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [registration-test] fix flake in TestTabletReports

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: [registration-test] fix flake in TestTabletReports
......................................................................


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7209
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I906465ad220236538175c80972ae055193f9bb45
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [registration-test] fix flake in TestTabletReports

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: [registration-test] fix flake in TestTabletReports
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7209/1/src/kudu/integration-tests/registration-test.cc
File src/kudu/integration-tests/registration-test.cc:

PS1, Line 201:           locations->Swap(&loc);
> nit: you can use *locations = std::move(loc) now that we have move-ctors en
That's much better.  Let me do this and also update the over-complicated commit message :)


-- 
To view, visit http://gerrit.cloudera.org:8080/7209
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I906465ad220236538175c80972ae055193f9bb45
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [registration-test] fix flake in TestTabletReports

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7209

to look at the new patch set (#2).

Change subject: [registration-test] fix flake in TestTabletReports
......................................................................

[registration-test] fix flake in TestTabletReports

Prior to the fix, the failure rate attributed to this flake was about
0.3% in 1K run:
  http://dist-test.cloudera.org//job?job_id=aserbin.1497582474.17739

After the fix, no failures in 1K run:
  http://dist-test.cloudera.org//job?job_id=aserbin.1497629325.23427

Both runs were of DEBUG build run with -stress_cpu_threads=8 flag.

The crux of the fix is to use AssertEvetually instead of relying on a
hard-coded delay to capture non-zero system catalog's metric.  That's
because the metrics of interest are updated upon processing tablet
reports from tservers which contain tablet consensus status information.
The latter are sent with a tserver-->master heartbeat when the tserver
determines that it has become the leader for one of its tablets.

In addition to that, this patch includes a clean-up of the test code.

Change-Id: I906465ad220236538175c80972ae055193f9bb45
---
M src/kudu/integration-tests/registration-test.cc
1 file changed, 58 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/09/7209/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7209
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I906465ad220236538175c80972ae055193f9bb45
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>