Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2018/08/28 03:16:09 UTC

[kudu-CR] build-support: option to retry all failed tests

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11342 )


Change subject: build-support: option to retry all failed tests
......................................................................

build-support: option to retry all failed tests

Currently, users can opt to retry flaky tests as reported by the
user-specified test server. The test server's flaky test list may not
accurately reflect what tests are flaky in all environments. In
environments where there are flaky tests that are under-represented by
the test server, it would still be nice to be resilience to flakies. As
such, this patch adds an option to retry all failed tests.

Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
---
M build-support/run-test.sh
1 file changed, 20 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11342/1
-- 
To view, visit http://gerrit.cloudera.org:8080/11342
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] build-support: option to retry all failed tests

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11342/7/build-support/run-test.sh
File build-support/run-test.sh:

http://gerrit.cloudera.org:8080/#/c/11342/7/build-support/run-test.sh@28
PS7, Line 28: KUDU_RETRY_ALL_FAILED_TESTS is non-zero
This is a boolean, right? So why non-zero? Why not just non-empty? Then you could use -n in the checks rather than -gt 0, which is a little weird for a variable whose actual value we don't care about.
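
To illustrate the distinction raised above, here is a minimal sketch contrasting a numeric check with a non-empty check. It is not the code from run-test.sh: the helper names is_enabled_numeric and is_enabled_nonempty are hypothetical, and only the variable name KUDU_RETRY_ALL_FAILED_TESTS comes from the patch.

```shell
# Hypothetical helpers (not from run-test.sh) contrasting two ways of
# testing a boolean-like environment variable.

# Numeric check: the value must parse as an integer greater than zero.
# A default is supplied so an empty value does not make `[` complain.
is_enabled_numeric() {
  [ "${1:-0}" -gt 0 ] 2>/dev/null
}

# Non-empty check: any value at all counts as "on"; the value is ignored.
is_enabled_nonempty() {
  [ -n "$1" ]
}

KUDU_RETRY_ALL_FAILED_TESTS=yes
if is_enabled_nonempty "$KUDU_RETRY_ALL_FAILED_TESTS"; then
  echo "non-empty check: enabled"
fi
if ! is_enabled_numeric "$KUDU_RETRY_ALL_FAILED_TESTS"; then
  echo "numeric check: not enabled"
fi
```

One consequence of the non-empty form is that a value of 0 also counts as enabled, so callers would have to leave the variable unset or empty to turn the option off.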


http://gerrit.cloudera.org:8080/#/c/11342/7/build-support/run-test.sh@89
PS7, Line 89: if [ "$TEST_IS_RETRYABLE" -gt 0 ]; then
This doesn't need to be a number value either; could just make it set or unset, and can drop L85. The only change you'd need to make to existing logic is to make sure that L82 doesn't set TEST_IS_RETRYABLE if the grep returns nothing.
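
A rough sketch of the set-or-unset approach suggested above, assuming the flaky list is available as newline-separated test names. The function name mark_retryable_if_flaky and the list format are assumptions made for illustration; only TEST_IS_RETRYABLE and the use of grep come from the review.

```shell
# Hypothetical sketch: leave TEST_IS_RETRYABLE unset unless grep finds the
# test name in the flaky list (assumed here to be one name per line).
mark_retryable_if_flaky() {
  # $1: test name, $2: newline-separated flaky test list
  unset TEST_IS_RETRYABLE
  # -q: no output, exit status only; -x: whole-line match; -F: fixed string.
  if printf '%s\n' "$2" | grep -qxF -- "$1"; then
    TEST_IS_RETRYABLE=1
  fi
}

FLAKY_LIST="foo-test
bar-test"

mark_retryable_if_flaky "foo-test" "$FLAKY_LIST"
# The later check only asks whether the variable is set, not what it holds.
if [ -n "${TEST_IS_RETRYABLE:-}" ]; then
  echo "Will retry on failure"
fi
```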




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 19:00:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] build-support: option to retry all failed tests

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 6: Code-Review+2



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 15:35:33 +0000
Gerrit-HasComments: No

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................

build-support: option to retry all failed tests

Currently, users can opt to retry flaky tests as reported by the
user-specified test server. The test server's flaky test list may not
accurately reflect what tests are flaky in all environments. In
environments where there are flaky tests that are under-represented by
the test server, it would still be nice to be resilient to flakies. As
such, this patch adds an option to retry all failed tests.

Here's a run of a non-flaky test into which I added a FATAL log.
http://dist-test.cloudera.org/job?job_id=awong.1535433877.28172

Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Reviewed-on: http://gerrit.cloudera.org:8080/11342
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke <gr...@apache.org>
---
M build-support/dist_test.py
M build-support/run-test.sh
2 files changed, 28 insertions(+), 16 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Grant Henke: Looks good to me, approved


Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11342/2/build-support/run-test.sh
File build-support/run-test.sh:

http://gerrit.cloudera.org:8080/#/c/11342/2/build-support/run-test.sh@78
PS2, Line 78:   echo "Will retry on failure"
> Nit: I suppose "all" doesn't make sense here since this is a single test.
Done




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 05:33:16 +0000
Gerrit-HasComments: Yes

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11342

to look at the new patch set (#3).

Change subject: build-support: option to retry all failed tests
......................................................................

build-support: option to retry all failed tests

Currently, users can opt to retry flaky tests as reported by the
user-specified test server. The test server's flaky test list may not
accurately reflect what tests are flaky in all environments. In
environments where there are flaky tests that are under-represented by
the test server, it would still be nice to be resilience to flakies. As
such, this patch adds an option to retry all failed tests.

Here's a run of a non-flaky test into which I added a FATAL log.
http://dist-test.cloudera.org/job?job_id=awong.1535433877.28172

Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
---
M build-support/dist_test.py
M build-support/run-test.sh
2 files changed, 29 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11342/3

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 7:

(2 comments)

Merged before getting to your comments, but just posted a follow-up https://gerrit.cloudera.org/c/11348/

http://gerrit.cloudera.org:8080/#/c/11342/7/build-support/run-test.sh
File build-support/run-test.sh:

http://gerrit.cloudera.org:8080/#/c/11342/7/build-support/run-test.sh@28
PS7, Line 28: KUDU_RETRY_ALL_FAILED_TESTS is non-zero
> This is a boolean, right? So why non-zero? Why not just non-empty? Then you
Fair point, I was copying KUDU_REPORT_TEST_RESULTS, but I can update this.


http://gerrit.cloudera.org:8080/#/c/11342/7/build-support/run-test.sh@89
PS7, Line 89: if [ "$TEST_IS_RETRYABLE" -gt 0 ]; then
> This doesn't need to be a number value either; could just make it set or un
Fair, seems this was originally done this way to leverage `grep`.




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 29 Aug 2018 01:18:37 +0000
Gerrit-HasComments: Yes

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11342

to look at the new patch set (#4).

Change subject: build-support: option to retry all failed tests
......................................................................

build-support: option to retry all failed tests

Currently, users can opt to retry flaky tests as reported by the
user-specified test server. The test server's flaky test list may not
accurately reflect what tests are flaky in all environments. In
environments where there are flaky tests that are under-represented by
the test server, it would still be nice to be resilience to flakies. As
such, this patch adds an option to retry all failed tests.

Here's a run of a non-flaky test into which I added a FATAL log.
http://dist-test.cloudera.org/job?job_id=awong.1535433877.28172

Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
---
M build-support/dist_test.py
M build-support/run-test.sh
M src/kudu/fs/data_dirs-test.cc
3 files changed, 30 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11342/4

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] build-support: option to retry all failed tests

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11342/4/build-support/dist_test.py
File build-support/dist_test.py:

http://gerrit.cloudera.org:8080/#/c/11342/4/build-support/dist_test.py@347
PS4, Line 347:              '-e', 'KUDU_RETRY_ALL_FAILED_TESTS=%d' % RETRY_ALL_TESTS,
I don't think this is needed. dist-test will handle the retries when creating the task. This could result in each run-test.sh execution retrying and dist-test retrying too. Though I don't think the number of retries is passed here.
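
The concern about overlapping retries can be illustrated with a toy model in which an outer retry layer wraps an inner one. This is only a sketch: with_attempts and always_fail are hypothetical names, and neither dist-test nor run-test.sh is claimed to implement retries this way.

```shell
# Toy model of two stacked retry layers. All names here are hypothetical.
COUNT=0

# Stand-in for a test that fails every time; counts how often it ran.
always_fail() {
  COUNT=$((COUNT + 1))
  return 1
}

# Run the given command up to $1 times, stopping at the first success.
with_attempts() {
  local attempts=$1
  local i=0
  shift
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
  done
  return 1
}

# Outer layer (think: task-level retries) wrapping an inner layer
# (think: script-level retries). The attempts multiply.
with_attempts 3 with_attempts 3 always_fail
echo "executions: $COUNT"
```

With three attempts at each layer, a consistently failing test runs nine times, which is why only one layer should own the retry policy.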


http://gerrit.cloudera.org:8080/#/c/11342/4/build-support/dist_test.py@405
PS4, Line 405:                "max_retries": max_retries
This is where the dist-test retries are handled.


http://gerrit.cloudera.org:8080/#/c/11342/4/src/kudu/fs/data_dirs-test.cc
File src/kudu/fs/data_dirs-test.cc:

http://gerrit.cloudera.org:8080/#/c/11342/4/src/kudu/fs/data_dirs-test.cc@442
PS4, Line 442:   LOG(FATAL) << "lol";
We should remove this in the patch.




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 13:52:52 +0000
Gerrit-HasComments: Yes

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11342

to look at the new patch set (#2).

Change subject: build-support: option to retry all failed tests
......................................................................

build-support: option to retry all failed tests

Currently, users can opt to retry flaky tests as reported by the
user-specified test server. The test server's flaky test list may not
accurately reflect what tests are flaky in all environments. In
environments where there are flaky tests that are under-represented by
the test server, it would still be nice to be resilience to flakies. As
such, this patch adds an option to retry all failed tests.

Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
---
M build-support/run-test.sh
1 file changed, 20 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11342/2

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] build-support: option to retry all failed tests

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11342/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11342/2//COMMIT_MSG@13
PS2, Line 13: resilience
nit: resilient ?




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 05:53:18 +0000
Gerrit-HasComments: Yes

[kudu-CR] build-support: option to retry all failed tests

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 5: Code-Review+2



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 15:04:13 +0000
Gerrit-HasComments: No

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11342/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11342/2//COMMIT_MSG@13
PS2, Line 13: resilience
> nit: resilient ?
Done




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 15:15:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] build-support: option to retry all failed tests

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 2:

(1 comment)

See this logic here for dist-test. We should be sure the retry logic doesn't overlap: https://github.com/apache/kudu/blob/master/build-support/dist_test.py#L391-L393

Also see here in build-and-test.sh: https://github.com/apache/kudu/blob/master/build-support/jenkins/build-and-test.sh#L282

Do we need to make sure KUDU_FLAKY_TEST_ATTEMPTS isn't set to 1 if the flaky list isn't found?

http://gerrit.cloudera.org:8080/#/c/11342/2/build-support/run-test.sh
File build-support/run-test.sh:

http://gerrit.cloudera.org:8080/#/c/11342/2/build-support/run-test.sh@78
PS2, Line 78:   echo "Will retry all failed tests"
Nit: I suppose "all" doesn't make sense here since this is a single test.




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 04:14:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11342

to look at the new patch set (#6).

Change subject: build-support: option to retry all failed tests
......................................................................

build-support: option to retry all failed tests

Currently, users can opt to retry flaky tests as reported by the
user-specified test server. The test server's flaky test list may not
accurately reflect what tests are flaky in all environments. In
environments where there are flaky tests that are under-represented by
the test server, it would still be nice to be resilient to flakies. As
such, this patch adds an option to retry all failed tests.

Here's a run of a non-flaky test into which I added a FATAL log.
http://dist-test.cloudera.org/job?job_id=awong.1535433877.28172

Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
---
M build-support/dist_test.py
M build-support/run-test.sh
2 files changed, 28 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11342/6
-- 
To view, visit http://gerrit.cloudera.org:8080/11342
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11342

to look at the new patch set (#5).

Change subject: build-support: option to retry all failed tests
......................................................................

build-support: option to retry all failed tests

Currently, users can opt to retry flaky tests as reported by the
user-specified test server. The test server's flaky test list may not
accurately reflect what tests are flaky in all environments. In
environments where there are flaky tests that are under-represented by
the test server, it would still be nice to be resilience to flakies. As
such, this patch adds an option to retry all failed tests.

Here's a run of a non-flaky test into which I added a FATAL log.
http://dist-test.cloudera.org/job?job_id=awong.1535433877.28172

Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
---
M build-support/dist_test.py
M build-support/run-test.sh
2 files changed, 28 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/11342/5

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11342/4/build-support/dist_test.py
File build-support/dist_test.py:

http://gerrit.cloudera.org:8080/#/c/11342/4/build-support/dist_test.py@347
PS4, Line 347:              '-e', 'KUDU_COMPRESS_TEST_OUTPUT=%s' % \
> I don't think this is needed. dist-test will handle the retries when creati
Ah right, and it wasn't doing anything in rev4 because KUDU_FLAKY_TEST_ATTEMPTS isn't passed in.


http://gerrit.cloudera.org:8080/#/c/11342/4/build-support/dist_test.py@405
PS4, Line 405:                }] * replicate_tasks
> This is where the dist-test retries are handled.
Ack


http://gerrit.cloudera.org:8080/#/c/11342/4/src/kudu/fs/data_dirs-test.cc
File src/kudu/fs/data_dirs-test.cc:

http://gerrit.cloudera.org:8080/#/c/11342/4/src/kudu/fs/data_dirs-test.cc@442
PS4, Line 442: }
> We should remove this in the patch.
(coneofshame)




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 14:58:36 +0000
Gerrit-HasComments: Yes

[kudu-CR] build-support: option to retry all failed tests

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11342 )

Change subject: build-support: option to retry all failed tests
......................................................................


Patch Set 2:

> Patch Set 2:
> 
> (1 comment)
> 
> See this logic here for dist-test. We should be sure the retry logic doesn't overlap: https://github.com/apache/kudu/blob/master/build-support/dist_test.py#L391-L393

Good catch; we chatted offline and it seems we just need to pass around the right environment variables.

> Also see here in build-and-test.sh: https://github.com/apache/kudu/blob/master/build-support/jenkins/build-and-test.sh#L282

Yeah, that's setting the number of attempts and setting the list. That number and list are both used in run-test.sh via ctest.

> Do we need to make sure KUDU_FLAKY_TEST_ATTEMPTS isn't set to 1 if the flaky list isn't found?

I don't think that's necessary since the value of KUDU_FLAKY_TEST_ATTEMPTS is independent of the flaky list. If there's no flaky list but we're set to retry via KUDU_FLAKY_TEST_ATTEMPTS, we currently don't retry. That's still the case if KUDU_RETRY_ALL_FAILED_TESTS is 0.
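
The behavior described in this reply can be sketched as a small decision helper. This is an illustration under assumptions, not the logic in run-test.sh: the function attempts_for_test and its positional arguments are hypothetical, and it assumes that the retry-all option reuses KUDU_FLAKY_TEST_ATTEMPTS as the attempt count.

```shell
# Hypothetical helper: how many times would a test be attempted?
#   $1: value of KUDU_FLAKY_TEST_ATTEMPTS
#   $2: 1 if the test appears in the flaky list, 0 otherwise
#   $3: value of KUDU_RETRY_ALL_FAILED_TESTS (0 means off)
attempts_for_test() {
  max_attempts=${1:-1}
  in_flaky_list=${2:-0}
  retry_all=${3:-0}
  if [ "$retry_all" -gt 0 ] || [ "$in_flaky_list" -gt 0 ]; then
    # Retryable: either listed as flaky or retry-all is enabled.
    echo "$max_attempts"
  else
    # No flaky-list entry and retry-all off: a single attempt,
    # regardless of KUDU_FLAKY_TEST_ATTEMPTS.
    echo 1
  fi
}

attempts_for_test 3 0 0   # no flaky list entry, retry-all off
attempts_for_test 3 0 1   # no flaky list entry, retry-all on
```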



Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I24aea0b9e7a1c2c66bc5feffcb454ff01cdca6fd
Gerrit-Change-Number: 11342
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 28 Aug 2018 05:32:44 +0000
Gerrit-HasComments: No