Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2022/08/11 23:47:02 UTC
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18842 )
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
[tests] fix flakiness in TestTabletCopyEncryptedServers
The TabletCopyITest.TestTabletCopyEncryptedServers scenario deletes
a tablet, and then checks to see that the tablet data state is
TABLET_DATA_COPYING. However, it's possible for the remote bootstrap
to complete so quickly that it's already TABLET_DATA_READY at the time
of sampling, so from time to time the test failed with
src/kudu/integration-tests/tablet_copy-itest.cc:1014: Failure
Failed
Bad status: Timed out: Timed out after 30.002s waiting for correct tablet state: Illegal state: State TABLET_DATA_READY unexpected, expected TABLET_DATA_COPYING
This patch updates the assertion to allow both the COPYING and READY
tablet data states.
Without the patch, the test was about 7% flaky [1]. With the patch,
it's not flaky [2].
[1] http://dist-test.cloudera.org/job?job_id=aserbin.1660260668.94650
[2] http://dist-test.cloudera.org/job?job_id=aserbin.1660261249.109365
Change-Id: I22933cc9cb727711ee5fb45c811c2a759958fdfa
---
M src/kudu/integration-tests/tablet_copy-itest.cc
1 file changed, 7 insertions(+), 4 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/18842/1
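For readers without the diff at hand, the idea described in the commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from tablet_copy-itest.cc: DataState, WaitForAnyState, and the sample callback are hypothetical names made up for this sketch, and the 10 ms polling interval is an arbitrary assumption.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the tablet data states named in the commit message.
enum class DataState { COPYING, READY, TOMBSTONED };

// Polls `sample` until it returns one of the `accepted` states or `timeout`
// elapses. Returns true on success; `last_seen` always holds the last sample.
bool WaitForAnyState(const std::function<DataState()>& sample,
                     const std::vector<DataState>& accepted,
                     std::chrono::milliseconds timeout,
                     DataState* last_seen) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    *last_seen = sample();
    if (std::find(accepted.begin(), accepted.end(), *last_seen) !=
        accepted.end()) {
      return true;
    }
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;
    }
    // Arbitrary polling interval chosen for the sketch.
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
}
```

With a sampler that already reports READY, a wait that accepts only COPYING times out, while a wait that accepts both COPYING and READY returns immediately. That mirrors the failure quoted above and the relaxed assertion the patch describes.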
--
To view, visit http://gerrit.cloudera.org:8080/18842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I22933cc9cb727711ee5fb45c811c2a759958fdfa
Gerrit-Change-Number: 18842
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <al...@apache.org>
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18842 )
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
Patch Set 1: Verified+1
unrelated dist-test failure (DEBUG):
Could not submit C++ distributed test job
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <ac...@gmail.com>
Gerrit-Comment-Date: Sat, 13 Aug 2022 01:56:28 +0000
Gerrit-HasComments: No
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18842 )
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
Patch Set 1:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/18842/1//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/18842/1//COMMIT_MSG@19
PS1, Line 19: This patch updates the assertion to allow both the COPYING and READY
> couldn't we inject latency instead?
I didn't try that option yet: just found that similar flakiness was fixed this way some time ago, so I thought I'd simply use the same approach: https://github.com/apache/kudu/commit/54839984932bca0c0ba49cdd8fa199a5711e589e
Let me know if you think injecting latency is the preferred way.
Gerrit-Comment-Date: Sat, 20 Aug 2022 02:58:46 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed a vote on this change.
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
Removed Verified-1 by Kudu Jenkins (120)
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18842 )
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
Change-Id: I22933cc9cb727711ee5fb45c811c2a759958fdfa
Reviewed-on: http://gerrit.cloudera.org:8080/18842
Tested-by: Alexey Serbin <al...@apache.org>
Reviewed-by: Yingchun Lai <ac...@gmail.com>
Reviewed-by: Abhishek Chennaka <ac...@cloudera.com>
Reviewed-by: Attila Bukor <ab...@apache.org>
---
M src/kudu/integration-tests/tablet_copy-itest.cc
1 file changed, 7 insertions(+), 4 deletions(-)
Approvals:
Alexey Serbin: Verified
Yingchun Lai: Looks good to me, but someone else must approve
Abhishek Chennaka: Looks good to me, but someone else must approve
Attila Bukor: Looks good to me, approved
Gerrit-PatchSet: 2
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Posted by "Yingchun Lai (Code Review)" <ge...@cloudera.org>.
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/18842 )
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
Patch Set 1: Code-Review+1
Gerrit-Comment-Date: Sat, 13 Aug 2022 03:32:12 +0000
Gerrit-HasComments: No
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Posted by "Abhishek Chennaka (Code Review)" <ge...@cloudera.org>.
Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/18842 )
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
Patch Set 1: Code-Review+1
Gerrit-Comment-Date: Thu, 18 Aug 2022 03:03:56 +0000
Gerrit-HasComments: No
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/18842 )
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
Patch Set 1: Code-Review+2
(1 comment)
http://gerrit.cloudera.org:8080/#/c/18842/1//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/18842/1//COMMIT_MSG@19
PS1, Line 19: This patch updates the assertion to allow both the COPYING and READY
> I didn't try that option yet: just found that similar flakiness was fixed t
I think we can go this way, it should be okay if we miss the copying, we still test what we mean to test here.
Gerrit-Comment-Date: Tue, 23 Aug 2022 09:43:53 +0000
Gerrit-HasComments: Yes
[kudu-CR] [tests] fix flakiness in TestTabletCopyEncryptedServers
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18842 )
Change subject: [tests] fix flakiness in TestTabletCopyEncryptedServers
......................................................................
Patch Set 1:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/18842/1//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/18842/1//COMMIT_MSG@19
PS1, Line 19: This patch updates the assertion to allow both the COPYING and READY
> I think we can go this way, it should be okay if we miss the copying, we st
Yep, that makes sense, thanks.
As an afterthought, I guess relying on the injected latency increases the runtime a bit and might still be prone to flakiness in case of scheduler anomalies, etc.
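The trade-off discussed in this thread can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kudu code: time is modeled as logical milliseconds since the copy started, and ObservedState, Observe, StrictCheckPasses, and RelaxedCheckPasses are hypothetical names introduced for the illustration.

```cpp
#include <cstdint>

// Hypothetical model of what the test's sampler can observe.
enum class ObservedState { COPYING, READY };

// State seen by a sampler that reads `sample_delay_ms` after the copy starts,
// when the copy needs `copy_ms` plus `injected_latency_ms` to finish.
ObservedState Observe(int64_t copy_ms, int64_t injected_latency_ms,
                      int64_t sample_delay_ms) {
  return sample_delay_ms < copy_ms + injected_latency_ms
             ? ObservedState::COPYING
             : ObservedState::READY;
}

// A COPYING-only assertion passes only if the sampler beats the copy.
bool StrictCheckPasses(int64_t copy_ms, int64_t injected_latency_ms,
                       int64_t sample_delay_ms) {
  return Observe(copy_ms, injected_latency_ms, sample_delay_ms) ==
         ObservedState::COPYING;
}

// An assertion that accepts both states passes regardless of timing.
bool RelaxedCheckPasses(int64_t copy_ms, int64_t injected_latency_ms,
                        int64_t sample_delay_ms) {
  const ObservedState s = Observe(copy_ms, injected_latency_ms, sample_delay_ms);
  return s == ObservedState::COPYING || s == ObservedState::READY;
}
```

In this model, injecting latency widens the window in which the strict check passes, but a sampler delayed beyond that window (for example by a scheduler stall) still sees READY. The relaxed check does not depend on the timing at all, and it adds no extra runtime.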
Gerrit-Comment-Date: Tue, 23 Aug 2022 15:01:21 +0000
Gerrit-HasComments: Yes