Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2017/11/28 06:49:47 UTC
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
KUDU-1097: 'gone-and-back tablet server' test scenario
Added a new test scenario for the new 3-4-3 re-replication
scheme. The scenario addresses the situation when a tablet
server has not been running for some time, a bit over the
FLAGS_follower_unavailable_considered_failed_sec interval,
and then it comes back before the newly added non-voter replicas
are promoted. As a result, the original voter replicas from
the tablet server should stay, but the newly added non-voter replicas
should be evicted.
Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
---
M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
1 file changed, 95 insertions(+), 0 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/8664/1
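
The scenario described in the commit message above can be modelled roughly as follows. This is only a toy sketch under stated assumptions, not Kudu's catalog manager logic: the types `Replica` and `Config` and the function `Reconcile` are hypothetical stand-ins invented for illustration, and the `healthy` flag stands for "has not been unresponsive longer than the follower_unavailable_considered_failed_sec interval".

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT Kudu's real types.
enum class MemberType { VOTER, NON_VOTER };

struct Replica {
  std::string uuid;
  MemberType type;
  bool healthy;  // false once the replica is considered failed
};

struct Config {
  std::vector<Replica> replicas;
};

bool IsNonVoter(const Replica& r) { return r.type == MemberType::NON_VOTER; }

int CountHealthyVoters(const Config& c) {
  return static_cast<int>(std::count_if(
      c.replicas.begin(), c.replicas.end(),
      [](const Replica& r) { return r.type == MemberType::VOTER && r.healthy; }));
}

bool IsMember(const Config& c, const std::string& uuid) {
  return std::any_of(c.replicas.begin(), c.replicas.end(),
                     [&uuid](const Replica& r) { return r.uuid == uuid; });
}

// One reconciliation step of the toy model (assumed policy):
//  - too few healthy voters and no replacement yet: add a non-voter on
//    the spare server;
//  - enough healthy voters again: evict every non-voter, keeping the
//    original voters in place.
void Reconcile(Config* c, int replication_factor, const std::string& spare_uuid) {
  const int healthy_voters = CountHealthyVoters(*c);
  const bool has_non_voter =
      std::any_of(c->replicas.begin(), c->replicas.end(), IsNonVoter);
  if (healthy_voters < replication_factor && !has_non_voter) {
    c->replicas.push_back(Replica{spare_uuid, MemberType::NON_VOTER, true});
  } else if (healthy_voters >= replication_factor) {
    c->replicas.erase(
        std::remove_if(c->replicas.begin(), c->replicas.end(), IsNonVoter),
        c->replicas.end());
  }
}
```

With a replication factor of 3, marking one voter as failed leads the sketch to add a fourth, non-voter replica; marking that voter healthy again before any promotion leads it to drop the non-voter, which gives the 3-4-3 sequence the test scenario is about.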
--
To view, visit http://gerrit.cloudera.org:8080/8664
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
Patch Set 2:
(2 comments)
Looks good.
http://gerrit.cloudera.org:8080/#/c/8664/2/src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
File src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc:
http://gerrit.cloudera.org:8080/#/c/8664/2/src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc@1059
PS2, Line 1059: The catalog
: // manager should spawn non-voter replicas to replace the non-responsive
: // replicas, but as soon as the tablet server is back while the newly added
: // non-voter replicas are still copying data, the catalog manager should detect
: // the excess of replicas and evict the newly added non-voter replicas.
I think the explanation for this part is better in the commit message.
http://gerrit.cloudera.org:8080/#/c/8664/2/src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc@1144
PS2, Line 1144: NO_FATALS(cluster_->AssertNoCrashes());
Before exiting we should do something to ensure that the right replica got evicted, like:
ASSERT_OK(GetConsensusState(ts, tablet_id, kTimeout, &cstate));
ASSERT_TRUE(IsRaftConfigMember(ts_with_replica->uuid(), cstate.committed_config()));
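
To make the intent of the suggested check concrete, here is a self-contained sketch of what "the right replica got evicted" could mean. It assumes simplified stand-in types (`Peer`, `CommittedConfig`) and a hypothetical helper `CheckEvictionOutcome`; none of these are Kudu's real classes or test utilities, and the real test would use the calls quoted above instead.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; NOT Kudu's protobuf types.
struct Peer {
  std::string uuid;
  bool is_voter;
};

struct CommittedConfig {
  std::vector<Peer> peers;
};

// Returns an empty string when the outcome matches the scenario's
// expectation, otherwise a description of the first violation found:
//  - the replica on the returned tablet server must still be a voter;
//  - none of the replacement replicas may remain in the config.
std::string CheckEvictionOutcome(const CommittedConfig& config,
                                 const std::string& returned_uuid,
                                 const std::set<std::string>& added_uuids) {
  bool returned_is_voter = false;
  for (const Peer& p : config.peers) {
    if (added_uuids.count(p.uuid) > 0) {
      return "replacement replica " + p.uuid + " is still in the config";
    }
    if (p.uuid == returned_uuid && p.is_voter) {
      returned_is_voter = true;
    }
  }
  if (!returned_is_voter) {
    return "original replica " + returned_uuid + " is not a voter in the config";
  }
  return "";
}
```

Checking both conditions guards against the case where the total replica count is back to normal but the wrong member was removed.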
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 28 Nov 2017 21:24:05 +0000
Gerrit-HasComments: Yes
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
Patch Set 2:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/8664/2//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/8664/2//COMMIT_MSG@10
PS2, Line 10: The scenario addresses the situation when a tablet server has not been
: running for some time (e.g., a bit over the time interval specified by
: the 'follower_unavailable_considered_failed_sec' flag), and then it
: comes back before the newly added non-voter replicas are promoted.
: As a result, the original voter replicas from the tablet server should
: stay, but the newly added non-voter replicas should be evicted.
This is a great explanation of the test. Would you mind putting this part of the description in the test comment?
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 28 Nov 2017 21:12:43 +0000
Gerrit-HasComments: Yes
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
Patch Set 2:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/8664/2//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/8664/2//COMMIT_MSG@10
PS2, Line 10: The scenario addresses the situation when a tablet server has not been
: running for some time (e.g., a bit over the time interval specified by
: the 'follower_unavailable_considered_failed_sec' flag), and then it
: comes back before the newly added non-voter replicas are promoted.
: As a result, the original voter replicas from the tablet server should
: stay, but the newly added non-voter replicas should be evicted.
> This is a great explanation of the test. Would you mind putting this part o
Sure, why not. I'll replace the current comment with this part.
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 28 Nov 2017 21:15:19 +0000
Gerrit-HasComments: Yes
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
KUDU-1097: 'gone-and-back tablet server' test scenario
Added a new test scenario for the new 3-4-3 re-replication scheme.
The scenario addresses the situation when a tablet server has not been
running for some time (e.g., a bit over the time interval specified by
the 'follower_unavailable_considered_failed_sec' flag), and then it
comes back before the newly added non-voter replicas are promoted.
As a result, the original voter replicas from the tablet server should
stay, but the newly added non-voter replicas should be evicted.
Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Reviewed-on: http://gerrit.cloudera.org:8080/8664
Tested-by: Alexey Serbin <as...@cloudera.com>
Reviewed-by: Mike Percy <mp...@apache.org>
---
M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
1 file changed, 122 insertions(+), 3 deletions(-)
Approvals:
Alexey Serbin: Verified
Mike Percy: Looks good to me, approved
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/8664
to look at the new patch set (#3).
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
KUDU-1097: 'gone-and-back tablet server' test scenario
Added a new test scenario for the new 3-4-3 re-replication scheme.
The scenario addresses the situation when a tablet server has not been
running for some time (e.g., a bit over the time interval specified by
the 'follower_unavailable_considered_failed_sec' flag), and then it
comes back before the newly added non-voter replicas are promoted.
As a result, the original voter replicas from the tablet server should
stay, but the newly added non-voter replicas should be evicted.
Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
---
M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
1 file changed, 121 insertions(+), 3 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/8664/3
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/8664
to look at the new patch set (#2).
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
KUDU-1097: 'gone-and-back tablet server' test scenario
Added a new test scenario for the new 3-4-3 re-replication scheme.
The scenario addresses the situation when a tablet server has not been
running for some time (e.g., a bit over the time interval specified by
the 'follower_unavailable_considered_failed_sec' flag), and then it
comes back before the newly added non-voter replicas are promoted.
As a result, the original voter replicas from the tablet server should
stay, but the newly added non-voter replicas should be evicted.
Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
---
M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
1 file changed, 107 insertions(+), 3 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/8664/2
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
Patch Set 4: Verified+1
Unrelated flake in OpenReadonlyFsITest.TestWriteAndVerify due to NTP error:
F1128 22:56:33.242112 22843 master_main.cc:74] Check failed: _s.ok() Bad status: Service unavailable: Cannot initialize clock: Error reading clock. Clock considered unsynchronized
*** Check failure stack trace: ***
@ 0x7efcf4e9362d google::LogMessage::Fail() at ??:0
@ 0x7efcf4e9564c google::LogMessage::SendToLog() at ??:0
@ 0x7efcf4e93189 google::LogMessage::Flush() at ??:0
@ 0x7efcf4e95fdf google::LogMessageFatal::~LogMessageFatal() at ??:0
@ 0x404fdc kudu::master::MasterMain() at ??:0
@ 0x4053ab main at ??:0
@ 0x7efcf45c5f45 __libc_start_main at ??:0
@ 0x404a49 (unknown) at ??:0
@ (nil) (unknown)
/home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/open-readonly-fs-itest.cc:84: Failure
Failed
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 28 Nov 2017 23:17:18 +0000
Gerrit-HasComments: No
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
Patch Set 4: Code-Review+2
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 29 Nov 2017 01:15:13 +0000
Gerrit-HasComments: No
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
Removed reviewer Kudu Jenkins with the following votes:
* Verified-1 by Kudu Jenkins (120)
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/8664
to look at the new patch set (#4).
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
KUDU-1097: 'gone-and-back tablet server' test scenario
Added a new test scenario for the new 3-4-3 re-replication scheme.
The scenario addresses the situation when a tablet server has not been
running for some time (e.g., a bit over the time interval specified by
the 'follower_unavailable_considered_failed_sec' flag), and then it
comes back before the newly added non-voter replicas are promoted.
As a result, the original voter replicas from the tablet server should
stay, but the newly added non-voter replicas should be evicted.
Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
---
M src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
1 file changed, 122 insertions(+), 3 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/64/8664/4
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
[kudu-CR] KUDU-1097: 'gone-and-back tablet server' test scenario
Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8664 )
Change subject: KUDU-1097: 'gone-and-back tablet server' test scenario
......................................................................
Patch Set 2:
(2 comments)
http://gerrit.cloudera.org:8080/#/c/8664/2/src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc
File src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc:
http://gerrit.cloudera.org:8080/#/c/8664/2/src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc@1059
PS2, Line 1059: The catalog
: // manager should spawn non-voter replicas to replace the non-responsive
: // replicas, but as soon as the tablet server is back while the newly added
: // non-voter replicas are still copying data, the catalog manager should detect
: // the excess of replicas and evict the newly added non-voter replicas.
> I think the explanation for this part is better in the commit message.
Replaced.
http://gerrit.cloudera.org:8080/#/c/8664/2/src/kudu/integration-tests/raft_consensus_nonvoter-itest.cc@1144
PS2, Line 1144: NO_FATALS(cluster_->AssertNoCrashes());
> Before exiting we should do something to ensure that the right replica got
Ah, sure, thanks! I just verified that once based on the logs, but it's crucial to automate that part as well.
--
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35eb6a0c7de5bfef962b5e96857c3f9c85a1a7b0
Gerrit-Change-Number: 8664
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 28 Nov 2017 22:28:26 +0000
Gerrit-HasComments: Yes