You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2017/11/21 07:27:04 UTC

[kudu-CR] Add a test for data consistency after stopping tablets

Hello Mike Percy, Andrew Wong,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/8618

to review the following change.


Change subject: Add a test for data consistency after stopping tablets
......................................................................

Add a test for data consistency after stopping tablets

Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
---
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
M src/kudu/integration-tests/stop_tablet-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/mini-cluster/internal_mini_cluster.cc
M src/kudu/tools/ksck_remote.cc
7 files changed, 77 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/8618/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/stop_tablet-itest.cc
File src/kudu/integration-tests/stop_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/stop_tablet-itest.cc@231
PS3, Line 231:  why does the client get a RemoteError in this test?
> what does the RemoteError say?
while starting up the tservers, they start their RPC server before they register the TabletService RPC service itself into the messenger. So, the client bounces around and gets ServiceUnavailable on all replicas, and then returns RemoteError instead of TimedOut



-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 22:44:09 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 3: Verified+1 Code-Review+2

Build failure was caused by KUDU-2222.


-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 00:38:24 +0000
Gerrit-HasComments: No

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/cluster_verifier.cc
File src/kudu/integration-tests/cluster_verifier.cc:

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/cluster_verifier.cc@92
PS3, Line 92: we should support
totally agree


http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/stop_tablet-itest.cc
File src/kudu/integration-tests/stop_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/stop_tablet-itest.cc@231
PS3, Line 231:  why does the client get a RemoteError in this test?
what does the RemoteError say?



-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 22:12:18 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................

Add a test for data consistency after stopping tablets

This adds a new test which starts a write workload and then shuts down all of
the servers while the writes are in-flight. The shutdown of the servers
causes the tablets to shut down, hence testing the new "abort" path for
in-flight writes.

Passed 1000/1000 in a TSAN build:
http://dist-test.cloudera.org/job?job_id=todd.1511294742.123710

Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Reviewed-on: http://gerrit.cloudera.org:8080/8618
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Andrew Wong <aw...@cloudera.com>
Reviewed-by: Mike Percy <mp...@apache.org>
---
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
M src/kudu/integration-tests/stop_tablet-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/internal_mini_cluster.cc
M src/kudu/tools/ksck_remote.cc
8 files changed, 81 insertions(+), 15 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved; Verified
  Mike Percy: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Andrew Wong, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8618

to look at the new patch set (#2).

Change subject: Add a test for data consistency after stopping tablets
......................................................................

Add a test for data consistency after stopping tablets

This adds a new test which starts a write workload and then shuts down all of
the servers while the writes are in-flight. The shutdown of the servers
causes the tablets to shut down, hence testing the new "abort" path for
in-flight writes.

Passed 1000/1000 in a TSAN build:
http://dist-test.cloudera.org/job?job_id=todd.1511294742.123710

Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
---
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
M src/kudu/integration-tests/stop_tablet-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
M src/kudu/mini-cluster/external_mini_cluster.cc
M src/kudu/mini-cluster/internal_mini_cluster.cc
M src/kudu/tools/ksck_remote.cc
8 files changed, 81 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/18/8618/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8618/1/src/kudu/integration-tests/cluster_verifier.cc
File src/kudu/integration-tests/cluster_verifier.cc:

http://gerrit.cloudera.org:8080/#/c/8618/1/src/kudu/integration-tests/cluster_verifier.cc@107
PS1, Line 107: cluster_->master_rpc_addrs()
Seems like this is empty for single-master clusters / by default. I think this should do what you want:

 for (int i = 0; i < cluster_->num_masters(); i++) {
   hp_strs.emplace_back(cluster_->master(i)->bound_rpc_hostport());
 }


http://gerrit.cloudera.org:8080/#/c/8618/1/src/kudu/tools/ksck_remote.cc
File src/kudu/tools/ksck_remote.cc:

http://gerrit.cloudera.org:8080/#/c/8618/1/src/kudu/tools/ksck_remote.cc@323
PS1, Line 323:   CHECK(!master_addresses.empty());
Seems this is breaking a bunch of test cases. Is it necessary?



-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 19:04:41 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 3: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 00:41:41 +0000
Gerrit-HasComments: No

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/stop_tablet-itest.cc
File src/kudu/integration-tests/stop_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/stop_tablet-itest.cc@249
PS3, Line 249: Shutdown
> Did you mean to Shutdown() or Stop() here? Maybe one after the other?
Oh, wait, Shutdown() calls Stop() first. Good idea with this.



-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 22:14:34 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 3: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 22:40:07 +0000
Gerrit-HasComments: No

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has removed a vote on this change.

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8618/1/src/kudu/integration-tests/cluster_verifier.cc
File src/kudu/integration-tests/cluster_verifier.cc:

http://gerrit.cloudera.org:8080/#/c/8618/1/src/kudu/integration-tests/cluster_verifier.cc@107
PS1, Line 107: cluster_->master_rpc_addrs()
> Seems like this is empty for single-master clusters / by default. I think t
yea I just fixed master_rpc_addrs() in the minicluster implementation. was waiting on your patch to get committed before posting new rev of this one


http://gerrit.cloudera.org:8080/#/c/8618/1/src/kudu/tools/ksck_remote.cc
File src/kudu/tools/ksck_remote.cc:

http://gerrit.cloudera.org:8080/#/c/8618/1/src/kudu/tools/ksck_remote.cc@323
PS1, Line 323:   CHECK(!master_addresses.empty());
> Seems this is breaking a bunch of test cases. Is it necessary?
seemed like we should make this explicit. The bug is that ExternalMiniCluster wasn't returning any addrs when we asked for them. Fixed that and now this CHECK should be OK.



-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 19:38:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add a test for data consistency after stopping tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8618 )

Change subject: Add a test for data consistency after stopping tablets
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/stop_tablet-itest.cc
File src/kudu/integration-tests/stop_tablet-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8618/3/src/kudu/integration-tests/stop_tablet-itest.cc@249
PS3, Line 249: Shutdown
Did you mean to Shutdown() or Stop() here? Maybe one after the other?



-- 
To view, visit http://gerrit.cloudera.org:8080/8618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7b3b9e90fd1fbfb555a22074fd4a777a2b2e56e2
Gerrit-Change-Number: 8618
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Andrew Wong <an...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 22:13:56 +0000
Gerrit-HasComments: Yes