Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2017/06/21 02:26:50 UTC

[kudu-CR] tests: IOErrors and IllegalState in TestWorkload

Andrew Wong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7243

Change subject: tests: IOErrors and IllegalState in TestWorkload
......................................................................

tests: IOErrors and IllegalState in TestWorkload

A TestWorkload has the ability to expect various errors during its run.
It can now expect IOErrors or IllegalState during a scan. This is
helpful in testing disk failure during scans.

An IOError may surface if the disk with tablet data fails in the middle
of a scan. An IllegalState may surface if the disk failed and the tablet
was failed before the scan starts.

Change-Id: I364c0ae2ac48920bcbd5b662b931ca448464c90e
---
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
2 files changed, 29 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/7243/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7243
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I364c0ae2ac48920bcbd5b662b931ca448464c90e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] tests: IOErrors and IllegalState in TestWorkload

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7243

to look at the new patch set (#4).

Change subject: tests: IOErrors and IllegalState in TestWorkload
......................................................................

tests: IOErrors and IllegalState in TestWorkload

A TestWorkload has the ability to expect various errors during its run.
It can now expect IOErrors or IllegalState during a scan. This is
helpful in testing disk failure during scans.

An IOError may surface if the disk with tablet data fails in the middle
of a scan. An IllegalState may surface if the disk failed and the tablet
was failed before the scan starts.

Change-Id: I364c0ae2ac48920bcbd5b662b931ca448464c90e
---
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/test_workload.h
2 files changed, 30 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/7243/4
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I364c0ae2ac48920bcbd5b662b931ca448464c90e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] disk failure: tests for disk failure recovery

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has abandoned this change.

Change subject: disk failure: tests for disk failure recovery
......................................................................


Abandoned

With the right error handling, this patch shouldn't be needed. Fault-tolerant scans shouldn't see errors at this level.

-- 

Gerrit-MessageType: abandon
Gerrit-Change-Id: I364c0ae2ac48920bcbd5b662b931ca448464c90e
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: tests for disk failure recovery

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7243

to look at the new patch set (#5).

Change subject: disk failure: tests for disk failure recovery
......................................................................

disk failure: tests for disk failure recovery

This patch adds an EMC test that spawns three servers and triggers EIOs
on two of them to fail two different tablets. Without proper
disk-failure handling, this scenario alone would have been enough to
leave the server with only a single copy of data, as the two servers
with EIOs would have been shut down entirely.

With proper disk-failure handling, this scenario would be salvageable,
and data would be replicated on the remaining disks. This exercises the
FlushMRS codepath.

Tests are also added to cover behavior during FlushDMS calls and scans,
ensuring that the servers return to a normal state.

Change-Id: I364c0ae2ac48920bcbd5b662b931ca448464c90e
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/disk_failure-itest.cc
2 files changed, 361 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/7243/5
-- 

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I364c0ae2ac48920bcbd5b662b931ca448464c90e
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tests: IOErrors and IllegalState in TestWorkload

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: tests: IOErrors and IllegalState in TestWorkload
......................................................................


Patch Set 4:

(5 comments)

consider squashing this into whatever patch relies on it?

http://gerrit.cloudera.org:8080/#/c/7243/4//COMMIT_MSG
Commit Message:

Line 11: helpful in testing disk failure during scans.
should we also allow it to use a fault-tolerant scan so we can end-to-end test the fault tolerance property?


PS4, Line 14: IllegalState may surface if the disk failed and the tablet
            : was failed before the scan starts
in that case shouldn't it fail over to a different replica? or are these tests running in cases with only a single replica?


http://gerrit.cloudera.org:8080/#/c/7243/4/src/kudu/integration-tests/test_workload.cc
File src/kudu/integration-tests/test_workload.cc:

Line 217:       return;
why not continue? same below
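
To make the question above concrete, here is a minimal sketch of the difference between returning and continuing on a tolerated error. It assumes the scan runs inside a loop of repeated attempts, which the quoted line alone does not confirm. ScanOutcome and CountSuccessfulScans are hypothetical names invented for illustration; they are not part of test_workload.cc.

```cpp
#include <cassert>
#include <vector>

// Hypothetical outcome of one scan attempt. This is a stand-in for
// illustration only, not Kudu's Status type.
enum class ScanOutcome { kOk, kIOError };

// Counts the scan attempts that succeed. The IOError is assumed to be an
// "allowed" error. If stop_on_allowed_error is true, the first allowed error
// ends the loop (the "return" behavior); otherwise the loop skips the failed
// attempt and keeps scanning (the "continue" behavior).
int CountSuccessfulScans(const std::vector<ScanOutcome>& outcomes,
                         bool stop_on_allowed_error) {
  int successes = 0;
  for (ScanOutcome outcome : outcomes) {
    if (outcome == ScanOutcome::kIOError) {
      if (stop_on_allowed_error) {
        return successes;  // Later attempts are never made.
      }
      continue;  // Tolerate the error and try the next scan.
    }
    ++successes;
  }
  return successes;
}
```

Under these assumptions, returning means the workload stops scanning after the first tolerated error, while continuing keeps exercising scans for the rest of the run.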


Line 219:     if (illegal_state_allowed_ && s.IsIllegalState()) {
isn't this redundant?


Line 226:       if ((io_error_allowed_ && s.IsIOError()) || (illegal_state_allowed_ && s.IsIllegalState())) {
maybe it's worth introducing a private boolean function like IsStatusAllowed(const Status& s)?
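
A rough sketch of what such a helper could look like, folding the condition quoted from Line 226 into one place. The Status class here is a simplified stand-in written for this example, not kudu::Status. ScanErrorPolicy, its setters, and CheckScanStatus are hypothetical names. Only IsStatusAllowed, io_error_allowed_, and illegal_state_allowed_ come from the review text above.

```cpp
#include <cassert>

// Simplified stand-in for a status type (an assumption for this sketch, not
// the real kudu::Status). It models only the codes discussed in the review.
class Status {
 public:
  static Status OK() { return Status(Code::kOk); }
  static Status IOError() { return Status(Code::kIOError); }
  static Status IllegalState() { return Status(Code::kIllegalState); }
  static Status TimedOut() { return Status(Code::kTimedOut); }

  bool ok() const { return code_ == Code::kOk; }
  bool IsIOError() const { return code_ == Code::kIOError; }
  bool IsIllegalState() const { return code_ == Code::kIllegalState; }

 private:
  enum class Code { kOk, kIOError, kIllegalState, kTimedOut };
  explicit Status(Code code) : code_(code) {}
  Code code_;
};

// Hypothetical, trimmed-down holder for the "allowed error" flags.
class ScanErrorPolicy {
 public:
  void set_io_error_allowed(bool allowed) { io_error_allowed_ = allowed; }
  void set_illegal_state_allowed(bool allowed) {
    illegal_state_allowed_ = allowed;
  }

  // Returns true if the scan result is either a success or a tolerated error.
  bool CheckScanStatus(const Status& s) const {
    return s.ok() || IsStatusAllowed(s);
  }

 private:
  // The single place that decides whether a non-OK status is tolerated,
  // using the same expression as the condition quoted from Line 226.
  bool IsStatusAllowed(const Status& s) const {
    return (io_error_allowed_ && s.IsIOError()) ||
           (illegal_state_allowed_ && s.IsIllegalState());
  }

  bool io_error_allowed_ = false;
  bool illegal_state_allowed_ = false;
};
```

With a helper like this, each call site checks one predicate instead of repeating the per-error conditions, and adding another tolerated error type touches only IsStatusAllowed.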


-- 

Gerrit-MessageType: comment
Gerrit-Change-Id: I364c0ae2ac48920bcbd5b662b931ca448464c90e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes