You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2017/07/15 00:11:00 UTC

[kudu-CR] disk failure: replicate failed tablets

Andrew Wong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7440

Change subject: disk failure: replicate failed tablets
......................................................................

disk failure: replicate failed tablets

Tablets that are marked FAILED should eventually be replicated.
A tablet will be FAILED if it fails to bootstrap or start.

This patch also adds a new tablet state called FAILED_AND_SHUTDOWN for
cases where the tablet needs to be shutdown (e.g. to stop responding to
heartbeats).

RaftConsensusITest is updated to ensure that a FAILED tablet is actually
copied.

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
10 files changed, 52 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#14).

Change subject: KUDU-1407: reassign failed tablets
......................................................................

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there
may be an inconsistency on-disk. This has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest is added to test that
  failed tablets while running are evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
14 files changed, 160 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/14
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 13:

(1 comment)

@dralves Opted out of doing the enums thing we discussed offline. I think there are few enough cases for now for it to be ok just add case handling as needed.

http://gerrit.cloudera.org:8080/#/c/7440/12/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS12, Line 753:     replica->SetError(s);
              :     replica->Shutdown();
> I don't think you have to lose the error messages:
Ah, fair enough. Done.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7440/19/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS19, Line 646:                          replica->state() == tablet::FAILED);
> Without your tombstoned voting patch, if the replica has been shut down, co
Gotcha. OK, let's leave it and I will remove it in my tombstoned voting patch.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 19:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7440/19/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

PS19, Line 628: peers_map_
This should be protected by queue_lock_


PS19, Line 632: queue_state_.current_term
This should be protected by queue_lock_


PS19, Line 633: NotifyObserversOfFailedFollower
No need to hold queue_lock_ while calling NotifyObserversOfFailedFollower() because the thread pool it uses is set in the constructor


http://gerrit.cloudera.org:8080/#/c/7440/19/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

Line 310:       { "--master_tombstone_evicted_tablet_replicas=false" }));
nit: add a comment for why we are specifying this flag


http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS17, Line 655: 
> Failures when writing the data directory are passed off as WARN_NOT_OK() (s
ok


PS17, Line 658: 
> Ah, I see. I'll update the comment.
ok.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7440/11/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

Line 76: using tserver::TSTabletManager;
> warning: using decl 'TSTabletManager' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/7440/11/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 620:   // TODO: There's actually a race here between the check and shutdown, but
> warning: missing username/bug in TODO [google-readability-todo]
Done


Line 753:     replica->SetFailed(s);
> error: no member named 'SetFailed' in 'kudu::tablet::TabletReplica' [clang-
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 20:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7440/19/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 627: void PeerMessageQueue::NotifyPeerHasFailed(const string& peer_uuid, const string& reason) {
> consider:
Done


PS19, Line 628: l(queue_lo
> This should be protected by queue_lock_
Done


PS19, Line 632: ignored.
> This should be protected by queue_lock_
Done


PS19, Line 633: int64_t current_term = queue_st
> No need to hold queue_lock_ while calling NotifyObserversOfFailedFollower()
Ah I see, thanks for clarifying.


http://gerrit.cloudera.org:8080/#/c/7440/19/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

Line 310:   // its FAILED state.
> nit: add a comment for why we are specifying this flag
Done


http://gerrit.cloudera.org:8080/#/c/7440/19/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS19, Line 646: 
> Why this change? I don't think this is necessary or desired.
Without your tombstoned voting patch, if the replica has been shut down, consensus will be shut down.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: reassign failed tablets

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: disk failure: reassign failed tablets
......................................................................


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/client/scanner-internal.cc
File src/kudu/client/scanner-internal.cc:

PS7, Line 232: case tserver::TabletServerErrorPB::TABLET_FAILED: // fall-through
would it make more sense to have this be like: TABLET_NOT_FOUND? How do we respond to TABLET_NOT_RUNNING on the client? do we retry?
my fear is that we might eventually do that (or might even do that already in some cases like single replica tablets) but this is a non-retriable error. Alternatively actually sending a tablet failed reply might make sense too.


http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/consensus/consensus_peers.cc
File src/kudu/consensus/consensus_peers.cc:

PS7, Line 284: response_.error().code() != TabletServerErrorPB::TABLET_FAILED
maybe in this case we should directly call: NotifyObserversOfFailedFollower() or a new analog public method (like queue_->NotifyPeerHasFailed())?


http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

PS7, Line 638: // We only let special types of errors through to this point from the peer.
             :     if (response.has_error()) {
             :       // Initiate Tablet Copy on the peer if the tablet is not found.
             :       if (tserver::TabletServerErrorPB::TABLET_NOT_FOUND == response.error().code()) {
             :         peer->needs_tablet_copy = true;
             :         VLOG_WITH_PREFIX_UNLOCKED(1) << "Marked peer as needing tablet copy: "
             :                                      << peer->ToString();
             :         *more_pending = true;
             :       } else {
             :         // Ignore the response from the peer if it is failed.
             :         CHECK(tserver::TabletServerErrorPB::TABLET_FAILED == response.error().code())
             :           << SecureShortDebugString(response);
             :         VLOG_WITH_PREFIX_UNLOCKED(1) << "Ignoring response from tablet to promote eviction: "
             :                                     << peer->ToString();
             :         *more_pending = false;
             :       }
             :       return;
             :     }
see my comment on the call site


http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

PS7, Line 170: DEFINE_bool(master_tombstone_failed_tablet_replicas, true,
             :             "Whether the master should tombstone (delete) tablet replicas that "
             :             "are reporting a failed state.");
             : TAG_FLAG(master_tombstone_failed_tablet_replicas, hidden);
is this a test only thing?


http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

PS7, Line 161: the tablet will be replicated.
??


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 20: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#18).

Change subject: KUDU-1407: reassign failed tablets
......................................................................

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there is
no guarantee that the tablet's metadata reflects the deleted state. This
has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest to test that a tablet that
  is manually failed while running is evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
13 files changed, 151 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/18
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: reassign tablets due to disk failure

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#3).

Change subject: disk failure: reassign tablets due to disk failure
......................................................................

disk failure: reassign tablets due to disk failure

Tablets that are marked FAILED_AND_SHUTDOWN should eventually be
replicated, as these tablets have been shut down due to disk failure. If
a tablet is in this state and the tserver is responding to a request, no
response will be given, as to spur eviction.

This is useful, as a disk failure immediately marks a tablet as FAILED
(during which the tablet will be seen as TABLET_NOT_RUNNING) and
eventually shuts it down forcefully, leaving it in the state
FAILED_AND_SHUTDOWN (after which the tablet will be seen as
TABLET_FAILED and responses will not update the communication time).

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
6 files changed, 48 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 17: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] disk failure: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#7).

Change subject: disk failure: reassign failed tablets
......................................................................

disk failure: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted. This is an issue because failed tablets don't get
evicted and reassigned (e.g. if a tablet fails to bootstrap, it will
sit, responding to heartbeats, doing nothing else).

To remediate this, this patch changes the tserver response generated by
FAILED tablets to a new TABLET_FAILED, which is ignored by leaders to
promote eviction.

Additionally, a new tablet state is added: FAILED_AND_SHUTDOWN. Like
QUIESCING and SHUTDOWN, TabletReplica::Shutdown() can wait on
FAILED_AND_SHUTDOWN. This is useful if a failed tablet needs to be shut
down and still needs to be reassigned. Calling normal Shutdown() cannot
leave the replica in the FAILED state, and the SHUTDOWN state cannot
itself indicate the need for eviction.

Prior to this patch, tablets were set to FAILED when they failed to
delete metadata. This is no longer the case. Since error statuses during
deletion are only returned during IO to the metadata directory, and
because the metadata directory is a single point of failure, failures on
this codepath are made fatal for now. Once this is no longer the case,
these failures should be made benign, as proper error handling should
make files in the failed metadata directory unreachable. This ensures
the tablets that were meant to be deleted are not reassigned.

The test raft_consensus-itest is updated to ensure that failed tablets
are evicted and replaced. The test tablet_server-test is also updated to
ensure that, instead of TABLET_NOT_RUNNING, TABLET_FAILED is returned by
failed tablets.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
13 files changed, 119 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/7
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 19:

The jenkins failure doesn't seem to be related to these changes. This patch is still good for review.

-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 17:

This patch has been +2'd, but some of the logic around failing tablet replicas overlaps with the tombstoned voting patch. Overall, this patch shouldn't need to change much to make it compatible.

-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#19).

Change subject: KUDU-1407: reassign failed tablets
......................................................................

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there is
no guarantee that the tablet's metadata reflects the deleted state. This
has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest to test that a tablet that
  is manually failed while running is evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
13 files changed, 151 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/19
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#5).

Change subject: disk failure: reassign failed tablets
......................................................................

disk failure: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted. This is an issue because failed tablets don't get
evicted and reassigned (e.g. if a tablet fails to bootstrap, it will
sit, responding to heartbeats, doing nothing else).

To remediate this, this patch changes the tserver response generated by
FAILED tablets to a new TABLET_FAILED, which is ignored by leaders to
promote eviction.

Additionally, a new tablet state is added: FAILED_AND_SHUTDOWN.  Like
QUIESCING and SHUTDOWN, TabletReplica::Shutdown() can wait on
FAILED_AND_SHUTDOWN. This is useful if a failed tablet needs to be shut
down and still needs to be reassigned. Calling normal Shutdown() cannot
leave the replica in the FAILED state, and the SHUTDOWN state cannot
itself indicate the need for eviction.

Additionally, prior to this patch, tablets were set to FAILED when they
failed to delete metadata. This is no longer the case. Since error
statuses during deletion are only returned during IO to the metadata
directory, and because the metadata directory is a single point of
failure, failures on this codepath are made fatal for now. Once this is
no longer the case, these failures should be made benign, as proper
error handling should make files on the failed metadata directory
unreachable. This ensures the tablets that were meant to be deleted are
not reassigned.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
12 files changed, 111 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#20).

Change subject: KUDU-1407: reassign failed tablets
......................................................................

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there is
no guarantee that the tablet's metadata reflects the deleted state. This
has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest to test that a tablet that
  is manually failed while running is evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
13 files changed, 154 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/20
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7440/19/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS19, Line 646:                          replica->state() == tablet::FAILED);
Why this change? I don't think this is necessary or desired.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: disk failure: reassign failed tablets
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7440/2/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 642:         peer->needs_tablet_copy = true;
> Why Tablet Copy? Why not evict and let the master reassign the replica?
Hrm, good point. This would probably introduce a cases where a disk fails and a single server has to take on a bunch of tablet copies. Done.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#6).

Change subject: disk failure: reassign failed tablets
......................................................................

disk failure: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted. This is an issue because failed tablets don't get
evicted and reassigned (e.g. if a tablet fails to bootstrap, it will
sit, responding to heartbeats, doing nothing else).

To remediate this, this patch changes the tserver response generated by
FAILED tablets to a new TABLET_FAILED, which is ignored by leaders to
promote eviction.

Additionally, a new tablet state is added: FAILED_AND_SHUTDOWN. Like
QUIESCING and SHUTDOWN, TabletReplica::Shutdown() can wait on
FAILED_AND_SHUTDOWN. This is useful if a failed tablet needs to be shut
down and still needs to be reassigned. Calling normal Shutdown() cannot
leave the replica in the FAILED state, and the SHUTDOWN state cannot
itself indicate the need for eviction.

Prior to this patch, tablets were set to FAILED when they failed to
delete metadata. This is no longer the case. Since error statuses during
deletion are only returned during IO to the metadata directory, and
because the metadata directory is a single point of failure, failures on
this codepath are made fatal for now. Once this is no longer the case,
these failures should be made benign, as proper error handling should
make files in the failed metadata directory unreachable. This ensures
the tablets that were meant to be deleted are not reassigned.

The test raft_consensus-itest is updated to ensure that failed tablets
are evicted and replaced. The test tablet_server-test is also updated to
ensure that, instead of TABLET_NOT_RUNNING, TABLET_FAILED is returned by
failed tablets.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
12 files changed, 105 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 10:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/client/scanner-internal.cc
File src/kudu/client/scanner-internal.cc:

PS7, Line 232: case tserver::TabletServerErrorPB::TABLET_FAILED: // fall-through
> Hrm, maybe, but I'm keeping this as is for now. Reasoning here was that bef
the difference seems to be that TABLET_NOT_FOUND forces a new metadata fetch (as well as blacklisting the server) whereas TABLET_NOT_RUNNING only blacklists. not sure whether it wouldn't be best to re-fetch the metadata since the tablet would have to be forcibly re-replicated. up to you.


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

PS10, Line 618: std::
nit remove


PS10, Line 623: Use the current term to ensure the peer will be evicted.
nit: explain that otherwise this notification is ignored


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 233: state_ == QUIESCING || state_ == SHUTDOWN
doesn't this have to include FAILED too?


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

PS10, Line 176: SetupErrorAndRespond(resp->mutable_error(), s, TabletServerErrorPB::TABLET_FAILED, context);
likely slightly cleaner to do:
    TabletServerErrorPB::Code code = TabletServerErrorPB::TABLET_NOT_RUNNING;
    if (state == tablet::FAILED) {
      s = s.CloneAndAppend((*replica)->error().ToString());
      code = TabletServerErrorPB::TABLET_FAILED;
    }
    SetupErrorAndRespond(resp->mutable_error(), s, code, context);


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS10, Line 659: s.ToString();
more informative status message


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: replicate failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#2).

Change subject: disk failure: replicate failed tablets
......................................................................

disk failure: replicate failed tablets

Tablets that are marked FAILED or FAILED_AND_SHUTDOWN should eventually
be replicated. If a tablet is in either of these states and the tserver
is responding to a request, the response will be TABLET_FAILED rather
than TABLET_NOT_RUNNING. A response of TABLET_FAILED will then initiate
a tablet copy for the failed tablet.

This is useful, as a disk failure immediately marks a tablet as FAILED
and eventually shuts it down forcefully, leaving it as
FAILED_AND_SHUTDOWN.

RaftConsensusITest is updated to ensure that a FAILED tablet is actually
copied.

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
7 files changed, 26 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 12: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7440/19/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 627: void PeerMessageQueue::NotifyPeerHasFailed(const string& peer_uuid, const string& reason) {
consider:

  std::unique_lock<simple_spinlock> l(queue_lock_);
  TrackedPeer* peer = FindPtrOrNull(peers_map_, peer_uuid);
  if (peer) {
    // Use the current term to ensure the peer will be evicted, otherwise this
    // notification may be ignored.
    int64_t current_term = queue_state_.current_term;
    l.unlock();
    NotifyObserversOfFailedFollower(peer_uuid, current_term, reason);
  }


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#12).

Change subject: KUDU-1407: reassign failed tablets
......................................................................

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there
may be an inconsistency on-disk. This has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest is added to test that
  failed tablets while running are evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
14 files changed, 167 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/12
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 12: Code-Review-1

(1 comment)

I'm a little suspicious of the build failure; I'm currently running a dist-test to make sure this won't flake delete_table-itest

http://gerrit.cloudera.org:8080/#/c/7440/12/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS12, Line 753:     replica->SetError(s);
              :     replica->Shutdown();
> Should put this in a ScopedCleanup (and cancel it at the end of the functio
I'm going to keep it this way since it preserves some useful error messages. Scoped cleanup might make it a bit trickier.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/client/scanner-internal.cc
File src/kudu/client/scanner-internal.cc:

PS7, Line 232: case tserver::TabletServerErrorPB::TABLET_FAILED: // fall-through
> the difference seems to be that TABLET_NOT_FOUND forces a new metadata fetc
Ah, thanks for pointing that out. It seems like the the staleness needs to be refreshed when there is a config change. In this case, even though we know a config change is impending, it may not have happened yet, but it's a fair point that we may still want to retry.

If we assume the time between a failure and the config change (maybe a bit longer than the heartbeat interval) is shorter than the time between the client staleness check and it doing the actual metadata re-fetch, this isn't too big a deal (and still not a big deal otherwise). Went with your suggestion of TABLET_NOT_FOUND.


http://gerrit.cloudera.org:8080/#/c/7440/8/src/kudu/consensus/consensus_peers.cc
File src/kudu/consensus/consensus_peers.cc:

Line 291:     return;
> warning: do not use 'else' after 'return' [readability-else-after-return]
Done


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

PS10, Line 618: ::str
> nit remove
Done


PS10, Line 623: 
> nit: explain that otherwise this notification is ignored
Done


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

PS10, Line 233: turn;
> doesn't this have to include FAILED too?
Done


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

PS10, Line 176: SetupErrorAndRespond(resp->mutable_error(), s, TabletServerErrorPB::TABLET_NOT_RUNNING,
> likely slightly cleaner to do:
Done


http://gerrit.cloudera.org:8080/#/c/7440/10/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS10, Line 659: usMessage("De
> more informative status message
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7440/12/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS12, Line 753:     replica->SetError(s);
              :     replica->Shutdown();
> I'm going to keep it this way since it preserves some useful error messages
I don't think you have to lose the error messages:

  Status s;
  auto cleanup = MakeScopedCleanup([&]() {
    replica->SetError(s);
    replica->Shutdown();
  }

  s = cmeta_manager_->Load(...);
  if (!s.ok()) {
    LOG(ERROR) << ...;
    return;
  }

This way you get a customized log message per error site, but you centralize the cleanup in one place.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 17:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7440/17//COMMIT_MSG
Commit Message:

PS17, Line 30: is added 
> this is repeated
Done


PS17, Line 31: failed tablets while running
> tablets that fail while running (due to what?)
Done


http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 629:     NotifyObserversOfFailedFollower(peer_uuid, current_term, reason);
> nit: No need to hold the lock while calling this method.
Done


http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 170: DEFINE_bool(master_tombstone_failed_tablet_replicas, true,
> Should be removed per below. See master_tombstone_evicted_tablet_replica
Done


PS17, Line 2473:     if (FLAGS_master_tombstone_failed_tablet_replicas) {
               :       SendDeleteReplicaRequest(report.tablet_id(), TABLET_DATA_TOMBSTONED,
               :                                boost::none,
               :                                tablet->table(), ts_desc->permanent_uuid(),
               :                                Substitute("Tablet failed: $0", s.ToString()));
               :     }
> Is this required? The leader will now evict a failed follower because of th
I think you're right; when the leader sees the failed tablet, it should evict and config change, and then report to the master.


http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS17, Line 655: metadata
> Couldn't this simply happen if one of the data disks failed?
Failures when writing the data directory are passed off as WARN_NOT_OK() (see TabletMetadata::DeleteOrphanedBlocks), since the blocks can always be removed in the future (eg when we next startup).


PS17, Line 658: is unclear
> Shouldn't the contract of DeleteTabletData() be a crash-consistent one? In 
Ah, I see. I'll update the comment.


Line 752:   auto fail_tablet = MakeScopedCleanup([&]() {
> I like this approach.
It is quite clean indeed!

Credit to Adar for the suggestion.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#4).

Change subject: disk failure: reassign failed tablets
......................................................................

disk failure: reassign failed tablets

Tablets are marked tablet::FAILED in a number of places (e.g. failing to
bootstrap) and will be marked tablet::FAILED_AND_SHUTDOWN upon disk
failure.

Currently, a tablet in either state is left alone and will respond to
heartbeats with TABLET_NOT_RUNNING messages, which indicate it is
responsive, but the tablet itself is not used for anything.

This patch changes this behavior to ensure that tablets in either state
will respond with the new TABLET_FAILED response that does not indicate
the tablet is responsive, promoting eviction.

Additionally, prior to this patch, tablets were set to FAILED when they
failed to delete metadata. This is no longer the case. Since error
statuses during deletion are only returned during IO to the metadata
directory, and because the metadata directory is a single point of
failure, failures in this codepath are made fatal for now. Once this is
no longer the case, these failures should be made benign, as proper
error handling should make files on the failed metadata directory
unreachable. This ensures the tablets that were meant to be deleted are
not reassigned.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
8 files changed, 76 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#8).

Change subject: disk failure: reassign failed tablets
......................................................................

disk failure: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted. This is an issue because failed tablets don't get
evicted and reassigned (e.g. if a tablet fails to bootstrap, it will
sit, responding to heartbeats, doing nothing else).

To remediate this, this patch changes the tserver response generated by
FAILED tablets to a new TABLET_FAILED state, on which a leader will
immediately evict the peer.

Additionally, a new tablet state is added: FAILED_AND_SHUTDOWN. Like
QUIESCING and SHUTDOWN, TabletReplica::Shutdown() can wait on
FAILED_AND_SHUTDOWN. This is useful if a failed tablet needs to be shut
down and still needs to be reassigned. Calling normal Shutdown() cannot
leave the replica in the FAILED state, and the SHUTDOWN state cannot
itself indicate the need for eviction.

Prior to this patch, tablets were set to FAILED when they failed to
delete metadata. This is no longer the case. Since error statuses during
deletion are only returned during IO to the metadata directory, and
because the metadata directory is a single point of failure, failures on
this codepath are made fatal for now. Once this is no longer the case,
these failures should be made benign, as proper error handling should
make files in the failed metadata directory unreachable. This ensures
the tablets that were meant to be deleted are not reassigned.

The test raft_consensus-itest is updated to ensure that failed tablets
are evicted and replaced. The test tablet_server-test is also updated to
ensure that, instead of TABLET_NOT_RUNNING, TABLET_FAILED is returned by
failed tablets.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
14 files changed, 131 insertions(+), 65 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/8
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 15:

Just rebased this in my series of start-time failure-handling patches, since it's necessary for https://gerrit.cloudera.org/#/c/7766/ to reassign tablets that fail to bootstrap.

Currently still going through the tombstoned voting patch to see how this needs to be updated.

-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#13).

Change subject: KUDU-1407: reassign failed tablets
......................................................................

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there
may be an inconsistency on-disk. This has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest is added to test that
  failed tablets while running are evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
14 files changed, 162 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/13
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: disk failure: reassign failed tablets
......................................................................


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/client/scanner-internal.cc
File src/kudu/client/scanner-internal.cc:

PS7, Line 232: case tserver::TabletServerErrorPB::TABLET_FAILED: // fall-through
> would it make more sense to have this be like: TABLET_NOT_FOUND? How do we 
Hrm, maybe, but I'm keeping this as is for now. Reasoning here was that before when a tablet was in the FAILED state, we would treat it as TABLET_NOT_RUNNING. I'm looking in client/scanner-internal.cc and it seems like we blacklist the location for TNR (if there's somewhere else I should be looking, please let me know).

I'm not sure it makes sense to retry on TNR. I suppose it could retry if the tablet were NOT_STARTED or BOOTSTRAPPING, but tablets in QUIESCING and SHUTDOWN are also considered NOT_RUNNING.


http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/consensus/consensus_peers.cc
File src/kudu/consensus/consensus_peers.cc:

PS7, Line 284: sponse_.error().code() == TabletServerErrorPB::TABLET_FAILED) 
> maybe in this case we should directly call: NotifyObserversOfFailedFollower
Done.


http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

PS7, Line 638: // Initiate Tablet Copy on the peer if the tablet is not found.
             :     if (response.has_error()) {
             :       CHECK_EQ(tserver::TabletServerErrorPB::TABLET_NOT_FOUND, response.error().code());
             :       peer->needs_tablet_copy = true;
             :       VLOG_WITH_PREFIX_UNLOCKED(1) << "Marked peer as needing tablet copy: "
             :                                     << peer->ToString();
             :       *more_pending = true;
             :       return;
             :     }
             : 
             :     // Sanity checks.
             :     // Some of these can be eventually removed, but they are handy for now.
             :     DCHECK(response.status().IsInitialized())
             :         << "Error: Uninitialized: " << response.InitializationErrorString()
             :         << ". Response: "<< SecureShortDebugString(response);
             :     // TODO: Include uuid in error messages as well.
             :     DCHECK(response.has_responder_uuid() && !response.responder_uuid().empty())
             :      
> see my comment on the call site
Done


http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

PS7, Line 170: DEFINE_bool(master_tombstone_failed_tablet_replicas, true,
             :             "Whether the master should tombstone (delete) tablet replicas that "
             :             "are reporting a failed state. Only for testing!");
             : TAG_FLAG(master_tombstone_failed_tablet_replicas, hidden);
> is this a test only thing?
As of now, yes. Will update to make that clear.


http://gerrit.cloudera.org:8080/#/c/7440/7/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

PS7, Line 161: the tablet will be evicted and
> ??
Should be evicted and replaced.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has submitted this change and it was merged.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there is
no guarantee that the tablet's metadata reflects the deleted state. This
has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest to test that a tablet that
  is manually failed while running is evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Reviewed-on: http://gerrit.cloudera.org:8080/7440
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <mp...@apache.org>
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
13 files changed, 154 insertions(+), 58 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: replicate failed tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: disk failure: replicate failed tablets
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7440/2/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 642:             tserver::TabletServerErrorPB::TABLET_FAILED == response.error().code())
Why Tablet Copy? Why not evict and let the master reassign the replica?


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 17:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7440/17//COMMIT_MSG
Commit Message:

PS17, Line 30: is added 
this is repeated


PS17, Line 31: failed tablets while running
tablets that fail while running (due to what?)


http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 629:     NotifyObserversOfFailedFollower(peer_uuid, current_term, reason);
nit: No need to hold the lock while calling this method.


http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

Line 170: DEFINE_bool(master_tombstone_failed_tablet_replicas, true,
Should be removed per below. See master_tombstone_evicted_tablet_replica


PS17, Line 2473:     if (FLAGS_master_tombstone_failed_tablet_replicas) {
               :       SendDeleteReplicaRequest(report.tablet_id(), TABLET_DATA_TOMBSTONED,
               :                                boost::none,
               :                                tablet->table(), ts_desc->permanent_uuid(),
               :                                Substitute("Tablet failed: $0", s.ToString()));
               :     }
Is this required? The leader will now evict a failed follower because of the changes in the queue in this patch. Once that eviction is committed as a new config change, the master should find out and automatically delete this replica that is part of a stale config (in a safe way that passes in cas_config_opid_index_less_or_equal). See FLAGS_master_tombstone_evicted_tablet_replicas usage in this file.


http://gerrit.cloudera.org:8080/#/c/7440/17/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS17, Line 655: metadata
Couldn't this simply happen if one of the data disks failed?


PS17, Line 658: is unclear
Shouldn't the contract of DeleteTabletData() be a crash-consistent one? In fact, I think it is (perhaps not well documented) from the perspective of the order in which we delete things. It's extensively tested in ts_recovery-itest.


Line 752:   auto fail_tablet = MakeScopedCleanup([&]() {
I like this approach.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#9).

Change subject: KUDU-1407: reassign failed tablets
......................................................................

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there
may be an inconsistency on-disk. This has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest is added to test that
  failed tablets while running are evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
14 files changed, 160 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/9
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1407: reassign failed tablets
......................................................................


Patch Set 12:

(1 comment)

Feel free to ignore my feedback if you weren't going to generate a new patch since it's minor.

http://gerrit.cloudera.org:8080/#/c/7440/12/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

PS12, Line 753:     replica->SetError(s);
              :     replica->Shutdown();
Should put this in a ScopedCleanup (and cancel it at the end of the function) so you don't have to repeat it over and over.


-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1407: reassign failed tablets

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7440

to look at the new patch set (#11).

Change subject: KUDU-1407: reassign failed tablets
......................................................................

KUDU-1407: reassign failed tablets

Tablets put into the state tablet::FAILED are left until they are
manually deleted; they are not evicted and reassigned. If a tablet
fails to bootstrap, it will sit, responding to heartbeats, doing nothing
else. This patch ensures failed tablets will be reassigned.

As the tablets are not used, rather than directly setting replicas to
FAILED, an error is first recorded and the TabletReplica::Shutdown(), leaving
the final state as FAILED. A replica can no longer leave the FAILED
state (calls to Shutdown() leave it FAILED). The tserver response
generated by FAILED tablets is now TABLET_FAILED. Upon receiving this, a
leader will immediately evict the peer.

Prior to this patch, a tablet was marked FAILED if its WAL or metadata
failed to delete (after already shutting down). If this occurs, there
may be an inconsistency on-disk. This has been made fatal.

Testing is done in a few places:
- raft_consensus-itest is updated to ensure that tablets that fail to
  bootstrap are evicted and replaced.
- tablet_server-test is also updated to ensure that, instead of
  TABLET_NOT_RUNNING, TABLET_FAILED is returned by failed tablets.
- a test is added to ts_tablet_manager-itest is added to test that
  failed tablets while running are evicted and replaced.

This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit

Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
14 files changed, 161 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/11
-- 
To view, visit http://gerrit.cloudera.org:8080/7440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>