You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2017/05/31 18:19:15 UTC

[kudu-CR] disk failure: handle EIOs on I/O to blocks

Andrew Wong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7030

Change subject: disk failure: handle EIOs on I/O to blocks
......................................................................

disk failure: handle EIOs on I/O to blocks

This patch adds the proper macros around operations that touch disk that
occur when writing or reading blocks.

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/util/status.h
5 files changed, 45 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................


Patch Set 9:

(23 comments)

I focused on the block manager and some MM ops. You should definitely get a review from Todd/David/Mike on the transaction changes, the tablet read-path changes, and the consensus changes.

http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 141:   virtual void HandleError(const Status& s, DataDir* dir) const = 0;
Why does HandleError() need to be part of the external API to WritableBlock and ReadableBlock? IIUC the macros correctly, it should be sufficient to expose them in LogReadableBlock/LogWritableBlock as non-virtual methods.


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/error_manager.h
File src/kudu/fs/error_manager.h:

Line 45:   RETURN_NOT_OK(s_); \
Wrong patch, but shouldn't this be just 'return s'? We've already established that s_ isn't OK.


Line 56:   RETURN_NOT_OK(s_); \
Same.


Line 61: #define HANDLE_DISK_FAILURE(status_expr, err_handler) do { \
Since this is so similar to RETURN_NOT_OK_HANDLE would prefer if the macro names were a little more similar. Maybe RETURN_NOT_OK_HANDLE_DISK_FAILURE and just HANDLE_DISK_FAILURE?


Line 64:       (err_handler); \
Nit: indentation


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 370:     if (PREDICT_FALSE(IsDiskFailure(sync))) {
Shouldn't these be HANDLE_DISK_FAILURE calls instead?


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 399:   void HandleError(const Status& s, DataDir* dir) const;
This isn't an accessor; bring it up into the other section and doc it.


Line 402:   friend class LogWritableBlock;
Don't need? HandleError() is public after all.


Line 498: Status LogBlockContainer::Create(LogBlockManager* block_manager,
What about disk failures during container creation?


Line 604:   // Open the existing metadata and data files for writing.
What about failures here?


Line 648:   RETURN_NOT_OK(block_manager()->env()->NewRandomAccessFile(metadata_path, &metadata_reader));
What about failures here?


Line 656:     read_status = pb_reader.ReadNextPB(&record);
And here?


Line 858:   return metadata_file_->Append(pb);
Failures here?


Line 881:     return metadata_file_->Sync();
What about this one?


Line 886: Status LogBlockContainer::ReopenMetadataWriter() {
And failures here?


Line 1090:   DataDir* mutable_data_dir() {
Where is this used?


Line 1305:   DataDir* mutable_data_dir() {
And this?


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

Line 696:   Status s = FlushDMS(old_dms.get(), &dfr, flush_type);
Shouldn't we just RETURN_NOT_OK here and drop the CHECK? Isn't the expectation that the MM op will figure out whether the failure is fatal or not? Why do we also have to consider it here?


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

Line 160:   // The Tablet has been stopped after a failure. In this state, like FAILED,
Why do we need a new state for this? Why isn't FAILED sufficient?


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/tablet/tablet_replica_mm_ops.cc
File src/kudu/tablet/tablet_replica_mm_ops.cc:

Line 138:     tablet_replica_->tablet()->rowsets_flush_sem_.unlock();
Can you use an RAII-style guard for the acquisition forof rowsets_flush_sem_?


Line 196:                             max_idx_to_replay_size);
Nit: indentation.


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

Line 97:     // The tablet is in a failed state.
TABLET_NOT_RUNNING is insufficiently descriptive?


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/util/status.h
File src/kudu/util/status.h:

Line 433: inline bool IsDiskFailure(const Status& s) {
Would be nice to add a comment to each explaining why it's included.


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 21: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#8).

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................

WIP disk failure: coordinate disk failure handling

This patch adds the logic required to prevent a crash on disk failure.

Disk failure handling happens in a few places:
- block/container-level functions that call env functions that may result in
disk failure can run callbacks to fail/shutdown tablets in the parent data dir
- tablet-level functions that CHECK for failures are ended early if the tablet
is known to have data on a bad disk
- transactions can now be canceled to force a shutdown of a tablet replica
- tablets in FAILED or the new FAILED_AND_SHUTDOWN state will trigger
replication

A set of basic tests are added in ts_disk_failure-itest.

TODO:
- crash if tablet metadata dir is bad
- don't crash if a disk can't be read at startup and we try to
  CheckIntegrity, but it's missing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs-test-util.h
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/alter_schema_transaction.h
M src/kudu/tablet/transactions/transaction.h
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/transaction_tracker-test.cc
M src/kudu/tablet/transactions/transaction_tracker.cc
M src/kudu/tablet/transactions/transaction_tracker.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tablet/transactions/write_transaction.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
A src/kudu/tserver/ts_disk_failure-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
M src/kudu/util/status.h
35 files changed, 487 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/8
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 20: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#16).

Change subject: disk failure: error-handling macros in blocks
......................................................................

disk failure: error-handling macros in blocks

This patch adds macros to trigger error-handling code in the event of
disk failure during block IO. Block- and container-level functions that
call env or cache functions (e.g. File::Append, FileCache::DeleteFile)
that may result in disk failure can now run callbacks that, in the
future, will fail and shutdown tablets in the failed directory,
preventing further IO to the disk.

Disk failures are indicated by one of EIO, ENODEV, ENXIO, and EROFS,
although this list can be extended as needed in the future.

As failure-handling relies on the addition of a handful of other
features, the meat of the error-handling callback will come in a later
patch. This patch only serves to cover block IO calls with macros.
Failures elswhere (e.g. when reading instance or tablet metadata) will
also be handled in separate patches.

This is part of a series of patches to handle disk failure. See
section 2.2 of:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/util/status.h
5 files changed, 194 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/16
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#18).

Change subject: disk failure: error-handling macros in blocks
......................................................................

disk failure: error-handling macros in blocks

This patch adds macros to trigger error-handling code in the event of
disk failure during block IO. Block- and container-level functions that
call env or cache functions (e.g. File::Append, FileCache::DeleteFile)
that may result in disk failure can now run callbacks that, in the
future, will fail and shutdown tablets in the failed directory,
preventing further IO to the disk.

Disk failures are indicated by one of EIO, ENODEV, ENXIO, and EROFS,
although this list can be extended as needed in the future.

As failure-handling relies on the addition of a handful of other
features, the meat of the error-handling callback will come in a later
patch. This patch only serves to surround block IO calls with macros.
Failures elswhere (e.g. when reading instance or tablet metadata) will
be handled in separate patches.

This is part of a series of patches to handle disk failure. See
section 2.2 of:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/status.h
5 files changed, 248 insertions(+), 107 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/18
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP disk failure: handle disk failures in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#13).

Change subject: WIP disk failure: handle disk failures in blocks
......................................................................

WIP disk failure: handle disk failures in blocks

This patch adds the logic required to prevent a crash on disk failure
during I/O to blocks.

Block- and container-level functions that call env functions (e.g.
ReadV)  that may result in disk failure can now run callbacks to fail
and shutdown tablets in the failed directory.

Failures elswhere (e.g. when reading instance or tablet metadata) will
be handled in separate patches.

WIP: As failure handling relies on the addition of a handful of other
features, end-to-end testing of handling disk failure will come in a
later patch.

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/data_dirs.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/tablet/mvcc.cc
M src/kudu/tserver/tablet_server.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/util/status.h
9 files changed, 252 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/13
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#17).

Change subject: disk failure: error-handling macros in blocks
......................................................................

disk failure: error-handling macros in blocks

This patch adds macros to trigger error-handling code in the event of
disk failure during block IO. Block- and container-level functions that
call env or cache functions (e.g. File::Append, FileCache::DeleteFile)
that may result in disk failure can now run callbacks that, in the
future, will fail and shutdown tablets in the failed directory,
preventing further IO to the disk.

Disk failures are indicated by one of EIO, ENODEV, ENXIO, and EROFS,
although this list can be extended as needed in the future.

As failure-handling relies on the addition of a handful of other
features, the meat of the error-handling callback will come in a later
patch. This patch only serves to surround block IO calls with macros.
Failures elswhere (e.g. when reading instance or tablet metadata) will
be handled in separate patches.

This is part of a series of patches to handle disk failure. See
section 2.2 of:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/util/status.h
5 files changed, 191 insertions(+), 67 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/17
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP disk failure: handle disk failures in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#14).

Change subject: WIP disk failure: handle disk failures in blocks
......................................................................

WIP disk failure: handle disk failures in blocks

This patch adds the logic required to prevent a crash on disk failure
during I/O to blocks.

Block- and container-level functions that call env functions (e.g.
ReadV)  that may result in disk failure can now run callbacks to fail
and shutdown tablets in the failed directory.

Failures elswhere (e.g. when reading instance or tablet metadata) will
be handled in separate patches.

WIP: As failure handling relies on the addition of a handful of other
features, end-to-end testing of handling disk failure will come in a
later patch.

This is part of a series of patches to handle disk failure. See
section 2.2 of:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/data_dirs.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/tablet/mvcc.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/util/status.h
8 files changed, 239 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/14
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 17:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7030/17/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

PS17, Line 534: Status sync = env_->SyncDir(s);
              :       RETURN_NOT_OK_HANDLE_DISK_FAILURE(sync,
              :           error_manager_->RunErrorNotificationCb(location.data_dir()));
Combine?

  RETURN_NOT_OK_HANDLE_DISK_FAILURE(env_->SyncDir(s),
          error_manager_->RunErrorNotificationCb(location.data_dir()));


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1095:   }
> HandleError() is called any place with RETURN_NOT_OK_HANDLE_ERROR(). You're
Right but I don't see any RETURN_NOT_OK_HANDLE_ERROR calls in LogWritableBlock functions. AFAICT, all the macro sites will resolve to other HandleError() instances, such as the one in LogBlockContainer.

Same goes for LogReadableBlock::HandleError.


Line 1745: }
> This still indicates that part of the tablet's data is probably lost due to
I see. I think the failure handling policy for a tablet is out of scope for the block managers (and thus they should operate in a way that's most predictable), but there's no point to implementing functionality that's never used.

Perhaps add a comment here (and in the equivalent place in FileBlockManager) explaining that we _could_ satisfy the block on a different data dir, but there's currently no point in doing that?


http://gerrit.cloudera.org:8080/#/c/7030/17/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1304:   void HandleError(const Status& s) const {
Also unnecessar


Line 2336:     RETURN_NOT_OK_PREPEND(env_->SyncDir(dir->dir()),
What about this one?


Line 2345: Status LogBlockManager::RewriteMetadataFile(const string& metadata_file_name,
And the calls in here?


http://gerrit.cloudera.org:8080/#/c/7030/17/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

There are still some changes to this file that don't belong in this patch. They're minor, but we can avoid some confusion if we move them.


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged.

Change subject: disk failure: error-handling macros in blocks
......................................................................


disk failure: error-handling macros in blocks

This patch adds macros to trigger error-handling code in the event of
disk failure during block IO. Block- and container-level functions that
call env or cache functions (e.g. File::Append, FileCache::DeleteFile)
that may result in disk failure can now run callbacks that, in the
future, will fail and shutdown tablets in the failed directory,
preventing further IO to the disk.

Disk failures are indicated by one of EIO, ENODEV, ENXIO, and EROFS,
although this list can be extended as needed in the future.

As failure-handling relies on the addition of a handful of other
features, the meat of the error-handling callback will come in a later
patch. This patch only serves to surround block IO calls with macros.
Failures elswhere (e.g. when reading instance or tablet metadata) will
be handled in separate patches.

This is part of a series of patches to handle disk failure. See
section 2.2 of:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Reviewed-on: http://gerrit.cloudera.org:8080/7030
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/fs/block_manager.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/status.h
7 files changed, 257 insertions(+), 112 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, but someone else must approve
  Adar Dembo: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP disk failure: handle disk failures in blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: WIP disk failure: handle disk failures in blocks
......................................................................


Patch Set 14:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

Line 268:   DataDir* FindDataDirByUuidIndex(uint16_t uuid_idx)const;
Mistake?


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/error_manager.h
File src/kudu/fs/error_manager.h:

Line 56:   RETURN_NOT_OK(s_); \
> Done
Not done? Should be 'return s_'.


Line 61: #define HANDLE_DISK_FAILURE(status_expr, err_handler) do { \
> I would agree, but RETURN_NOT_OK_HANDLE and RETURN_NOT_OK_HANDLE_ERROR are 
Oh, yes, I see what you mean: HANDLE_DISK_FAILURE only considers actual disk failures, while RETURN_NOT_OK_HANDLE considers any non-OK status.

In that case, could you revert the name back to RETURN_NOT_OK_HANDLE (or RETURN_NOT_OK_RUN_HANDLER)? I agree that RETURN_NOT_OK_HANDLE_DISK_FAILURE is an inappropriate name.


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/error_manager.h
File src/kudu/fs/error_manager.h:

PS14, Line 48: if it results in a disk
             : // failure
The handler is run on any failure, not just disk failures.


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 238:   virtual void HandleError(const Status& s) const;
Shouldn't be virtual as it's not participating in the class hierarchy at all.


Line 240:   DataDir* mutable_data_dir() const {
Just combine into HandleError().


Line 386:   return !close.ok() ? close : sync;
Maybe do a single HandleError() down here instead?


Line 417:   virtual void HandleError(const Status& s) const;
Not virtual.


Line 419:   DataDir* mutable_data_dir() const {
This is only called by HandleError(); how about moving its implementation into HandlError() itself?


Line 482:   return reader_->Size(sz);
Can't this fail with an IO error too?


Line 539:       if (PREDICT_FALSE(IsDiskFailure(env_->SyncDir(s)))) {
We're no longer returning early if one SyncDir() fails. I don't necessarily disagree with that change, just wanted to make sure you made it intentionally.


Line 649:     if (PREDICT_FALSE(IsDiskFailure(s))) {
For this failure (and/or the next one), should we try a new data directory instead of returning the error?


Line 671:   if (PREDICT_FALSE(IsDiskFailure(s))) {
Shouldn't this (and the above) use HANDLE_DISK_FAILURE?


Line 700:     return Status::OK();
Shouldn't we return an actual error though? If we return OK, the tablet flush code will remove this block from the list of deleted blocks and won't retry its deletion, orphaning it.


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 498: }
> Since this is a static method, the handling wraps the Create() method. With
What's missing? Open() is also a static method and you've managed to insert the failure handling into it.

Or, alternatively, if that results in too many handling sites, perhaps move the handling so it wraps Open() too? Then they're at least consistent.


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS14, Line 682:   // Handle any other errors, e.g. disk failures.
              :   HandleError(read_status);
Nit: move this under the "If we've made it here..." comment, since it applies to the HandleError() call too.


Line 794:   RETURN_NOT_OK(block_manager()->SyncContainer(*this));
RETURN_NOT_OK_HANDLE_ERROR?

Or, alternatively, add failure handling within SyncContainer() itself.


Line 1095:   virtual void HandleError(const Status& s) const {
Shouldn't be virtual.

On second look, I'm not seeing where HandleError() or mutable_data_dir() are called.


Line 1253:       RETURN_NOT_OK_PREPEND(AppendMetadata(), "Unable to flush block during close");
This is wrong, look at the comment on L1234-1236.


Line 1309:   virtual void HandleError(const Status& s) const {
Not virtual.


PS14, Line 1309:   virtual void HandleError(const Status& s) const {
               :     container_->HandleError(s);
               :   }
               : 
               :   DataDir* mutable_data_dir() {
               :     return container_->mutable_data_dir();
               :   }
Same; where is this used?


Line 1517:       bool is_on_ext4;
Bad conflict resolution here.


Line 1745:   RETURN_NOT_OK_PREPEND(s, "Could not create new log block container at " + dir->dir());
Hmm, but we could try a new data directory right?


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/tablet/mvcc.cc
File src/kudu/tablet/mvcc.cc:

Line 93:   CHECK_EQ(RESERVED, old_state) << "transaction with timestamp " << timestamp.ToString()
Belongs in the patch dealing with transactions?


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 1094:     LOG(WARNING) << Substitute("Tablet $0 is located on an unhealthy data dir.", tablet_id);
Likewise, the changes to this file don't seem like they belong in this patch.


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/util/status.h
File src/kudu/util/status.h:

PS14, Line 466: As to not
              : // break ABI-compatability of the Status class, rather than changing the Status
              : // class, we use this function to determine disk failures.
Adding a new member function to Status doesn't break ABI compatibility, so if you'd rather do that, go for it.


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 18: Code-Review+2 Verified+1

Overriding Jenkins, looks like another Java client deadlock (which we've been seeing quite a few of recently)

-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 21: Code-Review+1

Seems like another java client deadlock.

-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] disk failure: handle EIOs on I/O to blocks

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: disk failure: handle EIOs on I/O to blocks
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7030/3/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS3, Line 628: KUDU_RETURN_OR_HANDLE_EIO
I like this version of the macro (without the second call to HandleEIO()) . Maybe drop the KUDU_ prefix (or add an alias)
How about:
RETURN_NOT_OK_HANDLE_EIO()


http://gerrit.cloudera.org:8080/#/c/7030/3/src/kudu/util/status.h
File src/kudu/util/status.h:

PS3, Line 104: breif
typo


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#19).

Change subject: disk failure: error-handling macros in blocks
......................................................................

disk failure: error-handling macros in blocks

This patch adds macros to trigger error-handling code in the event of
disk failure during block IO. Block- and container-level functions that
call env or cache functions (e.g. File::Append, FileCache::DeleteFile)
that may result in disk failure can now run callbacks that, in the
future, will fail and shutdown tablets in the failed directory,
preventing further IO to the disk.

Disk failures are indicated by one of EIO, ENODEV, ENXIO, and EROFS,
although this list can be extended as needed in the future.

As failure-handling relies on the addition of a handful of other
features, the meat of the error-handling callback will come in a later
patch. This patch only serves to surround block IO calls with macros.
Failures elswhere (e.g. when reading instance or tablet metadata) will
be handled in separate patches.

This is part of a series of patches to handle disk failure. See
section 2.2 of:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/block_manager.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/status.h
7 files changed, 258 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/19
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#15).

Change subject: WIP disk failure: error-handling macros in blocks
......................................................................

WIP disk failure: error-handling macros in blocks

This patch adds macros to trigger error-handling code in the event of
disk failure during block IO. Block- and container-level functions that
call env or cache functions (e.g. File::Append, FileCache::DeleteFile)
that may result in disk failure can now run callbacks that, in the
future, will fail and shutdown tablets in the failed directory,
preventing further IO to the disk.

Failures elswhere (e.g. when reading instance or tablet metadata) will
be handled in separate patches.

Disk failures are indicated by one of EIO, ENODEV, ENXIO, and EROFS,
although this list can be extended as needed in the future.

WIP: As failure handling relies on the addition of a handful of other
features, end-to-end testing of handling disk failure will come in a
later patch. This patch only serves to cover calls with macros.

This is part of a series of patches to handle disk failure. See
section 2.2 of:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/util/status.h
5 files changed, 196 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/15
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#4).

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................

WIP disk failure: coordinate disk failure handling

This patch adds the logic required to prevent failure on disk failure.
Checks that previously depended on successful disk IO now yield if the
returned status' POSIX code matches one corresponding to disk failure.

For the most part, failure handling is done by the lowest abstraction to
touch disk: blocks and containers.

A set of tests are added in ts_disk_failure-itest.

TODO:
- crash if tablet metadata dir is bad
- don't crash if a disk can't be read at startup and we try to
  CheckIntegrity, but it's missing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/block_manager.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs-test-util.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tserver/CMakeLists.txt
A src/kudu/tserver/ts_disk_failure-test.cc
14 files changed, 327 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#20).

Change subject: disk failure: error-handling macros in blocks
......................................................................

disk failure: error-handling macros in blocks

This patch adds macros to trigger error-handling code in the event of
disk failure during block IO. Block- and container-level functions that
call env or cache functions (e.g. File::Append, FileCache::DeleteFile)
that may result in disk failure can now run callbacks that, in the
future, will fail and shutdown tablets in the failed directory,
preventing further IO to the disk.

Disk failures are indicated by one of EIO, ENODEV, ENXIO, and EROFS,
although this list can be extended as needed in the future.

As failure-handling relies on the addition of a handful of other
features, the meat of the error-handling callback will come in a later
patch. This patch only serves to surround block IO calls with macros.
Failures elswhere (e.g. when reading instance or tablet metadata) will
be handled in separate patches.

This is part of a series of patches to handle disk failure. See
section 2.2 of:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit?usp=sharing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/fs/block_manager.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/util/status.h
7 files changed, 257 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/20
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................


Patch Set 9:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 141:   virtual void HandleError(const Status& s, DataDir* dir) const = 0;
> Why does HandleError() need to be part of the external API to WritableBlock
Done


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/error_manager.h
File src/kudu/fs/error_manager.h:

Line 45:   RETURN_NOT_OK(s_); \
> Wrong patch, but shouldn't this be just 'return s'? We've already establish
Done


Line 56:   RETURN_NOT_OK(s_); \
> Same.
Done


Line 61: #define HANDLE_DISK_FAILURE(status_expr, err_handler) do { \
> Since this is so similar to RETURN_NOT_OK_HANDLE would prefer if the macro 
I would agree, but RETURN_NOT_OK_HANDLE and RETURN_NOT_OK_HANDLE_ERROR are both non-specific to disk failures.


Line 64:       (err_handler); \
> Nit: indentation
Done


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 370:     if (PREDICT_FALSE(IsDiskFailure(sync))) {
> Shouldn't these be HANDLE_DISK_FAILURE calls instead?
HandleError() should be sufficient (without the PREDICT_FALSE here)


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 399:   void HandleError(const Status& s, DataDir* dir) const;
> This isn't an accessor; bring it up into the other section and doc it.
Done


Line 402:   friend class LogWritableBlock;
> Don't need? HandleError() is public after all.
Done


Line 498: Status LogBlockContainer::Create(LogBlockManager* block_manager,
> What about disk failures during container creation?
Since this is a static method, the handling wraps the Create() method. With the right input to Create(), this could also be handled in the Create() call itself to better align with the idea of only wrapping env calls.


Line 604:   // Open the existing metadata and data files for writing.
> What about failures here?
Done


Line 648:   RETURN_NOT_OK(block_manager()->env()->NewRandomAccessFile(metadata_path, &metadata_reader));
> What about failures here?
Done


Line 656:     read_status = pb_reader.ReadNextPB(&record);
> And here?
Done


Line 858:   return metadata_file_->Append(pb);
> Failures here?
Done


Line 881:     return metadata_file_->Sync();
> What about this one?
Done


Line 886: Status LogBlockContainer::ReopenMetadataWriter() {
> And failures here?
Done


Line 1090:   DataDir* mutable_data_dir() {
> Where is this used?
It was used in calls to HandleError (probably should've been private!). Since HandleError() doesn't need a specified dir, I've taken it out.


Line 1305:   DataDir* mutable_data_dir() {
> And this?
Same as above.


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

Line 696:   Status s = FlushDMS(old_dms.get(), &dfr, flush_type);
> Shouldn't we just RETURN_NOT_OK here and drop the CHECK? Isn't the expectat
Done


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

Line 160:   // The Tablet has been stopped after a failure. In this state, like FAILED,
> Why do we need a new state for this? Why isn't FAILED sufficient?
When running into a disk failure, there's eventually a call to TabletReplica::Shutdown(). The final state of this has indicate that the tablet is also shutdown (to ensure Shutdown() remains idempotent).


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/tablet/tablet_replica_mm_ops.cc
File src/kudu/tablet/tablet_replica_mm_ops.cc:

Line 138:     tablet_replica_->tablet()->rowsets_flush_sem_.unlock();
> Can you use an RAII-style guard for the acquisition forof rowsets_flush_sem
Hrm, I suppose it could pass rowsets_flush_sem_ ownership to some ScopedLock.


Line 196:                             max_idx_to_replay_size);
> Nit: indentation.
Done


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

Line 97:     // The tablet is in a failed state.
> TABLET_NOT_RUNNING is insufficiently descriptive?
This is here to indicate to the leader which tablets that need to be replicated (TABLET_FAILED). TABLET_NOT_RUNNING shouldn't trigger replication.


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/util/status.h
File src/kudu/util/status.h:

Line 433: inline bool IsDiskFailure(const Status& s) {
> Would be nice to add a comment to each explaining why it's included.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#6).

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................

WIP disk failure: coordinate disk failure handling

This patch adds the logic required to prevent a crash on disk failure.
Checks that previously depended on successful disk IO now yield if the
returned status' POSIX code matches one corresponding to disk failure.

For the most part, failure handling is done by the lowest abstraction to
touch disk: blocks and containers.

A set of basic tests are added in ts_disk_failure-itest.

TODO:
- crash if tablet metadata dir is bad
- don't crash if a disk can't be read at startup and we try to
  CheckIntegrity, but it's missing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs-test-util.h
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/alter_schema_transaction.h
M src/kudu/tablet/transactions/transaction.h
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/transaction_tracker-test.cc
M src/kudu/tablet/transactions/transaction_tracker.cc
M src/kudu/tablet/transactions/transaction_tracker.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tablet/transactions/write_transaction.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
A src/kudu/tserver/ts_disk_failure-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
M src/kudu/util/status.h
35 files changed, 488 insertions(+), 80 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 17:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/data_dirs.h
File src/kudu/fs/data_dirs.h:

Line 268:   DataDir* FindDataDirByUuidIndex(uint16_t uuid_idx) const;
> Mistake?
Done


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/error_manager.h
File src/kudu/fs/error_manager.h:

PS14, Line 48: 
             :   return s
> The handler is run on any failure, not just disk failures.
Changed this to call on disk failures, which I found to actually be useful.


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 238:   void HandleError(const Status& s) const;
> Shouldn't be virtual as it's not participating in the class hierarchy at al
Done


Line 240:  private:
> Just combine into HandleError().
Done


Line 386: 
> Maybe do a single HandleError() down here instead?
I left this as two separate calls error-handling calls (e.g. if they have different errors), but I did move them down here.


Line 417:  private:
> Not virtual.
Done


Line 419:   FileBlockManager* block_manager_;
> This is only called by HandleError(); how about moving its implementation i
Done


Line 482:   return ReadV(offset, &results);
> Can't this fail with an IO error too?
Done


Line 539:   return Status::OK();
> We're no longer returning early if one SyncDir() fails. I don't necessarily
This was unintentional, thanks for checking. Here I think it'd be best to keep the earlier behavior but with the error handling callback.


Line 649:   if (s.ok()) {
> For this failure (and/or the next one), should we try a new data directory 
I'm not sure what that would buy us since at that point, we already know a slice of the tablet is effectively unusable and the tablet will be shut down.


Line 671:       error_manager_->RunErrorNotificationCb(dd_manager()->FindDataDirByUuidIndex( \
> Shouldn't this (and the above) use HANDLE_DISK_FAILURE?
Done


Line 700: 
> Shouldn't we return an actual error though? If we return OK, the tablet flu
Right. Actually I'll wrap file_cache_.DeleteFile() with a macro, since it should be caught there anyway. Same with the above.


http://gerrit.cloudera.org:8080/#/c/7030/9/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 498: 
> What's missing? Open() is also a static method and you've managed to insert
Fair point. A couple more macros should do it.


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS14, Line 682:   HandleError(read_status);
              :   return read_status;
> Nit: move this under the "If we've made it here..." comment, since it appli
Done


Line 794:   scoped_refptr<LogBlock> lb = block_manager()->AddLogBlock(
> RETURN_NOT_OK_HANDLE_ERROR?
Done. I put it in SyncContainer.


Line 1095:   }
> Shouldn't be virtual.
HandleError() is called any place with RETURN_NOT_OK_HANDLE_ERROR(). You're right that mutable_data_dir() isn't needed anymore.


Line 1253:       VLOG(3) << "Syncing block " << id();
> This is wrong, look at the comment on L1234-1236.
Ah I see, thanks for pointing that out. Done


Line 1309:   // The owning container. Must outlive this block.
> Not virtual.
Done


PS14, Line 1309:   // The owning container. Must outlive this block.
               :   LogBlockContainer* container_;
               : 
               :   // A reference to this block's metadata.
               :   scoped_refptr<internal::LogBlock> log_block_;
               : 
               :   /
> Same; where is this used?
Done


Line 1517:         if (untested_block_size) {
> Bad conflict resolution here.
Done


Line 1745: }
> Hmm, but we could try a new data directory right?
This still indicates that part of the tablet's data is probably lost due to disk failure, so, in the context of disk failures, I think it makes sense to error out here and shut down the tablet.


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/tablet/mvcc.cc
File src/kudu/tablet/mvcc.cc:

Line 93:   CHECK_EQ(old_state, RESERVED) << "transaction with timestamp " << timestamp.ToString()
> Belongs in the patch dealing with transactions?
Done


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 1094
> Likewise, the changes to this file don't seem like they belong in this patc
Done


http://gerrit.cloudera.org:8080/#/c/7030/15/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 602:       return Status::NotFound("Tablet not found", tablet_id);
> error: no viable conversion from 'scoped_refptr<kudu::tablet::TabletReplica
Done


Line 607:       *error_code = TabletServerErrorPB::TABLET_NOT_RUNNING;
> error: use of undeclared identifier 'reason' [clang-diagnostic-error]
Done


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/util/status.h
File src/kudu/util/status.h:

PS14, Line 466: 
              : 
              : inline Status& Status::operator=(Status&& s) {
> Adding a new member function to Status doesn't break ABI compatibility, so 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 19: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#5).

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................

WIP disk failure: coordinate disk failure handling

This patch adds the logic required to prevent a crash on disk failure.
Checks that previously depended on successful disk IO now yield if the
returned status' POSIX code matches one corresponding to disk failure.

For the most part, failure handling is done by the lowest abstraction to
touch disk: blocks and containers.

A set of basic tests are added in ts_disk_failure-itest.

TODO:
- crash if tablet metadata dir is bad
- don't crash if a disk can't be read at startup and we try to
  CheckIntegrity, but it's missing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs-test-util.h
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/tablet_service.cc
A src/kudu/tserver/ts_disk_failure-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
24 files changed, 408 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#7).

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................

WIP disk failure: coordinate disk failure handling

This patch adds the logic required to prevent a crash on disk failure.

Disk failure handling happens in a few places:
- block/container-level functions that call env functions that may result in
disk failure can run callbacks to fail/shutdown tablets in the parent data dir
- tablet-level functions that CHECK for failures are ended early if the tablet
is known to have data on a bad disk
- transactions can now be canceled to force a shutdown of a tablet replica
- tablets in FAILED or the new FAILED_AND_SHUTDOWN state will trigger
replication

A set of basic tests are added in ts_disk_failure-itest.

TODO:
- crash if tablet metadata dir is bad
- don't crash if a disk can't be read at startup and we try to
  CheckIntegrity, but it's missing

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs-test-util.h
M src/kudu/fs/fs_manager.h
M src/kudu/fs/log_block_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/alter_schema_transaction.h
M src/kudu/tablet/transactions/transaction.h
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/transaction_tracker-test.cc
M src/kudu/tablet/transactions/transaction_tracker.cc
M src/kudu/tablet/transactions/transaction_tracker.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tablet/transactions/write_transaction.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
A src/kudu/tserver/ts_disk_failure-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
M src/kudu/util/status.h
35 files changed, 490 insertions(+), 80 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/7
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: handle EIOs on I/O to blocks

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: disk failure: handle EIOs on I/O to blocks
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7030/3/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

PS3, Line 184: KUDU_RETURN_OR_HANDLE(EIO, block_->Read(0, &mal), block_->HandleEIO());
don't quite follow this. if the read fails with an EIO will this continue to do anything? doesn't seem like it should but maybe I'm misunderstanding. also if the block knows how to handle EIOs why does this need to make an explicit call?

like I said in the other patch, give me a bit to come up with a suggestion.


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................


Patch Set 8:

(10 comments)

This patch is getting a bit big and there are a few pieces that can be separated. Will break it apart.

http://gerrit.cloudera.org:8080/#/c/7030/3/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

PS3, Line 184: RETURN_NOT_OK(block_->Read(0, &mal));
> don't quite follow this. if the read fails with an EIO will this continue t
This acts like a RETURN_NOT_OK with an extra step to check the posix code.

You're right though, they shouldn't be in this layer; removed.


http://gerrit.cloudera.org:8080/#/c/7030/4/src/kudu/fs/fs-test-util.h
File src/kudu/fs/fs-test-util.h:

Line 86:   void HandleError(const Status& /* s */, DataDir* /* dir */) const override {
> warning: parameter 'dir' is unused [misc-unused-parameters]
Done


Line 86:   void HandleError(const Status& /* s */, DataDir* /* dir */) const override {
> warning: parameter 's' is unused [misc-unused-parameters]
Done


Line 87:   }
> warning: redundant return statement at the end of a function with a void re
Done


http://gerrit.cloudera.org:8080/#/c/7030/4/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 851:   // a batch which contains some duplicate keys) we need to do so now.
> warning: do not use 'else' after 'return' [readability-else-after-return]
Done


http://gerrit.cloudera.org:8080/#/c/7030/4/src/kudu/tserver/ts_disk_failure-test.cc
File src/kudu/tserver/ts_disk_failure-test.cc:

Line 9: #include "kudu/gutil/strings/substitute.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


Line 28: using strings::Substitute;
> warning: using decl 'set' is unused [misc-unused-using-decls]
Done


Line 34: class TSDiskFailureTest : public TabletServerTestBase {
> warning: using decl 'RpcController' is unused [misc-unused-using-decls]
Done


Line 36:   virtual void SetUp() override {
> warning: using decl 'TabletReplica' is unused [misc-unused-using-decls]
Done


Line 37:     TabletServerTestBase::SetUp();
> warning: using decl 'TabletStatePB' is unused [misc-unused-using-decls]
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 18:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7030/17/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

PS17, Line 534: RETURN_NOT_OK_HANDLE_DISK_FAILURE(env_->SyncDir(s),
              :           error_manager_->RunErrorNotificationCb(location.data_dir()));
              :     }
> Combine?
Ah thanks, this was a bit different in an intermediate-version. Done


http://gerrit.cloudera.org:8080/#/c/7030/14/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1095:   LogBlockContainer* container_;
> Right but I don't see any RETURN_NOT_OK_HANDLE_ERROR calls in LogWritableBl
Got it, done.


Line 1745:     return;
> I see. I think the failure handling policy for a tablet is out of scope for
Done


http://gerrit.cloudera.org:8080/#/c/7030/17/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1304:   // A reference to this block's metadata.
> Also unnecessar
Ahh, right, because IO is put off to the containers. Didn't catch that earlier.


Line 2336: 
> What about this one?
Done


Line 2345:   // Disk failures do not suffer from this issue because, on the next startup,
> And the calls in here?
Done


http://gerrit.cloudera.org:8080/#/c/7030/17/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

> There are still some changes to this file that don't belong in this patch. 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#10).

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................

WIP disk failure: coordinate disk failure handling

This patch adds the logic required to prevent a crash on disk failure.

Disk failure handling happens in a few places:
- block/container-level functions that call env functions that may
  result in disk failure can run callbacks to fail/shutdown tablets in
  the parent data dir
- tablet-level functions that CHECK for failures are ended early if the
  tablet is known to have data on a bad disk
- transactions can now be canceled to force a shutdown of a tablet
  replica instead of waiting for transactions to complete
- tablets in FAILED or the new FAILED_AND_SHUTDOWN state will trigger
  replication
- failure at startup (IO to instance files) is covered in a later patch

A set of basic tests are added in ts_disk_failure-itest.

TODO:
- crash if tablet metadata dir is bad

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/alter_schema_transaction.h
M src/kudu/tablet/transactions/transaction.h
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/transaction_tracker-test.cc
M src/kudu/tablet/transactions/transaction_tracker.cc
M src/kudu/tablet/transactions/transaction_tracker.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tablet/transactions/write_transaction.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
A src/kudu/tserver/ts_disk_failure-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
M src/kudu/util/status.h
32 files changed, 511 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/10
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: error-handling macros in blocks

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: disk failure: error-handling macros in blocks
......................................................................


Patch Set 21: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] WIP disk failure: coordinate disk failure handling

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7030

to look at the new patch set (#11).

Change subject: WIP disk failure: coordinate disk failure handling
......................................................................

WIP disk failure: coordinate disk failure handling

This patch adds the logic required to prevent a crash on disk failure.

Disk failure handling happens in a few places:
- block/container-level functions that call env functions that may
  result in disk failure can run callbacks to fail/shutdown tablets in
  the parent data dir
- tablet-level functions that CHECK for failures are ended early if the
  tablet is known to have data on a bad disk
- transactions can now be canceled to force a shutdown of a tablet
  replica instead of waiting for transactions to complete
- tablets in FAILED or the new FAILED_AND_SHUTDOWN state will trigger
  replication
- scans that run into a disk failure will return with the new
  TABLET_FAILED state, or find another server if fault tolerant
- failure at startup (IO to instance files) is covered in a later patch

A set of basic tests are added in ts_disk_failure-itest.

TODO:
- crash if tablet metadata dir is bad and when opening the tablet
  metadata

Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/fs/data_dirs.h
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/alter_schema_transaction.h
M src/kudu/tablet/transactions/transaction.h
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/transaction_tracker-test.cc
M src/kudu/tablet/transactions/transaction_tracker.cc
M src/kudu/tablet/transactions/transaction_tracker.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tablet/transactions/write_transaction.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
A src/kudu/tserver/ts_disk_failure-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver.proto
M src/kudu/util/status.h
35 files changed, 585 insertions(+), 119 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/11
-- 
To view, visit http://gerrit.cloudera.org:8080/7030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>