You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2017/11/01 19:29:07 UTC

[kudu-CR] disk failure: pass Statuses instead of failing CHECKs along the Flush DMS path

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8441


Change subject: disk failure: pass Statuses instead of failing CHECKs along the Flush DMS path
......................................................................

disk failure: pass Statuses instead of failing CHECKs along the Flush DMS path

There are various fatal safety checks along paths surrounding deltas and
the flushing of them. Some of these checks seem redundant given the
layering of these calls; i.e. the fatality of an operation can be
handled at a single point by passing around Statuses.

This patch updates the Flush DMS path to make this the case.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset-test.cc
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
8 files changed, 92 insertions(+), 36 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#6).

Change subject: tablet: make various update paths atomic
......................................................................

tablet: make various update paths atomic

A few codepaths in the tablet subsystem aren't atomic, i.e. if they fail
mid-method (e.g. due to a file error), they leave some in-memory
structures updated and others untouched. This has been safe because we
CHECKed to ensure their success. In preparation for _not_ crashing in
these methods, this patch refactors some of these functions to be
atomic, and notes others that still have the possibility of failing in
such a state (these calls still trigger a CHECK failure).

There are no functional changes in this patch.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
9 files changed, 223 insertions(+), 169 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/6
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: check for stopped in drivers of IO

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: check for stopped in drivers of IO
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc@592
PS4, Line 592: RETURN_NOT_OK
> A couple of questions:
The CHECK_OK was to ensure this entire function happened atomically, not to ensure that UpdateDeltaTracker succeeds.

If this returns not OK right here, rowset_metadata::CommitUpdate() has already happened, but the DeltaTracker may not be updated, and our base data is also not updated, and it's hart to roll the RowSetMetadata back without potentially dropping updates on the floor.



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 17 Nov 2017 00:29:29 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8441/6/src/kudu/tablet/rowset_metadata.h
File src/kudu/tablet/rowset_metadata.h:

http://gerrit.cloudera.org:8080/#/c/8441/6/src/kudu/tablet/rowset_metadata.h@94
PS6, Line 94:   void AddOrphanedBlocks(const std::vector<BlockId>& blocks);
> warning: function 'kudu::tablet::RowSetMetadata::AddOrphanedBlocks' has a d
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 20 Nov 2017 20:25:21 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 11:

Any idea what the coverage of these code paths look like? eg with your injected failures test, do these paths actually get hit? Maybe it's worth some WARNING when these more complex cases get hit, so if we do have some subtle bug here, we will have smoking guns in the log that make us associate the issue with this complex rollback stuff.


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 06:10:53 +0000
Gerrit-HasComments: No

[kudu-CR] disk failure: make various delta paths more robust to errors

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has removed a vote on this change.

Change subject: disk failure: make various delta paths more robust to errors
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: check for stopped in drivers of IO

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: check for stopped in drivers of IO
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG@15
PS4, Line 15: before returning, it
            : must again check that the tablet has been stopped
per comment in the code, not sure why you have to check on the way out in the success case?


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc@293
PS4, Line 293:   return CheckActiveState();
it strikes me as odd here (and a few places below) that you are calling CheckActiveState on the way out of the success-path of these functions. I would expect that if I got an error here that it wouldn't have transitioned to a new state


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc@369
PS4, Line 369:   return CheckActiveState();
same here - this check is already racy, right?


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc@957
PS4, Line 957:         return CheckActiveState();
same, not sure of the point of doing these on the way out and not just on the way in



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 14 Nov 2017 00:42:36 +0000
Gerrit-HasComments: Yes

[kudu-CR] disk failure: pass Statuses instead of failing CHECKs along the Flush DMS path

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: disk failure: pass Statuses instead of failing CHECKs along the Flush DMS path
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@244
PS1, Line 244: size_t i = 0; i < list.size() - 1; i++
It's not introduced in this changelist, but I suspect it would it be an issue if list.size() == 0.

Consider rewriting with
for (size_t i = 1; i < list.size(); i++) {
  RETURN_NOT_OK_PREPEND(ValidateDeltaOrder(list[i - 1], list[i], type),
        "Failed to validate delta order");
}


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@560
PS1, Line 560: MajorCompactDeltaStoresWithColumnIds
I tried to track this up to the call chains and in the chains

[Major|Minor]DeltaCompactionOp::Perform()
Tablet::CompactWorstDeltas()
DiskRowSet::MajorCompactDeltaStoresWithColumnIds()

expecting to see CHECK_OK() if compaction->UpdateDeltaTracker(delta_tracker_.get()) at line 590 returned non-OK status, but I didn't find it.  Did I miss something here?



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 02 Nov 2017 04:41:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 14: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 14
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 22 Nov 2017 03:02:46 +0000
Gerrit-HasComments: No

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/diskrowset.cc@575
PS3, Line 575:   vector<BlockId> removed_blocks;
> Hrm I've been waffling back and forth between whether it'd be sufficient to
My attempt to justify rollback here:

It seems like the only methods that can update the RowSetMetadata's internal state is CommitUpdate(), which currently only gets called while under the DeltaTracker's compact_flush_lock (which is maybe an interesting in itself). That means that, so long as we're protected by the lock, nothing else will have changed the in-memory state, and we should be able to roll back to its pre-update state.



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Sat, 18 Nov 2017 01:17:40 +0000
Gerrit-HasComments: Yes

[kudu-CR] disk failure: make various delta paths more robust to errors

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: disk failure: make various delta paths more robust to errors
......................................................................


Patch Set 2:

Did some cleanup here to fail more gracefully so we don't end up in an inconsistent state. Also made this patch specifically target compactions (flush DMS and Apply failures will be in different patches).


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 06 Nov 2017 19:52:03 +0000
Gerrit-HasComments: No

[kudu-CR] tablet: check for stopped in drivers of IO

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#4).

Change subject: tablet: check for stopped in drivers of IO
......................................................................

tablet: check for stopped in drivers of IO

This patch adds the invariant that if a tablet function drives I/O, it
must first check that the tablet has not been stopped, and, before
returning, it must again check that the tablet has not been stopped.

E.g. Tablet::FlushUnlocked() is a function that drives IO and it is
drives the operation of flush MRSs. Thus, before doing anything, it must
check that the tablet has not been stopped, and before returning, it
must again check that the tablet has been stopped. The calling
maintenance manager thread must be able to handle errors returned in
this way.

For now, this is an optimization to allow tablet shutdown without
having to wait on maintenance ops. In the future, this can be used to
allow errors that leave in-memory state inconsistent to not crash,
provided they first stop the tablet. This would guarantee that nothing
can successfully return with inconsistent results.

In preparation for this, there are various fatal safety checks
surrounding such driving functions that are updated to simply return
errors. Their call-sites have been updated to check the state of the
tablet, ensuring these failures are safe.

This patch also refactors some void methods to return Statuses where
these checks may be possible, and vise versa when Statuses are not
needed.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_mm_ops.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
12 files changed, 215 insertions(+), 179 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/4
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: make various delta paths more robust to errors

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: disk failure: make various delta paths more robust to errors
......................................................................


Patch Set 3: Verified+1

The build failures seem to be all due to clock sync errors. Can't retrigger builds yet, but this should be reviewable.


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 06 Nov 2017 20:23:14 +0000
Gerrit-HasComments: No

[kudu-CR] disk failure: make various delta paths more robust to errors

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: disk failure: make various delta paths more robust to errors
......................................................................


Patch Set 2:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h@197
PS1, Line 197:   // readers in 'stores'.
> Nit: indentation of the continuation lines.
Done


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@244
PS1, Line 244: size_t i = 1; i < list.size(); i++) {
> It's not introduced in this changelist, but I suspect it would it be an iss
Interesting catch, thanks!


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@299
PS1, Line 299:       // Sanity check that the last store we are adding would logically appear
> What if we return from ValidateDeltasOrdered _after_ this? Won't our in-mem
I've done some reordering here so that if there's an error, no state will be changed.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@341
PS1, Line 341:   AtomicUpdateStores(to_remove, new_stores, type);
             :   vector<BlockId> removed_blocks;
             :   rowset_metadata_->Commit
> This is no longer true, and isn't that a problem?
I've done some reordering here so that if there's an error, no state will be changed.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@345
PS1, Line 345: 
> Not sure what this comment is explaining.
Sorry, was useful as a note to self.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@722
PS1, Line 722:     CHECK_EQ(redo_delta_stores_.back(), old_dms)
> Since component_lock_ is released between the swap and the cleanup, do we h
Given we do the same below, I think so. We're also protected by compact_flush_lock_ up at L687 which ensures a single compaction at a time (ie IIUC this prevents other threads updating the stores). I'll add a CHECK to be sure.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@758
PS1, Line 758:   // rename to final path!
> warning: missing username/bug in TODO [google-readability-todo]
Done


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@529
PS1, Line 529:   RETURN_NOT_OK(DeltaTracker::Open(rowset_metadata_, num_rows,
> Nit: indentation of the continuation lines.
Done


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@560
PS1, Line 560: 
> I tried to track this up to the call chains and in the chains
Ah, thanks for digging that up!

Whether intentional or not, this CHECK_OK() served to ensure 1) that we were not left in a half-updated state when doing the atomic calls, and 2) that we were able to verify that correctness by reading from disk.

I just updated UpdateDeltaTracker to be actually atomic, so now the returned error serves only to tell us that we've hit an error reading from disk (which shouldn't be fatal at this level), and if we have, we're now still in a consistent state.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@594
PS1, Line 594:     // Update the delta tracker with the changes.
> Indeed.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 06 Nov 2017 19:43:14 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#8).

Change subject: tablet: make various update paths atomic
......................................................................

tablet: make various update paths atomic

A few codepaths in the tablet subsystem aren't atomic, i.e. if they fail
mid-method (e.g. due to a file error), they leave some in-memory
structures updated and others untouched. This has been safe because we
CHECKed to ensure their success. In preparation for _not_ crashing in
these methods, this patch refactors some of these functions to be
atomic, and notes others that still have the possibility of failing in
such a state (these calls still trigger a CHECK failure).

There are no functional changes in this patch.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
9 files changed, 223 insertions(+), 169 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/8
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: check for stopped in drivers of IO

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: check for stopped in drivers of IO
......................................................................


Patch Set 4:

(11 comments)

Overall I'm not convinced of whether the CHECK conversation to Status is safe on the one hand, or was ever critical on the other hand. I'll dig into it more and try to come up with my own opinion but it seems a little hand-wavy to me right now.

http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG@9
PS4, Line 9: invariant
nit: This seems like more of a requirement than an invariant to me. An invariant is more like something that we never have to worry about because it's always taken care of.


http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG@15
PS4, Line 15: before returning, it
            : must again check that the tablet has been stopped
> per comment in the code, not sure why you have to check on the way out in t
The idea is that we are allowed to corrupt in-memory data structures as long as nobody can ever read results after that. But I'm not sure what specific disk failure scenario calls for that strong of a guarantee.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc@388
PS4, Line 388:   tracker->AtomicUpdateStores(included_stores_, new_redo_stores, REDO);
I'm curious if the order in which we update the stores matters. If so, it deserves a comment. Offhand, I can't tell.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc@390
PS4, Line 390:  
insert: for UNDOs


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.h@214
PS4, Line 214: Crashes
This no longer crashes, but that makes me nervous...


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@a212
PS4, Line 212: 
This was put here to detect programming errors. Are you sure we need to remove this?


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc@567
PS4, Line 567: out of sync
what does out of sync mean?


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc@569
PS4, Line 569: checked
By whom? What kind of guarantee do we get?


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc@592
PS4, Line 592: RETURN_NOT_OK
A couple of questions:
1. Since we had a CHECK_OK here before, shouldn't we put the warning comment above next to this line?
2. Why does disk failure "have to" leave the in-memory data structures in an inconsistent state?


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc@293
PS4, Line 293:   return CheckActiveState();
> it strikes me as odd here (and a few places below) that you are calling Che
not sure of the purpose of this one either since nothing has happened since we set the state ourselves


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet_mm_ops.cc
File src/kudu/tablet/tablet_mm_ops.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet_mm_ops.cc@264
PS4, Line 264: if (PREDICT_FALSE(!s.ok())) {
             :     LOG(WARNING) << Substitute("$0Major delta compaction failed on $1: $2",
             :         LogPrefix(), tablet_->tablet_id(), s.ToString());
             :     CHECK(tablet_->HasBeenStopped()) << "Failure is only allowed if the tablet "
             :                                         "has been stopped first";
             :   }
This logic is reasonable up at this high-level driver code but it should be a macro wrapping a function, not copy / pasted multiple times.



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 16 Nov 2017 09:29:05 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8441/12/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/12/src/kudu/tablet/diskrowset.cc@579
PS12, Line 579:     LOG_WITH_PREFIX(WARNING) << "Error during major delta compaction! Rolling back rowset metadata";
> WITH_PREFIX here so you get the tablet ID
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 22:30:48 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#5).

Change subject: tablet: make various update paths atomic
......................................................................

tablet: make various update paths atomic

A few codepaths in the tablet subsystem aren't atomic, i.e. if they fail
mid-method (e.g. due to a file error), they leave some in-memory
structures updated and others untouched. This has been safe because we
CHECKed to ensure their success. In preparation for _not_ crashing in
these methods, this patch refactors some of these functions to be
atomic, and notes others that still have the possibility of failing in
such a state (these calls still trigger a CHECK failure).

There are no functional changes in this patch.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
9 files changed, 160 insertions(+), 149 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/5
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: make various update paths atomic

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 12: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 12
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 20:39:58 +0000
Gerrit-HasComments: No

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#12).

Change subject: tablet: make various update paths atomic
......................................................................

tablet: make various update paths atomic

A few codepaths in the tablet subsystem aren't atomic, i.e. if they fail
mid-method (e.g. due to a file error), they leave some in-memory
structures updated and others untouched. This has been safe because we
CHECKed to ensure their success. In preparation for _not_ crashing in
these methods, this patch refactors some of these functions to be
atomic, and notes others that still have the possibility of failing in
such a state (these calls still trigger a CHECK failure).

There are no functional changes in this patch.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
9 files changed, 224 insertions(+), 169 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/12
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 12
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 11:

> Patch Set 11:
> 
> Any idea what the coverage of these code paths look like? eg with your injected failures test, do these paths actually get hit? Maybe it's worth some WARNING when these more complex cases get hit, so if we do have some subtle bug here, we will have smoking guns in the log that make us associate the issue with this complex rollback stuff.

Yeah, most if not all of the paths here get hit at some point with enough dist test runs. I'll add a WARNING for the rollback.


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 07:29:24 +0000
Gerrit-HasComments: No

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#10).

Change subject: tablet: make various update paths atomic
......................................................................

tablet: make various update paths atomic

A few codepaths in the tablet subsystem aren't atomic, i.e. if they fail
mid-method (e.g. due to a file error), they leave some in-memory
structures updated and others untouched. This has been safe because we
CHECKed to ensure their success. In preparation for _not_ crashing in
these methods, this patch refactors some of these functions to be
atomic, and notes others that still have the possibility of failing in
such a state (these calls still trigger a CHECK failure).

There are no functional changes in this patch.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
9 files changed, 223 insertions(+), 169 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/10
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: check for stopped in drivers of IO

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: check for stopped in drivers of IO
......................................................................


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG@15
PS4, Line 15: before returning, it
            : must again check that the tablet has been stopped
> per comment in the code, not sure why you have to check on the way out in t
Done


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.h@214
PS4, Line 214: Crashes
> After a second look, please ignore my above comment.
Ack


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@a212
PS4, Line 212: 
> This was put here to detect programming errors. Are you sure we need to rem
Ah, you're right. Just the Init() needs to be non-fatal.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@a212
PS4, Line 212: 
> Nevermind, disregard the above comment. I think this is OK.
Ack


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@a299
PS4, Line 299: 
> curious why this comment was removed
Ah, I replaced it with the longer comment.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@300
PS4, Line 300: *end_it
> Changing this to end disagrees with the comment above -- right? Is there a 
Ah, this was pretty subtle, maybe I should comment it. This was `start_it` before because we removed `start_it` through `end_it` before doing this validation, and so `start_it` actually pointed to just past the last store we removed.

I moved the removal down below because I was thinking of using RETURN_NOT_OK() here instead of WARN_NOT_OK(), and I wanted this function to be atomic. I suppose now it's not as important since this only WARNS, but I can move it back.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@340
PS4, Line 340: rowset_metadata_->CommitUpdate(update);
> Why would we commit before flushing?
Hrm, I'm not sure the ordering here is too important. The two are in-memory operations that can't fail.

I moved them because I pulled out OpenDeltaReaders() out of AtomicUpdateStores() and wanted to keep them paired, but I suppose that shouldn't matter either.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc@293
PS4, Line 293:   return CheckActiveState();
> it strikes me as odd here (and a few places below) that you are calling Che
Done


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc@369
PS4, Line 369:   return CheckActiveState();
> same here - this check is already racy, right?
Done


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet.cc@957
PS4, Line 957:         return CheckActiveState();
> same, not sure of the point of doing these on the way out and not just on t
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 17 Nov 2017 00:16:52 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 11: Code-Review+2

This looks pretty good to me. Todd may also want to take a look.


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 04:54:25 +0000
Gerrit-HasComments: No

[kudu-CR] disk failure: make various delta paths more robust to errors

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#2).

Change subject: disk failure: make various delta paths more robust to errors
......................................................................

disk failure: make various delta paths more robust to errors

There are various fatal safety checks along paths surrounding deltas.
Some of these CHECKs were in place to enforce in-memory consistency;
these have been made safe by making such wrapped calls handle failure
gracefully (i.e. without leaving some half-updated state), and by
returning errors instead of crashing.

This patch makes the API/behavioral updates:
- DeltaTracker::AtomicUpdateStores is made atomic by pulling out some
  access to disk and by making delta-order validation fatal only in case
  of logical errors rather than physical ones (e.g. disk failures)
- RowSetMetadata::CommitUpdate() now returns a list of blocks to orphan
  and relies on caller to orphan them via the new
  RowSetMetadata::AddOrphanedBlocks().
- It is now possible to reload the in-memory state of a RowSetMetadata
  instance (e.g. in the event of a failure and rollback is necessary).

The following are now atomic with the above changes and some reordering.
- DiskRowSet::MajorCompactDeltaStoresWithColumnIds()
- DeltaTracker::CommitDeltaStoreMetadataUpdate()
- MajorDeltaCompaction::UpdateDeltaTracker()

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
8 files changed, 227 insertions(+), 162 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: pass Statuses instead of failing CHECKs along the Flush DMS path

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: disk failure: pass Statuses instead of failing CHECKs along the Flush DMS path
......................................................................


Patch Set 1:

(7 comments)

This is thorny code that I'm not terribly familiar with, so please also get a review from Todd.

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.h@197
PS1, Line 197:   Status ValidateDeltaOrder(const std::shared_ptr<DeltaStore>& first,
Nit: indentation of the continuation lines.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@299
PS1, Line 299:     stores_to_update->erase(start_it, end_it);
What if we return from ValidateDeltasOrdered _after_ this? Won't our in-memory state be sort of corrupt?

I understand that the validation is for testing only anyway, but if the goal is to have a fully formed DeltaTracker in the event of an injected disk failure, we should probably think about the order here.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@341
PS1, Line 341:   // Once we successfully commit to the rowset metadata, let's ensure we update
             :   // the delta stores to maintain consistency between the two. We enforce this
             :   // with a CHECK_OK here.
This is no longer true, and isn't that a problem?


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@345
PS1, Line 345:   // This goes down the path of opening blocks.
Not sure what this comment is explaining.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/delta_tracker.cc@722
PS1, Line 722:     std::lock_guard<rw_spinlock> lock(component_lock_);
Since component_lock_ is released between the swap and the cleanup, do we have assurances that the last entry of redo_delta_stores_ was old_dms?


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@529
PS1, Line 529:                                     log_anchor_registry_,
Nit: indentation of the continuation lines.


http://gerrit.cloudera.org:8080/#/c/8441/1/src/kudu/tablet/diskrowset.cc@594
PS1, Line 594:   // TODO(awong): update here this message.
Indeed.



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 01 Nov 2017 20:35:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: check for stopped in drivers of IO

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: check for stopped in drivers of IO
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.h@214
PS4, Line 214: Crashes
> This no longer crashes, but that makes me nervous...
After a second look, please ignore my above comment.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@a212
PS4, Line 212: 
> This was put here to detect programming errors. Are you sure we need to rem
Nevermind, disregard the above comment. I think this is OK.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@a299
PS4, Line 299: 
curious why this comment was removed


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@300
PS4, Line 300: *end_it
Changing this to end disagrees with the comment above -- right? Is there a reason why this logic was changed?

The idea behind this check, as the comment notes, is to ensure that the stores being added (notably, the last element in that list) logically come before the first of the existing delta stores.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_tracker.cc@340
PS4, Line 340: rowset_metadata_->CommitUpdate(update);
Why would we commit before flushing?



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 17 Nov 2017 00:00:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................

tablet: make various update paths atomic

A few codepaths in the tablet subsystem aren't atomic, i.e. if they fail
mid-method (e.g. due to a file error), they leave some in-memory
structures updated and others untouched. This has been safe because we
CHECKed to ensure their success. In preparation for _not_ crashing in
these methods, this patch refactors some of these functions to be
atomic, and notes others that still have the possibility of failing in
such a state (these calls still trigger a CHECK failure).

There are no functional changes in this patch.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Reviewed-on: http://gerrit.cloudera.org:8080/8441
Reviewed-by: Mike Percy <mp...@apache.org>
Tested-by: Kudu Jenkins
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
9 files changed, 224 insertions(+), 169 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 15
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] disk failure: make various delta paths more robust to errors

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: disk failure: make various delta paths more robust to errors
......................................................................


Patch Set 3:

(7 comments)

Inching back towards the ostrich method and putting in an IsGTest() hack here.

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_compaction.cc@372
PS3, Line 372:   RETURN_NOT_OK(tracker->OpenDeltaReaders(new_redo_blocks, &new_redo_stores, REDO));
> Can you add a TODO here that these OpenDeltaReaders calls could actually ha
Done


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.h@164
PS3, Line 164: in-memory delta stores and then persists them to disk
> I think this is kind of unclear.
Done


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc@307
PS3, Line 307:   if (start_it != end_it) {
> is this necessary? erase() with an empty range should be safe
Done


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc@719
PS3, Line 719:   // In case the flush fails, remove the DeltaMemStore from the store list.
> I dont think this is safe. Another update could have landed in the new 'dms
Trying to determine whether best-effort StoppedTablet checks are sufficient here to prevent us from returning bad data.

And I guess a bigger question is whether this could lead us down a darker path of trying to do something that relies on the dropped 
dms.


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc@720
PS3, Line 720:   auto cleanup_redo_delta_stores = MakeScopedCleanup([&] {
> nit: can use the new SCOPED_CLEANUP macro here
Mm I'm not sure about that, since it needs to be canceled and needs a name?


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc@757
PS3, Line 757:   // TODO(unknown): wherever we write stuff, we should write to a tmp path and
> I think this can just be removed now, it's super ancient (goes back to http
Done


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/diskrowset.cc@575
PS3, Line 575:   // revert changes in case of an error.
> not sure this is safe in the general case, again because as soon as we swap
Hrm I've been waffling back and forth between whether it'd be sufficient to assert that the tablet's been stopped here. I think it might take even more CheckNotStopped calls in many places, not just the DiskRowSet/DeltaTracker.

I guess the big question is whether it's OK to have the {DeltaTracker, DiskRowSet, RowSetMetadata} be out of sync with each other (or if this is OK, rolling back changes. I think this might actually be more questionable), provided Stopped tablet checks are best-effort.

Per your suggestion, another solution might be to sidestep this whole issue with a IsGTest() workaround or somesuch.



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:28:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: check for stopped in drivers of IO

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: check for stopped in drivers of IO
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc@372
PS4, Line 372:   // TODO(awong): pull these OpenDeltaReaders() calls out of the critical path
Please add a high level comment to this function regarding when we're allowed to do I/O vs when we must not.

Here, we can write something like: 

    1. Perform I/O that could possibly fail.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc@386
PS4, Line 386:   // Even if we didn't create any new redo blocks, we still need to update the
Add:

  2. Now we perform critical in-memory maintenance that must be performed atomically. We must maintain consistency between the REDO and UNDO stores.

I think we could also add a ScopedAllowIO(false) or something similar (might need to add that feature in thread_restrictions.h) in order to future-proof that invariant.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc@a577
PS4, Line 577: 
             : 
             : 
why was this comment removed?



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 17 Nov 2017 00:48:34 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8441/12/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/12/src/kudu/tablet/diskrowset.cc@579
PS12, Line 579:     LOG(WARNING) << "Error during major delta compaction! Rolling back rowset metadata";
WITH_PREFIX here so you get the tablet ID



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 12
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 19:04:54 +0000
Gerrit-HasComments: Yes

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8441/4//COMMIT_MSG@9
PS4, Line 9: he tablet
> nit: This seems like more of a requirement than an invariant to me. An inva
Ack


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc@372
PS4, Line 372: 
> Please add a high level comment to this function regarding when we're allow
Done


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc@386
PS4, Line 386:     RETURN_NOT_OK(tracker->OpenDeltaReaders(new_undo_blocks, &new_undo_stores, UNDO));
> Add:
Done


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc@388
PS4, Line 388: 
> I'm curious if the order in which we update the stores matters. If so, it d
I don't think so given this all happens under locks.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/delta_compaction.cc@390
PS4, Line 390: 
> insert: for UNDOs
Done


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc@a577
PS4, Line 577: 
             : 
             : 
> why was this comment removed?
Done


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc@567
PS4, Line 567: 
> what does out of sync mean?
Done


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/diskrowset.cc@569
PS4, Line 569: 
> By whom? What kind of guarantee do we get?
Took this out and replaced with CHECKs.


http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet_mm_ops.cc
File src/kudu/tablet/tablet_mm_ops.cc:

http://gerrit.cloudera.org:8080/#/c/8441/4/src/kudu/tablet/tablet_mm_ops.cc@264
PS4, Line 264: oped_refptr<AtomicGauge<uint32_t> > MajorDeltaCompactionOp::RunningGauge() const {
             :   return tablet_->metrics()->delta_major_compact_rs_running;
             : }
             : 
             : ////////////////////////////////////////////////////////////
             : // 
> This logic is reasonable up at this high-level driver code but it should be
Ack



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 17 Nov 2017 17:45:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] disk failure: make various delta paths more robust to errors

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#3).

Change subject: disk failure: make various delta paths more robust to errors
......................................................................

disk failure: make various delta paths more robust to errors

There are various fatal safety checks along paths surrounding deltas.
Some of these CHECKs were in place to enforce in-memory consistency;
these have been made safe by making such wrapped calls handle failure
gracefully (i.e. without leaving some half-updated state), and by
returning errors instead of crashing.

This patch makes the API/behavioral updates:
- DeltaTracker::AtomicUpdateStores is made atomic by pulling out some
  access to disk and by making delta-order validation fatal only in case
  of logical errors rather than physical ones (e.g. disk failures)
- RowSetMetadata::CommitUpdate() now returns a list of blocks to orphan
  and relies on caller to orphan them via the new
  RowSetMetadata::AddOrphanedBlocks().
- It is now possible to reload the in-memory state of a RowSetMetadata
  instance from a protobuf message (e.g. in the event of a failure and
  rollback is necessary).

The following are now atomic with the above changes and some reordering.
- DiskRowSet::MajorCompactDeltaStoresWithColumnIds()
- DeltaTracker::CommitDeltaStoreMetadataUpdate()
- MajorDeltaCompaction::UpdateDeltaTracker()

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
8 files changed, 229 insertions(+), 164 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/3
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: make various update paths atomic

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: make various update paths atomic
......................................................................


Patch Set 13: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 21 Nov 2017 22:35:06 +0000
Gerrit-HasComments: No

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#7).

Change subject: tablet: make various update paths atomic
......................................................................

tablet: make various update paths atomic

A few codepaths in the tablet subsystem aren't atomic, i.e. if they fail
mid-method (e.g. due to a file error), they leave some in-memory
structures updated and others untouched. This has been safe because we
CHECKed to ensure their success. In preparation for _not_ crashing in
these methods, this patch refactors some of these functions to be
atomic, and notes others that still have the possibility of failing in
such a state (these calls still trigger a CHECK failure).

There are no functional changes in this patch.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
9 files changed, 223 insertions(+), 169 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/7
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: make various update paths atomic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Mike Percy, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8441

to look at the new patch set (#13).

Change subject: tablet: make various update paths atomic
......................................................................

tablet: make various update paths atomic

A few codepaths in the tablet subsystem aren't atomic, i.e. if they fail
mid-method (e.g. due to a file error), they leave some in-memory
structures updated and others untouched. This has been safe because we
CHECKed to ensure their success. In preparation for _not_ crashing in
these methods, this patch refactors some of these functions to be
atomic, and notes others that still have the possibility of failing in
such a state (these calls still trigger a CHECK failure).

There are no functional changes in this patch.

Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
---
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/delta_compaction.h
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/delta_tracker.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/metadata-test.cc
M src/kudu/tablet/rowset_metadata.cc
M src/kudu/tablet/rowset_metadata.h
9 files changed, 224 insertions(+), 169 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/8441/13
-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tablet: check for stopped in drivers of IO

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: tablet: check for stopped in drivers of IO
......................................................................


Patch Set 4:

I've changed the part of this patch that tries to enforce consistency with a higher-level check that the Tablet has been stopped.


-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 13 Nov 2017 22:50:59 +0000
Gerrit-HasComments: No

[kudu-CR] disk failure: make various delta paths more robust to errors

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/8441 )

Change subject: disk failure: make various delta paths more robust to errors
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_compaction.cc
File src/kudu/tablet/delta_compaction.cc:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_compaction.cc@372
PS3, Line 372:   RETURN_NOT_OK(tracker->OpenDeltaReaders(new_redo_blocks, &new_redo_stores, REDO));
Can you add a TODO here that these OpenDeltaReaders calls could actually happen outside of the component_lock? it seems like these calls do some IO so it's not the best to be holding the DRS component lock during it since it'll block concurrent ops.


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.h
File src/kudu/tablet/delta_tracker.h:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.h@164
PS3, Line 164: in-memory delta stores and then persists them to disk
I think this is kind of unclear.

More precisely this updates the in-memory *list* of delta stores, and then persists the *metadata* to disk. As written it sounds like it's changing the delta stores themselves, and flushing deltafiles or somesuch


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc
File src/kudu/tablet/delta_tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc@307
PS3, Line 307:   if (start_it != end_it) {
is this necessary? erase() with an empty range should be safe


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc@719
PS3, Line 719:   // In case the flush fails, remove the DeltaMemStore from the store list.
I dont think this is safe. Another update could have landed in the new 'dms_', so the fact that you've now dropped it on the floor means we've lost a user's update that they might think was committed, no?


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc@720
PS3, Line 720:   auto cleanup_redo_delta_stores = MakeScopedCleanup([&] {
nit: can use the new SCOPED_CLEANUP macro here


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/delta_tracker.cc@757
PS3, Line 757:   // TODO(unknown): wherever we write stuff, we should write to a tmp path and
I think this can just be removed now, it's super ancient (goes back to https://github.com/apache/kudu/commit/0e3bd37ab8645a00001758fa5de84861ac443eef !)


http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/diskrowset.cc
File src/kudu/tablet/diskrowset.cc:

http://gerrit.cloudera.org:8080/#/c/8441/3/src/kudu/tablet/diskrowset.cc@575
PS3, Line 575:   // revert changes in case of an error.
not sure this is safe in the general case, again because as soon as we swap in a new DMS, we can have operations coming in and committing with their dms_id set to the new DMS's id, and I'm not entirely clear on how we prevent losing them.

The answer of course could be that any failure immediately causes everything to shut down, but if you are shutting down then you also probably don't really need to re-flush the tablet metadata?



-- 
To view, visit http://gerrit.cloudera.org:8080/8441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0241cd4206ce77ef1fb334458b091bc2092f4141
Gerrit-Change-Number: 8441
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 07 Nov 2017 01:51:42 +0000
Gerrit-HasComments: Yes