You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Mike Percy (Code Review)" <ge...@cloudera.org> on 2017/01/26 06:11:40 UTC

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Hello Adar Dembo, Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/5799

to review the following change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................

KUDU-1853. Tablet copy: Don't orphan blocks on failure

Previously, if a tablet copy failed we would orphan data blocks. This
patch makes it so that a failed tablet copy operation that does not
involve a process crash does not orphan data blocks.

Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
---
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
3 files changed, 101 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/5799/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/5799/2/src/kudu/tserver/tablet_copy_client-test.cc
File src/kudu/tserver/tablet_copy_client-test.cc:

Line 263:   ASSERT_OK(client_->FetchAll(&listener));
> You can pass nullptr in for the status listener.
Ah, indeed we can.


http://gerrit.cloudera.org:8080/#/c/5799/2/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

Line 294: Status TabletCopyClient::Abort() {
> Would it be possible to reuse the existing "delete tablet" code path here? 
It should be possible but it would likely involve a bunch of refactoring to move the logic from TsTabletManager and the master into this class, along with a bunch of test changes that I'd rather not spend time doing at the moment.


http://gerrit.cloudera.org:8080/#/c/5799/2/src/kudu/tserver/tablet_copy_client.h
File src/kudu/tserver/tablet_copy_client.h:

PS2, Line 110: Does not delete any other type of data such as WALs,
             :   // consensus metadata, or the superblock.
> Why not? If the goal is to do the right thing for long lived tservers, we o
I didn't refactor this class yesterday because it's additional work for no additional feature, since the TSTabletManager already deals with those things.


PS2, Line 189:   bool started_;            // Session started.
             :   bool downloaded_wal_;     // WAL segments downloaded.
             :   bool downloaded_blocks_;  // Data blocks downloaded.
             :   bool finished_;           // Persisted new superblock.
> How about restructuring this as an enum?
An enum would require a particular order between downloading the wal and the blocks and I'd rather not enforce that.


Line 214:   std::vector<BlockId> persisted_blocks_;
> These are already available in the superblock created in DownloadBlocks(), 
I did it like this for a couple reasons:
1. It's easier to iterate over a vector than traverse all the blocks in a superblock.
2. We can enforce cleanup for stuff downloaded separately from the superblock, although that usage is something that only happens in tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5799

to look at the new patch set (#3).

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................

KUDU-1853. Tablet copy: Don't orphan blocks on failure

Previously, if a tablet copy failed we would orphan data blocks. This
patch makes it so that a failed tablet copy operation that does not
involve a process crash does not orphan data blocks.

Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
---
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
3 files changed, 96 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/5799/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5799/5/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 1041:   // or copying a tablet.
> Nit: we're not "copying" the tablet though; we're tombstoning a tablet that
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/5799/2/src/kudu/tserver/tablet_copy_client-test.cc
File src/kudu/tserver/tablet_copy_client-test.cc:

Line 263:   ASSERT_OK(client_->FetchAll(&listener));
You can pass nullptr in for the status listener.


http://gerrit.cloudera.org:8080/#/c/5799/2/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

Line 294: Status TabletCopyClient::Abort() {
Would it be possible to reuse the existing "delete tablet" code path here? Besides avoiding some code duplication, it'd also make it easy to delete all the other tablet state that we no longer need.


http://gerrit.cloudera.org:8080/#/c/5799/2/src/kudu/tserver/tablet_copy_client.h
File src/kudu/tserver/tablet_copy_client.h:

PS2, Line 110: Does not delete any other type of data such as WALs,
             :   // consensus metadata, or the superblock.
Why not? If the goal is to do the right thing for long lived tservers, we ought to clean up everything that was created.


PS2, Line 189:   bool started_;            // Session started.
             :   bool downloaded_wal_;     // WAL segments downloaded.
             :   bool downloaded_blocks_;  // Data blocks downloaded.
             :   bool finished_;           // Persisted new superblock.
How about restructuring this as an enum?


Line 214:   std::vector<BlockId> persisted_blocks_;
These are already available in the superblock created in DownloadBlocks(), though you'd need to make sure not to discard it on an early exit.


-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5799

to look at the new patch set (#4).

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................

KUDU-1853. Tablet copy: Don't orphan blocks on failure

Previously, if a tablet copy failed we would orphan data blocks. This
patch makes it so that a failed tablet copy operation that does not
involve a process crash does not orphan data blocks.

This also refactors some deletion logic out of TSTabletManager so that
TabletCopyClient will tombstone partially-copied tablets when the copy
operation fails.

Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
---
M src/kudu/tserver/tablet_copy-test-base.h
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_service-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
7 files changed, 195 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/5799/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5799

to look at the new patch set (#5).

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................

KUDU-1853. Tablet copy: Don't orphan blocks on failure

Previously, if a tablet copy failed we would orphan data blocks. This
patch makes it so that a failed tablet copy operation that does not
involve a process crash does not orphan data blocks.

This also refactors some deletion logic out of TSTabletManager so that
TabletCopyClient will tombstone partially-copied tablets when the copy
operation fails.

Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
---
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/tserver/tablet_copy-test-base.h
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_service-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
8 files changed, 213 insertions(+), 120 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/5799/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5799

to look at the new patch set (#2).

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................

KUDU-1853. Tablet copy: Don't orphan blocks on failure

Previously, if a tablet copy failed we would orphan data blocks. This
patch makes it so that a failed tablet copy operation that does not
involve a process crash does not orphan data blocks.

Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
---
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
3 files changed, 101 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/5799/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5799/4/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

PS4, Line 114:   Status s = Abort();
             :   if (PREDICT_FALSE(!s.ok())) {
             :     LOG_WITH_PREFIX(FATAL) << "Failed to fully clean up tablet after aborted copy: "
             :                            << s.ToString();
             :   }
> Why not just WARN_NOT_OK()? Why FATAL?
I was trying to keep the existing behavior that TSTabletManager exhibits on failure, but you're right -- this doesn't really need to be a fatal failure. I'll change it to WARNING.


Line 212:   // Set the data state to COPYING by default.
> Maybe "Set the data state to COPYING to signify that, on crash, this replic
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5799/5/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 1041:   // or copying a tablet.
Nit: we're not "copying" the tablet though; we're tombstoning a tablet that was mid-copy. So this should be something like "We do not delete the superblock or the consensus metadata when tombstoning a tablet, whether it was whole or in the middle of a tablet copy."


-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


Patch Set 2:

(3 comments)

Heh, I tried to go the other way and it ended up looking complex to test. So I just moved all the deletion logic into the TabletCopyClient.

http://gerrit.cloudera.org:8080/#/c/5799/2/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

Line 294: Status TabletCopyClient::Abort() {
> It should be possible but it would likely involve a bunch of refactoring to
OK. It looks like I was able to do this and it was simpler than the alternative from a testing perspective.


http://gerrit.cloudera.org:8080/#/c/5799/2/src/kudu/tserver/tablet_copy_client.h
File src/kudu/tserver/tablet_copy_client.h:

PS2, Line 189:   bool started_;            // Session started.
             :   bool downloaded_wal_;     // WAL segments downloaded.
             :   bool downloaded_blocks_;  // Data blocks downloaded.
             :   bool finished_;           // Persisted new superblock.
> An enum would require a particular order between downloading the wal and th
Ended up doing this after all.


Line 214:   std::vector<BlockId> persisted_blocks_;
> I did it like this for a couple reasons:
I ended up taking your suggestion for this too.


-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5799

to look at the new patch set (#6).

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................

KUDU-1853. Tablet copy: Don't orphan blocks on failure

Previously, if a tablet copy failed we would orphan data blocks. This
patch makes it so that a failed tablet copy operation that does not
involve a process crash does not orphan data blocks.

This also refactors some deletion logic out of TSTabletManager so that
TabletCopyClient will tombstone partially-copied tablets when the copy
operation fails.

Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
---
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/tserver/tablet_copy-test-base.h
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_service-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
9 files changed, 224 insertions(+), 132 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/5799/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has submitted this change and it was merged.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


KUDU-1853. Tablet copy: Don't orphan blocks on failure

Previously, if a tablet copy failed we would orphan data blocks. This
patch makes it so that a failed tablet copy operation that does not
involve a process crash does not orphan data blocks.

This also refactors some deletion logic out of TSTabletManager so that
TabletCopyClient will tombstone partially-copied tablets when the copy
operation fails.

Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Reviewed-on: http://gerrit.cloudera.org:8080/5799
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/delete_table-test.cc
M src/kudu/tserver/tablet_copy-test-base.h
M src/kudu/tserver/tablet_copy_client-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_service-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
9 files changed, 224 insertions(+), 132 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1853. Tablet copy: Don't orphan blocks on failure

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1853. Tablet copy: Don't orphan blocks on failure
......................................................................


Patch Set 4:

(2 comments)

Looks good, though the test failures look real.

http://gerrit.cloudera.org:8080/#/c/5799/4/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

PS4, Line 114:   Status s = Abort();
             :   if (PREDICT_FALSE(!s.ok())) {
             :     LOG_WITH_PREFIX(FATAL) << "Failed to fully clean up tablet after aborted copy: "
             :                            << s.ToString();
             :   }
Why not just WARN_NOT_OK()? Why FATAL?


Line 212:   // Set the data state to COPYING by default.
Maybe "Set the data state to COPYING to signify that, on crash, this replica should be discarded."


-- 
To view, visit http://gerrit.cloudera.org:8080/5799
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbc0720d460de8d912a241a5186ef6e8f524b830
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes