You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2017/06/02 00:41:50 UTC

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Hello Mike Percy,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/7049

to review the following change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................

tool: add a 'local-replica cmeta set-term' tool

Through abuse of some other tools (and restoring from a backup of a
cmeta file) I ended up with a situation where a tablet's WAL contained
operations from a term higher than the term listed in its cmeta file.
This caused the tablet to fail to start due to seeing this inconsistency
(the highest term in the WAL should always be <= the term in the cmeta).

This patch adds a tool that I wrote in order to recover from the
situation. The tool allows the operator to override the term stored in
the cmeta file. It's restricted to only bumping the term upwards, since
doing so is always "safe" whereas reducing it backwards could have
really bad consequences.

I also took the opportunity to add some new tests for the 'cmeta' tool
functionality.

Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_local_replica.cc
2 files changed, 109 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/7049/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 2:

Alexey, I think we're waiting on a +2 from you

-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7049/2/src/kudu/tools/tool_action_local_replica.cc
File src/kudu/tools/tool_action_local_replica.cc:

PS2, Line 291: =
I would expect term 0 were allowed.  Or it's some sort of optimization since the target term is supposed to be X+1, where X >= 0?


Line 305:         "specified term $0 must be higher than current term $1",
nit: than _the_ current term ?


PS2, Line 309:   // Make a copy of the old file before rewriting it.
             :   RETURN_NOT_OK(BackupConsensusMetadata(&fs_manager, tablet_id));
Would it be safer to first backup the existing metadata and load the cmeta from the backup file?  Or, maybe, add an extra  verification step into BackupConsensusMetadata() just to verify the backup file can be loaded after the backup is finished.

The concern is that if the source file is being updated, the generated backup might be inconsistent.


-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7049/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

Line 1486:   ASSERT_OK(mini_cluster_->WaitForTabletServerCount(1));
> This is guaranteed by MiniCluster::Start().
Done


Line 1518:                                              flags, tablet_id)));
> can we dump the pbc and validate that it's really 123 now?
Done


http://gerrit.cloudera.org:8080/#/c/7049/1/src/kudu/tools/tool_action_local_replica.cc
File src/kudu/tools/tool_action_local_replica.cc:

Line 306: 
> nit: clean up the spacing in this function a little
Done


Line 310:   cmeta->clear_voted_for();
> Why is it important to do this? I can take my answer in the form of a code 
Done


Line 798:       .Description("Bump the current term stored in consensus metadata")
> Maybe add that the new term must be greater than the old one? Either here o
Done


PS1, Line 800: "term"
> Since this is used in two places, our convention has been to hide it behind
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 1:

(2 comments)

I think this is a good idea because it's a relatively safe thing and we should really only suggest editing of the cmeta files as a last resort.

http://gerrit.cloudera.org:8080/#/c/7049/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

Line 1518:                                              flags, tablet_id)));
can we dump the pbc and validate that it's really 123 now?


http://gerrit.cloudera.org:8080/#/c/7049/1/src/kudu/tools/tool_action_local_replica.cc
File src/kudu/tools/tool_action_local_replica.cc:

Line 306: 
nit: clean up the spacing in this function a little


-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


tool: add a 'local-replica cmeta set-term' tool

Through abuse of some other tools (and restoring from a backup of a
cmeta file) I ended up with a situation where a tablet's WAL contained
operations from a term higher than the term listed in its cmeta file.
This caused the tablet to fail to start due to seeing this inconsistency
(the highest term in the WAL should always be <= the term in the cmeta).

This patch adds a tool that I wrote in order to recover from the
situation. The tool allows the operator to override the term stored in
the cmeta file. It's restricted to only bumping the term upwards, since
doing so is always "safe" whereas reducing it backwards could have
really bad consequences.

I also took the opportunity to add some new tests for the 'cmeta' tool
functionality.

Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Reviewed-on: http://gerrit.cloudera.org:8080/7049
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: Mike Percy <mp...@apache.org>
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_local_replica.cc
2 files changed, 122 insertions(+), 11 deletions(-)

Approvals:
  Mike Percy: Looks good to me, but someone else must approve
  Adar Dembo: Looks good to me, but someone else must approve
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 1:

curious whether you think this is still useful considering the 'kudu pbc edit' that I also have up for review.

This one is a little more constrained, which may be useful, but maybe not worth it vs just having the editor?

-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 2: Code-Review+1

Deferring to Mike.

-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 1:

(4 comments)

> curious whether you think this is still useful considering the
 > 'kudu pbc edit' that I also have up for review.
 > 
 > This one is a little more constrained, which may be useful, but
 > maybe not worth it vs just having the editor?

I definitely think this is useful, despite a more free-form editing tool.

http://gerrit.cloudera.org:8080/#/c/7049/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

Line 1486:   ASSERT_OK(mini_cluster_->WaitForTabletServerCount(1));
This is guaranteed by MiniCluster::Start().


http://gerrit.cloudera.org:8080/#/c/7049/1/src/kudu/tools/tool_action_local_replica.cc
File src/kudu/tools/tool_action_local_replica.cc:

Line 310:   cmeta->clear_voted_for();
Why is it important to do this? I can take my answer in the form of a code comment.


Line 798:       .Description("Bump the current term stored in consensus metadata")
Maybe add that the new term must be greater than the old one? Either here or in ExtraDescription.


PS1, Line 800: "term"
Since this is used in two places, our convention has been to hide it behind a kBlahArg constant.


-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................


Patch Set 2: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7049/2/src/kudu/tools/tool_action_local_replica.cc
File src/kudu/tools/tool_action_local_replica.cc:

PS2, Line 291: =
> I would expect term 0 were allowed.  Or it's some sort of optimization sinc
Yeah, the lowest term is 0 and you can't set the term <= the current term, so 0 shouldn't be allowed in any case.


Line 305:         "specified term $0 must be higher than current term $1",
> nit: than _the_ current term ?
Interesting. It sounds ok to me as-is in the context of a computer error message but in spoken language (or even in a comment) it would sound wrong. I think in this context things like definite articles are sometimes clipped in order to make the message shorter. That said, I could go either way on this.


PS2, Line 309:   // Make a copy of the old file before rewriting it.
             :   RETURN_NOT_OK(BackupConsensusMetadata(&fs_manager, tablet_id));
> Would it be safer to first backup the existing metadata and load the cmeta 
We have an FsManager lock from L297-L298 so nobody else should be messing with the files. If we didn't have that lock, this would definitely not be safe to do.


-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] tool: add a 'local-replica cmeta set-term' tool

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7049

to look at the new patch set (#2).

Change subject: tool: add a 'local-replica cmeta set-term' tool
......................................................................

tool: add a 'local-replica cmeta set-term' tool

Through abuse of some other tools (and restoring from a backup of a
cmeta file) I ended up with a situation where a tablet's WAL contained
operations from a term higher than the term listed in its cmeta file.
This caused the tablet to fail to start due to seeing this inconsistency
(the highest term in the WAL should always be <= the term in the cmeta).

This patch adds a tool that I wrote in order to recover from the
situation. The tool allows the operator to override the term stored in
the cmeta file. It's restricted to only bumping the term upwards, since
doing so is always "safe" whereas reducing it backwards could have
really bad consequences.

I also took the opportunity to add some new tests for the 'cmeta' tool
functionality.

Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_local_replica.cc
2 files changed, 122 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/7049/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7049
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7525ffbe772f214e0972a6b450f3f1609109ca05
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>