You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2019/10/04 01:48:33 UTC

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14368


Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................

KUDU-2069 p7: add tool to enter/exit maintenance

Adds the following tools:
$ kudu tserver set_state enter_maintenance <masters> <ts_id>
$ kudu tserver set_state exit_maintenance <masters> <ts_id>

Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
---
M src/kudu/client/master_proxy_rpc.cc
M src/kudu/master/ts_manager.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tserver.cc
5 files changed, 118 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/14368/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@277
PS1, Line 277:       .AddMode(std::move(set_state))
> To me, 'state' is more about the result some actions, etc.  State is a more
I agree re: state, which is why I think it's more appropriate. Eg when you decommission a node (action), the end state is that it's decommissioned (state).

I disagree w.r.t this CLI tool specifically. If the current state isn't what the user expects, e.g. the edge on the "tserver state state machine" doesn't exist, the tool should fail. And I think the user should be clear what edge they want to go through with, since some transitions we might not want to support (e.g. decommissioned --> none? decommissioned --> maintenance mode? We _might_ decide to support these, but we might not, and I think having the user specify that they are ok with the transition by providing the actual transition via CLI is more future proof).



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 04 Oct 2019 21:37:28 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@277
PS1, Line 277:       .AddMode(std::move(set_state))
> To me, 'mode' sounds user-friendlier as well.
To me, mode seems like a mode of operation. "Fast mode", "slow mode", "debug mode", etc., rather than transient states of being.

I think calling out the explicit transition is preferable over the state because it means only the transition the user wants to make is performed, and if the transition isn't viable anymore (e.g. because the tserver isn't in the state anymore), an error should be raised instead.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 04 Oct 2019 21:20:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@277
PS1, Line 277:       .AddMode(std::move(set_state))
> Yep, having strict operation mode might be an options as well  (like return
I favor strict for both cases.

If the tserver is concurrently decommissioned, did the operator mean to set it back to none? Probably not. If they're trying to enter maintenance mode and it's already there, it's a no op.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 05 Oct 2019 04:06:49 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/master.proto@867
PS4, Line 867: DONT_IGNORE_MISSING_TSERVER
It seems DONT_IGNORE_MISSING_TSERVER is not a member of enumeration above.


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.h
File src/kudu/master/ts_manager.h:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.h@44
PS4, Line 44: // Whether to allow the setting of a tserver state when the requested tserver
            : // hasn't been registered. This may be useful, e.g. if setting the state after
            : // restarting the master.
            : enum SetStateWhenTSDoesntExist {
            :   YES,
            :   NO,
            : };
Is it needed once the mirroring enumeration was introduced in *.proto?



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 17:23:36 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/master.proto@867
PS4, Line 867: DONT_IGNORE_MISSING_TSERVER
> It seems DONT_IGNORE_MISSING_TSERVER is not a member of enumeration above.
Done


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.h
File src/kudu/master/ts_manager.h:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.h@44
PS4, Line 44: // Whether to allow the setting of a tserver state when the requested tserver
            : // hasn't been registered. This may be useful, e.g. if setting the state after
            : // restarting the master.
            : enum SetStateWhenTSDoesntExist {
            :   YES,
            :   NO,
            : };
> Is it needed once the mirroring enumeration was introduced in *.proto?
Oops, good catch :)


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.cc@238
PS4, Line 238:  if (existing_state == ts_state) {
> Just to clarify: it seems ChangeTServerStateRequestPB is not affecting the 
Yep, that's intentional, since it is a no-op, and there thus isn't any harm in allowing it.


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/kudu-tool-test.cc@4791
PS4, Line 4791:   ASSERT_TRUE(RunActionStderrString(
> Do you want to be explicit that exiting the maintenance mode for non-existi
Done


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@44
PS4, Line 44: sets
> drop?
Done


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@45
PS4, Line 45: tserver even if the tablet server
> style nit: use either 'tserver' or 'tablet server', but not both
Done


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@46
PS4, Line 46: state
> state record ?
Done


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@258
PS4, Line 258:  
> nit: maybe 'While in maintenance, failures of the ...'  ?
Done, reworded the message a bit


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@259
PS4, Line 259: recovery
> nit: do you want to be more specific, i.e. change recovery --> re-replicati
Done


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@265
PS4, Line 265: --allow_missing
> Nit: --allow_missing_tserver
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 20:19:44 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc@4784
PS1, Line 4784:   NO_FATALS(RunActionStdoutNone(
              :       Substitute("tserver set_state enter_maintenance $0 $1",
              :                  mini_master->bound_rpc_addr().ToString(), kDummyUuid)));
> Can you provide an example where this is a feature rather than a bug?
The series of events for a rolling restart might be scripted as:

restart the masters in series
for t in tservers:
  enter maintenance mode(t)
  stop, start, wait for bootstrap(t)
  exit maintenance mode(t)

In between starting the masters and the tservers getting the heartbeat back, should the tool fail? Should it wait? Can we guarantee that the tserver heartbeat will have been processed within a certain amount of time?

OTOH scripting retries might not be a bad option.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 05 Oct 2019 01:58:59 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/master/ts_manager.h
File src/kudu/master/ts_manager.h:

http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/master/ts_manager.h@44
PS2, Line 44: // Whether to allow the setting of a tserver state when the requested tserver
            : // hasn't been registered. This may be useful, e.g. if setting the state after
            : // restarting the master.
            : enum SetStateWhenTSDoesntExist {
            :   YES,
            :   NO,
            : };
> If you like you can declare this as part of the wire protocol in master.pro
Done


http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/master/ts_manager.cc@234
PS2, Line 234: B::HandleMissingTS handl
> Nit: "set" here is a little ambiguous: does it refer to the variable itself
Done

I went with ALLOW instead because IGNORE is a bit ambiguous IMO


http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/tools/tool_action_tserver.cc@26
PS2, Line 26: #include <gflags/gflags.h>
> warning: #includes are not sorted properly [llvm-include-order]
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 17:13:03 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/master/ts_manager.h
File src/kudu/master/ts_manager.h:

http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/master/ts_manager.h@44
PS2, Line 44: // Whether to allow the setting of a tserver state when the requested tserver
            : // hasn't been registered. This may be useful, e.g. if setting the state after
            : // restarting the master.
            : enum SetStateWhenTSDoesntExist {
            :   YES,
            :   NO,
            : };
If you like you can declare this as part of the wire protocol in master.proto and avoid the conversion between it and bool in master_service.cc.


http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14368/2/src/kudu/master/ts_manager.cc@234
PS2, Line 234: set_when_ts_doesnt_exist
Nit: "set" here is a little ambiguous: does it refer to the variable itself (which is either set or unset), or does it refer to the desired behavior of the function? I know it's the latter, but I can't help but be tripped up whenever I read the variable name.

Maybe generalize it to a "mode" whose values include IGNORE_MISSING_TSERVER and DONT_IGNORE_MISSING_TSERVER?



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 16:46:03 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@277
PS1, Line 277:       .AddMode(std::move(set_state))
Often we have called these states "modes". Like "maintenance mode", "decommission mode", etc. That may be another option to consider. 

Instead of `exit_maintenance` would it make sense to have a "standard" running mode? Then instead of existing maintenance mode you set back to the standard one? Or perhaps even go from maintenance to decommission in the future.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 04 Oct 2019 20:49:50 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.cc@238
PS4, Line 238:  if (existing_state == ts_state) {
Just to clarify: it seems ChangeTServerStateRequestPB is not affecting the case when  (existing_state == ts_state).  Is it intentional?


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/kudu-tool-test.cc@4791
PS4, Line 4791:   ASSERT_TRUE(RunActionStderrString(
Do you want to be explicit that exiting the maintenance mode for non-existing server is a success regardless of --allow_missing_tserver and no record for the server is there?  The scenario at line 4817 is close, but a bit different: it puts non-existing tserver out of the maintenance mode when it was in the maintenance mode.


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@44
PS4, Line 44: sets
drop?


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@45
PS4, Line 45: tserver even if the tablet server
style nit: use either 'tserver' or 'tablet server', but not both


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@46
PS4, Line 46: state
state record ?


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@258
PS4, Line 258:  
nit: maybe 'While in maintenance, failures of the ...'  ?


http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@259
PS4, Line 259: recovery
nit: do you want to be more specific, i.e. change recovery --> re-replication ?



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 17:48:34 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@249
PS1, Line 249:   unique_ptr<Action> enter_maintenance =
             :       ActionBuilder("enter_maintenance", &EnterMaintenance)
             :       .Description("Put the Tablet Server in maintenance. Failures of the "
             :                    "Tablet Server will not lead to the recovery of tablet "
             :                    "replicas.")
             :       .AddRequiredParameter({ kMasterAddressesArg, kMasterAddressesArgDesc })
             :       .AddRequiredParameter({ kTServerIdArg, kTServerIdDesc })
             :       .Build();
             :   unique_ptr<Action> exit_maintenance =
             :       ActionBuilder("exit_maintenance", &ExitMaintenance)
             :       .Description("End maintenance of the Tablet Server.")
             :       .AddRequiredParameter({ kMasterAddressesArg, kMasterAddressesArgDesc })
             :       .AddRequiredParameter({ kTServerIdArg, kTServerIdDesc })
             :       .Build();
             :   unique_ptr<Mode> set_state = ModeBuilder("set_state")
             :       .Description("Set the state for a Kudu Tablet Server")
             :       .AddAction(std::move(enter_maintenance))
             :       .AddAction(std::move(exit_maintenance))
             :       .Build();
> I was thinking it'd look like:
Yep, I don't see why to have them separate either.


http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@277
PS1, Line 277:       .AddMode(std::move(set_state))
> Often we have called these states "modes". Like "maintenance mode", "decomm
To me, 'mode' sounds user-friendlier as well.

The approach with 'set_mode <maintenance|decomission|normal>' looks like a good alternative (that's what Grant suggested, as I understand).  It's sounds viable unless we are about to allow superposition of maintenance and decommission modes (and I don't think we want to allow such superposition, anyways).

Walking futher the bike-shedding alley, the CLI interface might be even as:

  'kudu tserver mode get'
  'kudu tserver mode set <maintenance|decomission|normal>'



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 04 Oct 2019 21:04:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/master/ts_manager.cc@248
PS1, Line 248:   }
> Should we LOG(INFO) here too?
Done


http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc@94
PS1, Line 94: #include "kudu/master/master.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc@4784
PS1, Line 4784:     ASSERT_EQ(TServerStatePB::MAINTENANCE_MODE, ts_manager->GetTServerState(ts_uuid));
              :     ASSERT_EQ(TServerStatePB::NONE, ts_manager->GetTServerState(kDummyUuid));
              :   }
> The series of events for a rolling restart might be scripted as:
We chatted about this on slack a bit. I've added a flag to allow users to force setting maintenance mode even the requested tserver doesn't exist. By default, this will error out (though the below command shouldn't fail, since there is state to remove).


http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc@4795
PS1, Line 4795:   ASSERT_EQ(TServerStatePB::MAINTENANCE_MODE, ts_manager->GetTServerState(ts_uuid));
              :   ASSERT_EQ(TServerStatePB::NONE, ts_manager->GetTServerState(kDummyUuid));
              : 
              :   // But operators are able to specify a flag to force the maintenance mo
> Likewise, why doesn't this fail?
Ack



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 07:45:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Grant Henke, Hannah Nguyen, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14368

to look at the new patch set (#5).

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................

KUDU-2069 p7: add tool to enter/exit maintenance

Adds the following tools:
$ kudu tserver state enter_maintenance <masters> <ts_id>
$ kudu tserver state exit_maintenance <masters> <ts_id>

This opts to have users specify the transition they expect to enact,
rather than the desired end state. This is because I expect the same
tool to be used for decommissioning. When that materializes, it will be
valuable to have users specify their expected starting state to avoid
accidentally overwriting tserver states unintentionally.

It also adds an option into the TSManager to allow the tool to raise an
error when running it with a tserver that doesn't exist. This isn't
always desired behavior of the TSManager because we will reload tserver
states when the master first starts up, at which point tservers may not
be registered yet. It is accessible through a CLI tool optional
argument.

Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
---
M src/kudu/client/master_proxy_rpc.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tserver.cc
9 files changed, 179 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/14368/5
-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/master/ts_manager.cc@248
PS1, Line 248:     return Status::OK();
Should we LOG(INFO) here too?


http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc@4784
PS1, Line 4784:   NO_FATALS(RunActionStdoutNone(
              :       Substitute("tserver set_state enter_maintenance $0 $1",
              :                  mini_master->bound_rpc_addr().ToString(), kDummyUuid)));
Why doesn't this fail in some way?


http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc@4795
PS1, Line 4795: 
              :   NO_FATALS(RunActionStdoutNone(
              :       Substitute("tserver set_state exit_maintenance $0 $1",
              :                  mini_master->bound_rpc_addr().ToString(), kDummyUuid)));
Likewise, why doesn't this fail?



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 05 Oct 2019 01:21:48 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

> (1 comment)

OK, that sounds good to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 04 Oct 2019 20:55:32 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc@4784
PS1, Line 4784:   NO_FATALS(RunActionStdoutNone(
              :       Substitute("tserver set_state enter_maintenance $0 $1",
              :                  mini_master->bound_rpc_addr().ToString(), kDummyUuid)));
> I don't think we can say with certainty that a tablet server will have alwa
Can you provide an example where this is a feature rather than a bug?

Would also be good to note this in a comment near the server-side code where this is allowed.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 05 Oct 2019 01:40:40 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Grant Henke, Hannah Nguyen, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14368

to look at the new patch set (#3).

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................

KUDU-2069 p7: add tool to enter/exit maintenance

Adds the following tools:
$ kudu tserver state enter_maintenance <masters> <ts_id>
$ kudu tserver state exit_maintenance <masters> <ts_id>

This opts to have users specify the transition they expect to enact,
rather than the desired end state. This is because I expect the same
tool to be used for decommissioning. When that materializes, it will be
valuable to have users specify their expected starting state to avoid
accidentally overwriting tserver states unintentionally.

It also adds an option into the TSManager to allow the tool to raise an
error when running it with a tserver that doesn't exist. This isn't
always desired behavior of the TSManager because we will reload tserver
states when the master first starts up, at which point tservers may not
be registered yet. It is accessible through a CLI tool optional
argument.

Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
---
M src/kudu/client/master_proxy_rpc.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tserver.cc
9 files changed, 184 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/14368/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/master/ts_manager.cc@238
PS4, Line 238:  if (existing_state == ts_state) {
> Yep, that's intentional, since it is a no-op, and there thus isn't any harm
Thank you for the clarification.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 22:38:12 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@277
PS1, Line 277:       .AddMode(std::move(set_state))
> To me, mode seems like a mode of operation. "Fast mode", "slow mode", "debu
To me, 'state' is more about the result some actions, etc.  State is a more detailed concept describing all the parts of an entity.  From that perspective, I don't think 'set state' is a right name for the command here.

As for the explicit transition, I think in CLI tool it's better to have an option to have idempotent mode of operations as well: that helps in scripting scenarios.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 04 Oct 2019 21:32:03 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Grant Henke, Hannah Nguyen, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14368

to look at the new patch set (#4).

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................

KUDU-2069 p7: add tool to enter/exit maintenance

Adds the following tools:
$ kudu tserver state enter_maintenance <masters> <ts_id>
$ kudu tserver state exit_maintenance <masters> <ts_id>

This opts to have users specify the transition they expect to enact,
rather than the desired end state. This is because I expect the same
tool to be used for decommissioning. When that materializes, it will be
valuable to have users specify their expected starting state to avoid
accidentally overwriting tserver states unintentionally.

It also adds an option into the TSManager to allow the tool to raise an
error when running it with a tserver that doesn't exist. This isn't
always desired behavior of the TSManager because we will reload tserver
states when the master first starts up, at which point tservers may not
be registered yet. It is accessible through a CLI tool optional
argument.

Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
---
M src/kudu/client/master_proxy_rpc.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tserver.cc
9 files changed, 184 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/14368/4
-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................

KUDU-2069 p7: add tool to enter/exit maintenance

Adds the following tools:
$ kudu tserver state enter_maintenance <masters> <ts_id>
$ kudu tserver state exit_maintenance <masters> <ts_id>

This opts to have users specify the transition they expect to enact,
rather than the desired end state. This is because I expect the same
tool to be used for decommissioning. When that materializes, it will be
valuable to have users specify their expected starting state to avoid
accidentally overwriting tserver states unintentionally.

It also adds an option into the TSManager to allow the tool to raise an
error when running it with a tserver that doesn't exist. This isn't
always desired behavior of the TSManager because we will reload tserver
states when the master first starts up, at which point tservers may not
be registered yet. It is accessible through a CLI tool optional
argument.

Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Reviewed-on: http://gerrit.cloudera.org:8080/14368
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/client/master_proxy_rpc.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tserver.cc
9 files changed, 179 insertions(+), 2 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, but someone else must approve
  Adar Dembo: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@249
PS1, Line 249:   unique_ptr<Action> enter_maintenance =
             :       ActionBuilder("enter_maintenance", &EnterMaintenance)
             :       .Description("Put the Tablet Server in maintenance. Failures of the "
             :                    "Tablet Server will not lead to the recovery of tablet "
             :                    "replicas.")
             :       .AddRequiredParameter({ kMasterAddressesArg, kMasterAddressesArgDesc })
             :       .AddRequiredParameter({ kTServerIdArg, kTServerIdDesc })
             :       .Build();
             :   unique_ptr<Action> exit_maintenance =
             :       ActionBuilder("exit_maintenance", &ExitMaintenance)
             :       .Description("End maintenance of the Tablet Server.")
             :       .AddRequiredParameter({ kMasterAddressesArg, kMasterAddressesArgDesc })
             :       .AddRequiredParameter({ kTServerIdArg, kTServerIdDesc })
             :       .Build();
             :   unique_ptr<Mode> set_state = ModeBuilder("set_state")
             :       .Description("Set the state for a Kudu Tablet Server")
             :       .AddAction(std::move(enter_maintenance))
             :       .AddAction(std::move(exit_maintenance))
             :       .Build();
BTW, how is decommissioning going to fit in there in the future?

Also, a bit of bike-shedding: 

Maybe, instead of 'set_state' have 'maintenance', and instead of 'enter_maintenance'/'exit_maintenance' have 'enter'/'exit' correspondingly.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 04 Oct 2019 19:21:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@277
PS1, Line 277:       .AddMode(std::move(set_state))
> I favor strict for both cases.
I would prefer non-strict mode in both cases, and I think that's fine if somebody would prefer strict, but I don't understand why.  However, if I don't have an option to use non-strict option, I would prefer to use some other tool.

If the tserver is concurrently decommissioned I think it's a race, and the strict mode would not help there.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 05 Oct 2019 04:37:19 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 22:39:51 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@249
PS1, Line 249:   unique_ptr<Action> enter_maintenance =
             :       ActionBuilder("enter_maintenance", &EnterMaintenance)
             :       .Description("Put the Tablet Server in maintenance. Failures of the "
             :                    "Tablet Server will not lead to the recovery of tablet "
             :                    "replicas.")
             :       .AddRequiredParameter({ kMasterAddressesArg, kMasterAddressesArgDesc })
             :       .AddRequiredParameter({ kTServerIdArg, kTServerIdDesc })
             :       .Build();
             :   unique_ptr<Action> exit_maintenance =
             :       ActionBuilder("exit_maintenance", &ExitMaintenance)
             :       .Description("End maintenance of the Tablet Server.")
             :       .AddRequiredParameter({ kMasterAddressesArg, kMasterAddressesArgDesc })
             :       .AddRequiredParameter({ kTServerIdArg, kTServerIdDesc })
             :       .Build();
             :   unique_ptr<Mode> set_state = ModeBuilder("set_state")
             :       .Description("Set the state for a Kudu Tablet Server")
             :       .AddAction(std::move(enter_maintenance))
             :       .AddAction(std::move(exit_maintenance))
             :       .Build();
> BTW, how is decommissioning going to fit in there in the future?
I was thinking it'd look like:

$ kudu tserver set_state decommission

or something like that. These states are tied together. I don't want maintenance mode to be a separate tool from decom without good reason.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 04 Oct 2019 19:33:43 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

> Patch Set 1:
> 
> (1 comment)

How about a different "race", though imagine it spanning several minutes:
user1: state enter_maintenance
user2: state decommission
user1: state exit_maintenance (this should fail and user1 should look into it!)

Compare this to:
user1: state maintenance_mode
user2: state decommissioned
user1: state none (this doesn't fail, and it undid what user2 did! was that intentional?)

That isn't to say that exit_maintenance will only work if the server is in maintenance mode. If the state is already at none, exit_maintenance should no-op; with that, I don't think this is user-unfriendly, at least given the scenarios you've put out.


-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 05 Oct 2019 05:03:49 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/tool_action_tserver.cc@277
PS1, Line 277:       .AddMode(std::move(set_state))
> I agree re: state, which is why I think it's more appropriate. Eg when you 
Yep, having strict operation mode might be an options as well  (like returning an error on request to set a maintenance mode if it's already set).  I'm fine with either.  I'm more concerned about usability of our tools.

A simple example scenario: an operator put number of tablet servers into the maintenance mode.  Now, everything is clear and they want to simply return every tablet server back into the normal mode.  Why would they care about exact state transitions?

Another example: they were running a script to switch a few tablet servers into the maintenance mode using kudu CLI, an error happened somewhere in the middle (a network issue or simple typo?), they fix it (the network or the typo) and run the script again.  If we are strict with those transitions, it will fail again because the first tablet server is already in the maintenance mode.  OK, now they need to spend more time to find that out and update the input list for the script.  And after all manipulations, run some extra checks to make sure they didn't mess up something else.  Being an operator, would you prefer the strict mode or idempotent behavior of the tool?



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 05 Oct 2019 04:03:16 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Grant Henke, Hannah Nguyen, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14368

to look at the new patch set (#2).

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................

KUDU-2069 p7: add tool to enter/exit maintenance

Adds the following tools:
$ kudu tserver state enter_maintenance <masters> <ts_id>
$ kudu tserver state exit_maintenance <masters> <ts_id>

This opts to have users specify the transition they expect to enact,
rather than the desired end state. This is because I expect the same
tool to be used for decommissioning. When that materializes, it will be
valuable to have users specify their expected starting state to avoid
accidentally overwriting tserver states unintentionally.

It also adds an option into the TSManager to allow the tool to raise an
error when running it with a tserver that doesn't exist. This isn't
always desired behavior of the TSManager because we will reload tserver
states when the master first starts up, at which point tservers may not
be registered yet. It is accessible through a CLI tool optional
argument.

Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
---
M src/kudu/client/master_proxy_rpc.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_tserver.cc
9 files changed, 180 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/14368/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14368/1/src/kudu/tools/kudu-tool-test.cc@4784
PS1, Line 4784:   NO_FATALS(RunActionStdoutNone(
              :       Substitute("tserver set_state enter_maintenance $0 $1",
              :                  mini_master->bound_rpc_addr().ToString(), kDummyUuid)));
> Why doesn't this fail in some way?
I don't think we can say with certainty that a tablet server will have always registered with the master before a user or a script puts it in maintenance mode. So it's allowed for users to set state for a tablet server that doesn't exist yet.



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 05 Oct 2019 01:36:32 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/14368/4/src/kudu/tools/tool_action_tserver.cc@265
PS4, Line 265: --allow_missing
Nit: --allow_missing_tserver



-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 08 Oct 2019 17:31:13 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2069 p7: add tool to enter/exit maintenance

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14368 )

Change subject: KUDU-2069 p7: add tool to enter/exit maintenance
......................................................................


Patch Set 1:

> > Patch Set 1:
 > >
 > > (1 comment)
 > 
 > How about a different "race", though imagine it spanning several
 > minutes:
 > user1: state enter_maintenance
 > user2: state decommission
 > user1: state exit_maintenance (this should fail and user1 should
 > look into it!)
 > 
 > Compare this to:
 > user1: state maintenance_mode
 > user2: state decommissioned
 > user1: state none (this doesn't fail, and it undid what user2 did!
 > was that intentional?)
 > 
 > That isn't to say that exit_maintenance will only work if the
 > server is in maintenance mode. If the state is already at none,
 > exit_maintenance should no-op; with that, I don't think this is
 > user-unfriendly, at least given the scenarios you've put out.

Yep, in that sense 'strict' mode makes sense.

It seems there is a difference in terms/naming: I thought 'permissive' is when request for switching from mode to mode is allowed to be a no-op, but it seems you say it would be a no-op anyways.  I thought 'strict' was when no-ops were prohibited.


I thought in the scenarios you mention there would be an error when trying to switch to 'decommission' from 'maintenance' mode, since  we wouldn't allow going from of maintenance to decommission modes and vice versa.  However, I think that's a good example regardless.


-- 
To view, visit http://gerrit.cloudera.org:8080/14368
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifab64dcb5b7b0d171502c9154c83113c82e5972d
Gerrit-Change-Number: 14368
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 07 Oct 2019 21:12:18 +0000
Gerrit-HasComments: No