Posted to reviews@kudu.apache.org by "Zoltan Chovan (Code Review)" <ge...@cloudera.org> on 2022/04/22 11:46:00 UTC

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Zoltan Chovan has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18441 )


Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

This change introduces two new states for TServers for decommissioning.
The states are:
	* DECOMMISSIONING_IN_PROGRESS
	* DECOMMISSIONED
In these states the TServer is quiesced and new replicas are prevented from
being placed on it. When a TServer is being decommissioned, first it
enters the DECOMMISSIONING_IN_PROGRESS state, then a rebalancer job is
started to move replicas away from every TServer that is in this state.
When the rebalancing is done, the TServers move to DECOMMISSIONED state
and will stay in it until they are removed from the cluster via the 'kudu
tserver unregister' tool.
Decommissioning can be started with the following CLI command:

$ kudu tserver state enter_decommissioning <master_addresses>
        <tserver_addresses> [-allow_missing_tserver]
        [-negotiation_timeout_ms=<ms>] [-timeout_ms=<ms>]

Multiple TServers can be decommissioned at once by providing their
addresses as a comma-separated list. This is preferred over decommissioning
them one by one because it avoids repeated replica moves between
successive decommissions.
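
The lifecycle described in this message can be read as a small state machine. The following is a minimal sketch of that reading, not the code in the patch: the two decommissioning state names come from the commit message, while `NONE`, `Event`, `Apply`, and `AcceptsNewReplicas` are hypothetical names introduced only for illustration.

```cpp
#include <optional>

// DECOMMISSIONING_IN_PROGRESS and DECOMMISSIONED are named in the commit
// message; NONE is an assumed "normal" state.
enum class TServerState {
  NONE,
  DECOMMISSIONING_IN_PROGRESS,
  DECOMMISSIONED,
};

// Hypothetical events that drive the transitions described in the message.
enum class Event {
  ENTER_DECOMMISSIONING,  // operator runs 'kudu tserver state enter_decommissioning'
  REBALANCING_DONE,       // the rebalancer has moved all replicas away
};

// Returns the new state, or std::nullopt if the event does not apply
// to the current state.
std::optional<TServerState> Apply(TServerState current, Event event) {
  switch (current) {
    case TServerState::NONE:
      if (event == Event::ENTER_DECOMMISSIONING) {
        return TServerState::DECOMMISSIONING_IN_PROGRESS;
      }
      return std::nullopt;
    case TServerState::DECOMMISSIONING_IN_PROGRESS:
      if (event == Event::REBALANCING_DONE) {
        return TServerState::DECOMMISSIONED;
      }
      return std::nullopt;
    case TServerState::DECOMMISSIONED:
      // Stays here until the server is unregistered from the cluster.
      return std::nullopt;
  }
  return std::nullopt;
}

// In both decommissioning states, new replicas must not be placed on the server.
bool AcceptsNewReplicas(TServerState state) {
  return state == TServerState::NONE;
}
```

In this sketch a decommissioned server has no outgoing transition, which matches the statement that it stays in that state until removed with 'kudu tserver unregister'.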

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
15 files changed, 562 insertions(+), 150 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/1
-- 
To view, visit http://gerrit.cloudera.org:8080/18441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Yifan Zhang (Code Review)" <ge...@cloudera.org>.
Yifan Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18441 )

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................


Patch Set 12:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/master/auto_rebalancer-test.cc
File src/kudu/master/auto_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/master/auto_rebalancer-test.cc@285
PS12, Line 285:   Status ChangeTServerState(const string& uuid, TServerStateChangePB::StateChange change) {
Unless I'm missing something, what is this method used for?


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/kudu-tool-test.cc@8400
PS12, Line 8400: }
Maybe we should also check that no replicas remain on the decommissioned tablet server.


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/tool_action_tserver.cc@460
PS12, Line 460: RETURN_NOT_OK(IsDecommissioningInProgress(context, &decommissioningInProgress));
              : 
              :   if (decommissioningInProgress) {
              :     return Status::IllegalState("Decommissioning already in progress");
              :   }
              : 
If `decommissioningInProgress` is true, the method will return a non-OK status, right?


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/tool_action_tserver.cc@495
PS12, Line 495: /* max_moves_per_server = */ 5,
              :       /* max_staleness_interval_sec = */ 300,
              :       /* max_run_time_sec = */ 0,
Can these parameters also be configurable?



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 12
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Comment-Date: Mon, 31 Jul 2023 15:01:07 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#8).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
15 files changed, 554 insertions(+), 152 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/8
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#12).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

This change introduces two new states for TServers for decommissioning.
The states are:
   * DECOMMISSIONING_IN_PROGRESS
   * DECOMMISSIONED
In these states the TServer is quiesced and new replicas are prevented from
being placed on it. When a TServer is being decommissioned, first it
enters the DECOMMISSIONING_IN_PROGRESS state, then a rebalancer job is
started to move replicas away from every TServer that is in this state.
When the rebalancing is done, the TServers move to DECOMMISSIONED state
and will stay in it until they are removed from the cluster via the 'kudu
tserver unregister' tool.
Decommissioning can be started with the following CLI command:

$ kudu tserver state enter_decommissioning <master_addresses>
        <tserver_addresses> [-allow_missing_tserver]
        [-negotiation_timeout_ms=<ms>] [-timeout_ms=<ms>]

Multiple TServers can be decommissioned at once by providing their
addresses as a comma-separated list. This is preferred over decommissioning
them one by one because it avoids repeated replica moves between
successive decommissions.

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
14 files changed, 547 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/12
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 12
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Zoltan Chovan <zc...@cloudera.com>

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/18441 )

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................


Patch Set 9:

(13 comments)

Thanks for the contribution. I did a quick first pass; overall it looks good, but there are a few nits.

http://gerrit.cloudera.org:8080/#/c/18441/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18441/9//COMMIT_MSG@11
PS9, Line 11: 	* DECOMMISSIONING_IN_PROGRESS
nit: use spaces instead of tabs


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/integration-tests/CMakeLists.txt@34
PS9, Line 34:   ts_itest-base.cc)
nit: this change is unnecessary


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/auto_rebalancer-test.cc
File src/kudu/master/auto_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/auto_rebalancer-test.cc@70
PS9, Line 70: using kudu::master::MasterServiceProxy;
nit: is this needed?


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/master.proto@956
PS9, Line 956:   DECOMMISSIONED = 4;
can you add a comment on this as well?


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/ts_state-test.cc
File src/kudu/master/ts_state-test.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/ts_state-test.cc@69
PS9, Line 69:   switch (rand() % 3) {
missing DECOMMISSIONED case?
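
One way to keep such a random pick from silently skipping a newly added state is to index into a table of all states instead of hard-coding the modulus. This is only a sketch under assumptions: the test's actual switch body is not shown in this thread, and `kAllStates` and `PickState` are hypothetical names; the state list is assumed from the names mentioned in the review.

```cpp
#include <array>
#include <cstddef>

// Assumed set of states; DECOMMISSIONING_IN_PROGRESS and DECOMMISSIONED are
// the two added by this change.
enum class TServerState {
  NONE,
  MAINTENANCE_MODE,
  DECOMMISSIONING_IN_PROGRESS,
  DECOMMISSIONED,
};

// Single table listing every state the test should be able to pick.
constexpr std::array<TServerState, 4> kAllStates = {
    TServerState::NONE,
    TServerState::MAINTENANCE_MODE,
    TServerState::DECOMMISSIONING_IN_PROGRESS,
    TServerState::DECOMMISSIONED,
};

// Maps an arbitrary random value onto one of the states. The modulus follows
// the table size, so adding a state to the table is enough to cover it.
TServerState PickState(unsigned int random_value) {
  return kAllStates[random_value % kAllStates.size()];
}
```

A test could then call `PickState(rand())` where it currently switches on `rand() % 3`.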


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/ts_state-test.cc@305
PS9, Line 305: deco_in_prog
nit: use the full name


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.h
File src/kudu/rebalance/rebalancer.h:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.h@163
PS9, Line 163:     TIMED_OUT
nit: leave the trailing comma


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.h@303
PS9, Line 303: 
nit: remove this line


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.cc
File src/kudu/rebalance/rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.cc@544
PS9, Line 544:  
nit: extra space


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/kudu-tool-test.cc@a7788
PS9, Line 7788: 
Is this removed intentionally? If yes, why?


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/kudu-tool-test.cc@7066
PS9, Line 7066:   std::cout << std::endl << output << std::endl;
nit: remove these here and below


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/tool_action_tserver.cc@18
PS9, Line 18: stddef.h
nit: use <cstddef> instead


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/tool_action_tserver.cc@368
PS9, Line 368:   req, &resp, "ListTabletServers", &MasterServiceProxy::ListTabletServersAsync)));
nit: indent



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 11 May 2022 23:36:33 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#4).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/decommissioning-itest.cc
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
16 files changed, 699 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/4
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#5).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/decommissioning-itest.cc
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
16 files changed, 707 insertions(+), 152 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/5
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#6).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/decommissioning-itest.cc
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
16 files changed, 707 insertions(+), 152 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/6
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#2).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
15 files changed, 546 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/2
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18441 )

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................


Patch Set 12:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/master/auto_rebalancer.cc
File src/kudu/master/auto_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/master/auto_rebalancer.cc@478
PS12, Line 478: ts_manager_->GetDecommissionedCount()
Does it mean it's OK to put replicas on tablet servers being de-commissioned?


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/master/master.proto@1042
PS12, Line 1042: At any time only one TS should be in this state to avoid resource
               :   // contention and unnecessary moves when decommissioning multiple instances.
Is this invariant enforced in the code?


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/master/ts_manager.cc
File src/kudu/master/ts_manager.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/master/ts_manager.cc@227
PS12, Line 227:   for (auto const& entry : stateMap) {
              :     // TServerStateMap entry schema is: <uuid, <state, timestamp>>
You could use structured bindings for this instead: the Kudu codebase uses C++17.
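
A minimal sketch of what that could look like, assuming a map shaped like the `<uuid, <state, timestamp>>` schema in the quoted comment. The `State` enum, the `StateMap` alias, and `CountInState` are hypothetical stand-ins, not the types used in ts_manager.cc.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the real state enum.
enum class State { NONE, DECOMMISSIONING_IN_PROGRESS, DECOMMISSIONED };

// Assumed shape, following the quoted comment: <uuid, <state, timestamp>>.
using StateMap = std::unordered_map<std::string, std::pair<State, int64_t>>;

// Counts the entries whose state equals `wanted`.
int CountInState(const StateMap& state_map, State wanted) {
  int count = 0;
  // C++17 structured bindings name the parts of each entry directly,
  // instead of going through entry.second.first.
  for (const auto& [uuid, state_and_time] : state_map) {
    const auto& [state, timestamp] = state_and_time;
    if (state == wanted) {
      ++count;
    }
  }
  return count;
}
```

With the names bound at the top of the loop, the comment explaining the entry schema becomes largely unnecessary.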


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/rebalance/rebalancer.h
File src/kudu/rebalance/rebalancer.h:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/rebalance/rebalancer.h@219
PS12, Line 219: std::string
This could return 'const std::string&' to avoid useless copying.
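
As an illustration of the general pattern, and not of the actual declaration at that line (which is not quoted here), an accessor for a string member can hand out a reference instead of a copy. `Report` and `summary()` are hypothetical names.

```cpp
#include <string>
#include <utility>

// Hypothetical class that owns a string and exposes it read-only.
class Report {
 public:
  explicit Report(std::string summary) : summary_(std::move(summary)) {}

  // Returning a const reference avoids copying the string on every call.
  // This is only safe because summary_ lives as long as the Report itself;
  // a function that builds its result locally must still return by value.
  const std::string& summary() const { return summary_; }

 private:
  std::string summary_;
};
```

Callers that need their own copy can still make one explicitly with `std::string s = report.summary();`.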


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/rebalance/rebalancer.cc
File src/kudu/rebalance/rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/rebalance/rebalancer.cc@539
PS12, Line 539: std::string Rebalancer::RunStatusAsString(Rebalancer::RunStatus run_status) {
              :   static const char *enum_str[] =
              :       { "Unknown", "Cluster is balanced", "Timed out" };
The standard way of doing this is a switch/case statement: it eliminates many of the issues that the array-based approach introduces.
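
A minimal sketch of the switch-based variant, written as a free function for brevity. `TIMED_OUT` appears elsewhere in this review; the other two enumerator names are assumptions inferred from the strings in the quoted array, and the fallback text is invented for illustration.

```cpp
#include <string>

// Enumerator names other than TIMED_OUT are assumed from the quoted strings.
enum class RunStatus {
  UNKNOWN,
  CLUSTER_IS_BALANCED,
  TIMED_OUT,
};

std::string RunStatusAsString(RunStatus run_status) {
  // With no 'default' label, compilers can warn when a newly added
  // enumerator is not handled here.
  switch (run_status) {
    case RunStatus::UNKNOWN:
      return "Unknown";
    case RunStatus::CLUSTER_IS_BALANCED:
      return "Cluster is balanced";
    case RunStatus::TIMED_OUT:
      return "Timed out";
  }
  // Reached only for a value that is not one of the enumerators, so there
  // is no out-of-range array access to worry about.
  return "Unrecognized status";
}
```

Unlike indexing into a string array, this does not depend on the enumerators' numeric values or their order.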


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/rebalance/rebalancer.cc@543
PS12, Line 543:  string tmp
nit: there is no need for this cast from char* to string


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/rebalance/rebalancer.cc@543
PS12, Line 543: run_status
What if this is out of the range of the enum_str array?


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/rebalancer_tool.cc@1625
PS12, Line 1625: maintenance mode
How does this relate to the decommissioning?  Is this message still actionable with the recent updates?


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/tool_action_common.cc
File src/kudu/tools/tool_action_common.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/tool_action_common.cc@316
PS12, Line 316: const char* const kTServerAddressesArg = "tserver_addresses";
              : const char* const kTServerAddressesDesc = "Comma-separated list of Kudu Tablet Servers where each "
              :                                           "address is of form 'hostname:port'.";
This seems to be a duplicate of what's currently in tool_action_master.cc: the only difference is the separator.  Maybe consolidate the two by moving the corresponding definitions from tool_action_master.cc into this file instead?


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/tool_action_tserver.cc@19
PS12, Line 19: 
nit: remove the extra empty line


http://gerrit.cloudera.org:8080/#/c/18441/12/src/kudu/tools/tool_action_tserver.cc@74
PS12, Line 74: UUIDs of tablet servers
In what form and what's the separator?



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 12
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Jul 2023 22:10:11 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#7).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
15 files changed, 570 insertions(+), 152 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/7
-- 
To view, visit http://gerrit.cloudera.org:8080/18441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#3).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

This change introduces two new states for TServers for decommissioning.
The states are:
	* DECOMMISSIONING_IN_PROGRESS
	* DECOMMISSIONED
In these states the TServer is quiesced and new replicas are prevented
from being placed on it. When a TServer is being decommissioned, it
first enters the DECOMMISSIONING_IN_PROGRESS state, then a rebalancer
job is started to move replicas away from every TServer that is in this
state. When the rebalancing is done, the TServers move to the
DECOMMISSIONED state and stay in it until they are removed from the
cluster via the 'kudu tserver unregister' tool.
The decommissioning can be started with the following CLI command:

$ kudu tserver state enter_decommissioning <master_addresses>
        <tserver_addresses> [-allow_missing_tserver]
        [-negotiation_timeout_ms=<ms>] [-timeout_ms=<ms>]

It is possible to decommission multiple TServers at once by providing
multiple addresses in a comma-separated list. This is preferred, because
decommissioning servers one at a time can move the same replicas more
than once.

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
15 files changed, 554 insertions(+), 152 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/3

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#11).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

This change introduces two new states for TServers for decommissioning.
The states are:
   * DECOMMISSIONING_IN_PROGRESS
   * DECOMMISSIONED
In these states the TServer is quiesced and new replicas are prevented
from being placed on it. When a TServer is being decommissioned, it
first enters the DECOMMISSIONING_IN_PROGRESS state, then a rebalancer
job is started to move replicas away from every TServer that is in this
state. When the rebalancing is done, the TServers move to the
DECOMMISSIONED state and stay in it until they are removed from the
cluster via the 'kudu tserver unregister' tool.
The decommissioning can be started with the following CLI command:

$ kudu tserver state enter_decommissioning <master_addresses>
        <tserver_addresses> [-allow_missing_tserver]
        [-negotiation_timeout_ms=<ms>] [-timeout_ms=<ms>]

It is possible to decommission multiple TServers at once by providing
multiple addresses in a comma-separated list. This is preferred, because
decommissioning servers one at a time can move the same replicas more
than once.

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
14 files changed, 546 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/11

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Zoltan Chovan <zc...@cloudera.com>

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Zoltan Chovan has posted comments on this change. ( http://gerrit.cloudera.org:8080/18441 )

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................


Patch Set 10:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/18441/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18441/9//COMMIT_MSG@11
PS9, Line 11:    * DECOMMISSIONING_IN_PROGRESS
> nit: use spaces instead of tabs
Done


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/integration-tests/CMakeLists.txt@34
PS9, Line 34:   ts_itest-base.cc
> nit: this change is unnecessary
Ack


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/auto_rebalancer-test.cc
File src/kudu/master/auto_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/auto_rebalancer-test.cc@70
PS9, Line 70: using strings::Substitute;
> nit: is this needed?
Done


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/master.proto@956
PS9, Line 956:   // enabled on the cluster, the master returns a signed authn token.
> can you add a comment on this as well?
Done


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/ts_state-test.cc
File src/kudu/master/ts_state-test.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/ts_state-test.cc@69
PS9, Line 69:   switch (rand() % 2) {
> missing DECOMMISSIONED case?
I'll change this back to % 2 as the existing tests start behaving in unpredictable ways if the DECOMMISSIONED state is added.


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/master/ts_state-test.cc@305
PS9, Line 305: DECOMMISSION
> nit: use the full name
Ack


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.h
File src/kudu/rebalance/rebalancer.h:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.h@163
PS9, Line 163:   };
> nit: leave the trailing comma
Done


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.h@303
PS9, Line 303: } // namespace kudu
> nit: remove this line
Done


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.cc
File src/kudu/rebalance/rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/rebalance/rebalancer.cc@544
PS9, Line 544: 
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/kudu-tool-test.cc@a7788
PS9, Line 7788: 
> Is this removed intentionally? If yes, why?
No, it was not intentional; I have put it back.


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/kudu-tool-test.cc@7066
PS9, Line 7066: 
> nit: remove these here and below
Done


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/tool_action_tserver.cc
File src/kudu/tools/tool_action_tserver.cc:

http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/tool_action_tserver.cc@18
PS9, Line 18: cstddef>
> nit: use <cstddef> instead
Done


http://gerrit.cloudera.org:8080/#/c/18441/9/src/kudu/tools/tool_action_tserver.cc@368
PS9, Line 368: 
> nit: indent
Done




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Comment-Date: Mon, 24 Jul 2023 11:39:51 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tserver] KUDU-1827: tserver decommission

Posted by "Zoltan Chovan (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18441

to look at the new patch set (#10).

Change subject: [tserver] KUDU-1827: tserver decommission
......................................................................

[tserver] KUDU-1827: tserver decommission

This change introduces two new states for TServers for decommissioning.
The states are:
   * DECOMMISSIONING_IN_PROGRESS
   * DECOMMISSIONED
In these states the TServer is quiesced and new replicas are prevented
from being placed on it. When a TServer is being decommissioned, it
first enters the DECOMMISSIONING_IN_PROGRESS state, then a rebalancer
job is started to move replicas away from every TServer that is in this
state. When the rebalancing is done, the TServers move to the
DECOMMISSIONED state and stay in it until they are removed from the
cluster via the 'kudu tserver unregister' tool.
The decommissioning can be started with the following CLI command:

$ kudu tserver state enter_decommissioning <master_addresses>
        <tserver_addresses> [-allow_missing_tserver]
        [-negotiation_timeout_ms=<ms>] [-timeout_ms=<ms>]

It is possible to decommission multiple TServers at once by providing
multiple addresses in a comma-separated list. This is preferred, because
decommissioning servers one at a time can move the same replicas more
than once.

Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
---
M src/kudu/master/auto_rebalancer-test.cc
M src/kudu/master/auto_rebalancer.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_common.h
M src/kudu/tools/tool_action_tserver.cc
14 files changed, 545 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/18441/10

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c52b653c20b2a3a45fbf8934f19f6bd1a9caea
Gerrit-Change-Number: 18441
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Chovan <zc...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)