Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2022/04/01 02:22:09 UTC

[kudu-CR] WIP: [tools] range rebalancing for 'kudu cluster rebalance'

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18294 )


Change subject: WIP: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................

WIP: [tools] range rebalancing for 'kudu cluster rebalance'

WIP:
  * split into logically independent patches
  * proper commit description
  * clean-up TODO(s)
  * add more docs
  * add more tests

This patch adds range rebalancing functionality to the
'kudu cluster rebalance' CLI tool.  The implementation is rather an
MVP: range rebalancing can currently be performed only for a single
table per run.  As far as I can see, there is room for improvement
since it's possible to perform range-aware replica movements even
during standard whole cluster rebalancing.

Below are two snapshots of the distribution of range-specific tablet
replicas in a cluster.  They were produced by running the tool with
the extra --report_only --output_replica_distribution_details flags
before and after range rebalancing for a single table:

  kudu cluster rebalance \
     --enable_range_rebalancing \
     --tables=default.loadgen_auto_6800f4ec4e164b2b8e42db7b5044df09 \
     127.0.0.1:8765

before:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 8
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 8
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 8
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 8

after:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2
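
As a rough illustration of the 'before' and 'after' snapshots above, the
sketch below levels the replica counts of a single range by repeatedly
moving one replica from the most loaded tablet server to the least loaded
one.  It is a simplified, assumed model and not the algorithm implemented in
rebalance_algo.cc: the names MaxSkew and LevelRange are hypothetical, and
placement constraints (for example, keeping replicas of the same tablet on
distinct servers) are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not part of Kudu: the difference between the replica
// counts of the most and the least loaded server for one range.
int MaxSkew(const std::vector<int>& counts) {
  if (counts.empty()) return 0;
  const auto mm = std::minmax_element(counts.begin(), counts.end());
  return *mm.second - *mm.first;
}

// Hypothetical helper, not part of Kudu: move one replica at a time from the
// most loaded server to the least loaded one until the skew is 0 or 1.
// Returns the number of replica moves performed.
int LevelRange(std::vector<int>* counts) {
  int moves = 0;
  while (MaxSkew(*counts) > 1) {
    const auto mm = std::minmax_element(counts->begin(), counts->end());
    --*mm.second;  // take a replica from the most loaded server
    ++*mm.first;   // place it on the least loaded server
    ++moves;
  }
  return moves;
}
```

Applied to the counts {8, 0, 0, 0} of the first range, this takes six moves
and ends at {2, 2, 2, 2}, which matches the 'after' snapshot.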

Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
---
M src/kudu/client/CMakeLists.txt
M src/kudu/client/client.h
A src/kudu/client/tablet_info_provider-internal.cc
A src/kudu/client/tablet_info_provider-internal.h
M src/kudu/rebalance/cluster_status.h
M src/kudu/rebalance/rebalance-test.cc
M src/kudu/rebalance/rebalance_algo-test.cc
M src/kudu/rebalance/rebalance_algo.cc
M src/kudu/rebalance/rebalance_algo.h
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck.h
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool.proto
M src/kudu/tools/tool_action_cluster.cc
19 files changed, 1,224 insertions(+), 322 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/18294/2
-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc@602
PS4, Line 602:           DataTable skew_table({ "UUID", "Server address", "Replica Coun
> There is a table per range -- you can see how it looks in the description o
Right, but it only shows the details for a single table, IIUC.

I meant that, instead of showing only the "Per-range replica distribution details for tables" table, maybe it's also worth adding a summary of skew per range, i.e. replica count and replica skew, similar to the "Per-table replica distribution details" table below.

I don't feel strongly about it, but I was curious since it could be helpful to get a big picture of which table ranges in the cluster need rebalancing.
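
A minimal sketch of such a per-range summary, assuming the per-server replica counts for each range are already available.  The RangeSummary type and the SummarizeRanges function are hypothetical names used only for illustration; they are not taken from rebalancer_tool.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical summary row: one per range of a table.
struct RangeSummary {
  std::string range_start_key;
  int max_skew;     // max minus min replica count across tablet servers
  int total_count;  // total number of replicas in the range
};

// Hypothetical helper: collapse per-server replica counts, keyed by range
// start key, into one summary row per range (ordered by range start key).
std::vector<RangeSummary> SummarizeRanges(
    const std::map<std::string, std::vector<int>>& counts_by_range) {
  std::vector<RangeSummary> out;
  for (const auto& entry : counts_by_range) {
    const std::vector<int>& counts = entry.second;
    int skew = 0;
    if (!counts.empty()) {
      const auto mm = std::minmax_element(counts.begin(), counts.end());
      skew = *mm.second - *mm.first;
    }
    const int total = std::accumulate(counts.begin(), counts.end(), 0);
    out.push_back(RangeSummary{entry.first, skew, total});
  }
  return out;
}
```

With counts such as {8, 0, 0, 0} for a range, the row for that range would report a skew of 8 and a total of 8 replicas.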



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 02 Apr 2022 04:54:06 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 4: Code-Review+1

(2 comments)

Overall this looks good to go! Thanks for the great contribution!

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/rebalance/rebalance_algo-test.cc
File src/kudu/rebalance/rebalance_algo-test.cc:

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/rebalance/rebalance_algo-test.cc@998
PS4, Line 998: TEST(RebalanceAlgoUnitTest, FewMovesSameTableRanges) {
nit: consider adding some cases with a mix of tablets, some with ranges and others without


http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc@602
PS4, Line 602:           DataTable skew({ "UUID", "Server address", "Replica Count" });
I think the output here makes a lot of sense, especially to dig into the specifics of each range.

Curious if you think it's worth adding a similar skew table that summarizes the results? I think it could be helpful as a summary, though I worry that showing both by default might be too verbose.



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 02 Apr 2022 01:47:13 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 4: Verified+1

(1 comment)

Unrelated test failures:

* DisableWriteWhenExceedingQuotaTest.TestDisableWritePrivilegeWhenExceedingSizeQuota
* TimestampAdvancementITest.TestUpgradeFromOlderCorruptedData
* LocationAwareRebalancingParamTest.RebalanceWithIgnoredTServer/idx4_rf3_t3_l_3_2_1
  an issue with freeing memory triggered by OPENSSL_cleanup:
*** SIGSEGV (@0x0) received by PID 14159 (TID 0x7f68195328c0) from PID 0; stack trace: ***
    @     0x7f682100c980 (unknown) at ??:0
    @     0x7f6820485dc3 tcmalloc::ThreadCache::ReleaseToCentralCache() at ??:0
    @     0x7f68204860f7 tcmalloc::ThreadCache::Scavenge() at ??:0
    @     0x7f681af23f01 OPENSSL_LH_free at ??:0
    @     0x7f681af0293d (unknown) at ??:0
    @     0x7f681af21aa0 OPENSSL_cleanup at ??:0
    @     0x7f681eda4161 (unknown) at ??:0
    @     0x7f681eda425a exit at ??:0
    @     0x7f681ed82bfe __libc_start_main at ??:0
    @     0x56183107271a _start at ??:0

http://gerrit.cloudera.org:8080/#/c/18294/3/src/kudu/rebalance/rebalance_algo.cc
File src/kudu/rebalance/rebalance_algo.cc:

http://gerrit.cloudera.org:8080/#/c/18294/3/src/kudu/rebalance/rebalance_algo.cc@123
PS3, Line 123:   return out;
> warning: control reaches end of non-void function [clang-diagnostic-return-
Done



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 01 Apr 2022 18:41:08 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................

[tools] range rebalancing for 'kudu cluster rebalance'

This patch adds range rebalancing functionality to the
'kudu cluster rebalance' CLI tool.  The implementation is rather an
MVP: range rebalancing can currently be performed only for a single
table per run.  As far as I can see, there is room for improvement
since it's possible to perform range-aware replica movements even
during standard whole cluster rebalancing.

Below are two snapshots of the distribution of range-specific tablet
replicas in a cluster.  They were produced by running the tool with
the extra --report_only --output_replica_distribution_details flags
before and after range rebalancing for a single table:

  kudu cluster rebalance \
     --enable_range_rebalancing \
     --tables=default.loadgen_auto_6800f4ec4e164b2b8e42db7b5044df09 \
     127.0.0.1:8765

before:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Number of tablet replicas at servers for each range
 Max Skew | Total Count |   Range Start Key
----------+-------------+----------------------
 8        | 8           |
 8        | 8           | ff80000000000001fff4
 8        | 8           | ff80000000000003ffe8
 8        | 8           | ff80000000000005ffdc

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 8
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 8
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 8
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 8

after:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Number of tablet replicas at servers for each range
 Max Skew | Total Count |   Range Start Key
----------+-------------+----------------------
 0        | 8           |
 0        | 8           | ff80000000000001fff4
 0        | 8           | ff80000000000003ffe8
 0        | 8           | ff80000000000005ffdc

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Reviewed-on: http://gerrit.cloudera.org:8080/18294
Tested-by: Alexey Serbin <al...@apache.org>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/master/auto_rebalancer.cc
M src/kudu/rebalance/rebalance-test.cc
M src/kudu/rebalance/rebalance_algo-test.cc
M src/kudu/rebalance/rebalance_algo.cc
M src/kudu/rebalance/rebalance_algo.h
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_cluster.cc
10 files changed, 1,233 insertions(+), 329 deletions(-)

Approvals:
  Alexey Serbin: Verified
  Andrew Wong: Looks good to me, approved

-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 8
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc@602
PS4, Line 602:           DataTable skew({ "UUID", "Server address", "Replica Count" });
> Right, but it only shows the details for a single table, IIUC.
Ah, that's a good point.

Done.



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 02 Apr 2022 18:13:56 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18294/7/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/18294/7/src/kudu/tools/rebalancer_tool.cc@614
PS7, Line 614:         string prev_table_id;
> Given the limitation that the rebalancer in this mode can only run on a sin
Exactly -- that's future-proofing, since I'm planning to iterate on this.  The idea is to provide the functionality of a minimum viable product for now, and eventually add the ability to perform range rebalancing across all the tables in the cluster.
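
For illustration only, one common way a variable like prev_table_id is used: while iterating over per-range entries sorted by table, a "Table:" header is emitted whenever the table identifier changes, so the same loop works for one table or for many.  This is an assumed reading of the pattern, not the code at the line under review; RangeEntry and FormatByTable are hypothetical names, and table identifiers are assumed to be non-empty.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical input row: a range that belongs to some table.
struct RangeEntry {
  std::string table_id;
  std::string range_start_key;
};

// Hypothetical helper: expects entries sorted by table_id (non-empty ids).
// Emits a "Table:" header once per table, followed by that table's ranges.
std::string FormatByTable(const std::vector<RangeEntry>& entries) {
  std::ostringstream out;
  std::string prev_table_id;
  for (const auto& e : entries) {
    if (e.table_id != prev_table_id) {
      // A new table starts here: print its header and remember its id.
      out << "Table: " << e.table_id << "\n";
      prev_table_id = e.table_id;
    }
    out << "Range start key: '" << e.range_start_key << "'\n";
  }
  return out.str();
}
```

With a single table in the input, the header is printed exactly once; supporting several tables later would not require changing the loop.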



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 04 Apr 2022 18:21:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18294

to look at the new patch set (#5).

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................

[tools] range rebalancing for 'kudu cluster rebalance'

This patch adds range rebalancing functionality to the
'kudu cluster rebalance' CLI tool.  The implementation is rather an
MVP: range rebalancing can currently be performed only for a single
table per run.  As far as I can see, there is room for improvement
since it's possible to perform range-aware replica movements even
during standard whole cluster rebalancing.

Below are two snapshots of the distribution of range-specific tablet
replicas in a cluster.  They were produced by running the tool with
the extra --report_only --output_replica_distribution_details flags
before and after range rebalancing for a single table:

  kudu cluster rebalance \
     --enable_range_rebalancing \
     --tables=default.loadgen_auto_6800f4ec4e164b2b8e42db7b5044df09 \
     127.0.0.1:8765

before:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 8
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 8
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 8
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 8

after:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
---
M src/kudu/master/auto_rebalancer.cc
M src/kudu/rebalance/rebalance-test.cc
M src/kudu/rebalance/rebalance_algo-test.cc
M src/kudu/rebalance/rebalance_algo.cc
M src/kudu/rebalance/rebalance_algo.h
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_cluster.cc
10 files changed, 1,195 insertions(+), 329 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/18294/5
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18294

to look at the new patch set (#4).

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................

[tools] range rebalancing for 'kudu cluster rebalance'

This patch adds range rebalancing functionality to the
'kudu cluster rebalance' CLI tool.  The implementation is rather an
MVP: range rebalancing can currently be performed only for a single
table per run.  As far as I can see, there is room for improvement
since it's possible to perform range-aware replica movements even
during standard whole cluster rebalancing.

Below are two snapshots of the distribution of range-specific tablet
replicas in a cluster.  They were produced by running the tool with
the extra --report_only --output_replica_distribution_details flags
before and after range rebalancing for a single table:

  kudu cluster rebalance \
     --enable_range_rebalancing \
     --tables=default.loadgen_auto_6800f4ec4e164b2b8e42db7b5044df09 \
     127.0.0.1:8765

before:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 8
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 8
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 8
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 8

after:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
---
M src/kudu/master/auto_rebalancer.cc
M src/kudu/rebalance/rebalance-test.cc
M src/kudu/rebalance/rebalance_algo-test.cc
M src/kudu/rebalance/rebalance_algo.cc
M src/kudu/rebalance/rebalance_algo.h
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_cluster.cc
10 files changed, 1,181 insertions(+), 329 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/18294/4
-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc@602
PS4, Line 602:           DataTable skew({ "UUID", "Server address", "Replica Count" });
> I think the output here makes a lot of sense, especially to dig into the sp
Actually, I realize now that my suggestion doesn't apply, since this mode is limited to running on a single table.

That said, perhaps it's worth adding the ranges to the skew table below.



-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 02 Apr 2022 01:50:57 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18294

to look at the new patch set (#6).

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................

[tools] range rebalancing for 'kudu cluster rebalance'

This patch adds range rebalancing functionality to the
'kudu cluster rebalance' CLI tool.  The implementation is essentially
an MVP: range rebalancing can currently be performed only for a single
table per run.  As far as I can see, there is room for improvement,
since it's possible to perform range-aware replica movements even
during standard whole-cluster rebalancing.

Below are two snapshots of the distribution of range-specific tablet
replicas in a cluster.  They were produced by running the tool with
the extra --report_only --output_replica_distribution_details flags
before and after range rebalancing for a single table:

  kudu cluster rebalance \
     --enable_range_rebalancing \
     --tables=default.loadgen_auto_6800f4ec4e164b2b8e42db7b5044df09 \
     127.0.0.1:8765

before:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Number of tablet replicas at servers for each range
 Max Skew | Total Count |   Range Start Key
----------+-------------+----------------------
 8        | 8           |
 8        | 8           | ff80000000000001fff4
 8        | 8           | ff80000000000003ffe8
 8        | 8           | ff80000000000005ffdc

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 8
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 8
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 8
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 8

after:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Number of tablet replicas at servers for each range
 Max Skew | Total Count |   Range Start Key
----------+-------------+----------------------
 0        | 8           |
 0        | 8           | ff80000000000001fff4
 0        | 8           | ff80000000000003ffe8
 0        | 8           | ff80000000000005ffdc

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2
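
The 'Max Skew' and 'Total Count' columns in the per-range summary
tables are consistent with taking, for each range, the difference
between the largest and smallest per-server replica counts and the
sum of those counts.  The following is only a minimal sketch of that
arithmetic, not the code in this patch: the helper names MaxSkew and
TotalCount are hypothetical, and the definition of skew as max minus
min is an assumption inferred from the numbers shown above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of the patch): skew of a single range,
// assumed here to be the difference between the most and the least
// loaded tablet servers for that range.
int32_t MaxSkew(const std::vector<int32_t>& replica_counts) {
  if (replica_counts.empty()) {
    return 0;
  }
  const auto min_max =
      std::minmax_element(replica_counts.begin(), replica_counts.end());
  assert(*min_max.first >= 0);  // replica counts are never negative
  return *min_max.second - *min_max.first;
}

// Hypothetical helper: total number of replicas of a single range
// across all tablet servers.
int32_t TotalCount(const std::vector<int32_t>& replica_counts) {
  return std::accumulate(replica_counts.begin(), replica_counts.end(),
                         int32_t{0});
}
```

With the 'before' counts for a range (8, 0, 0, 0) this gives a skew
of 8 and a total of 8; with the 'after' counts (2, 2, 2, 2) it gives
a skew of 0 and the same total of 8.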

Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
---
M src/kudu/master/auto_rebalancer.cc
M src/kudu/rebalance/rebalance-test.cc
M src/kudu/rebalance/rebalance_algo-test.cc
M src/kudu/rebalance/rebalance_algo.cc
M src/kudu/rebalance/rebalance_algo.h
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_cluster.cc
10 files changed, 1,232 insertions(+), 329 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/18294/6
-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18294

to look at the new patch set (#7).

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................

[tools] range rebalancing for 'kudu cluster rebalance'

This patch adds range rebalancing functionality to the
'kudu cluster rebalance' CLI tool.  The implementation is essentially
an MVP: range rebalancing can currently be performed only for a single
table per run.  As far as I can see, there is room for improvement,
since it's possible to perform range-aware replica movements even
during standard whole-cluster rebalancing.

Below are two snapshots of the distribution of range-specific tablet
replicas in a cluster.  They were produced by running the tool with
the extra --report_only --output_replica_distribution_details flags
before and after range rebalancing for a single table:

  kudu cluster rebalance \
     --enable_range_rebalancing \
     --tables=default.loadgen_auto_6800f4ec4e164b2b8e42db7b5044df09 \
     127.0.0.1:8765

before:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Number of tablet replicas at servers for each range
 Max Skew | Total Count |   Range Start Key
----------+-------------+----------------------
 8        | 8           |
 8        | 8           | ff80000000000001fff4
 8        | 8           | ff80000000000003ffe8
 8        | 8           | ff80000000000005ffdc

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 8
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 8
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 8
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 8

after:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Number of tablet replicas at servers for each range
 Max Skew | Total Count |   Range Start Key
----------+-------------+----------------------
 0        | 8           |
 0        | 8           | ff80000000000001fff4
 0        | 8           | ff80000000000003ffe8
 0        | 8           | ff80000000000005ffdc

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
---
M src/kudu/master/auto_rebalancer.cc
M src/kudu/rebalance/rebalance-test.cc
M src/kudu/rebalance/rebalance_algo-test.cc
M src/kudu/rebalance/rebalance_algo.cc
M src/kudu/rebalance/rebalance_algo.h
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_cluster.cc
10 files changed, 1,233 insertions(+), 329 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/18294/7
-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18294

to look at the new patch set (#3).

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................

[tools] range rebalancing for 'kudu cluster rebalance'

This patch adds range rebalancing functionality to the
'kudu cluster rebalance' CLI tool.  The implementation is essentially
an MVP: range rebalancing can currently be performed only for a single
table per run.  As far as I can see, there is room for improvement,
since it's possible to perform range-aware replica movements even
during standard whole-cluster rebalancing.

Below are two snapshots of the distribution of range-specific tablet
replicas in a cluster.  They were produced by running the tool with
the extra --report_only --output_replica_distribution_details flags
before and after range rebalancing for a single table:

  kudu cluster rebalance \
     --enable_range_rebalancing \
     --tables=default.loadgen_auto_6800f4ec4e164b2b8e42db7b5044df09 \
     127.0.0.1:8765

before:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 8
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 8
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 8
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 0

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 0
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 0
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 0
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 8

after:
========================================================================

Table: abb2bbf8b4ff4bc0989bc82c78d4ae2b

Range start key: ''
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000001fff4'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000003ffe8'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Range start key: 'ff80000000000005ffdc'
               UUID               | Server address | Replica Count
----------------------------------+----------------+---------------
 15a8d0fef42c4da2bd5d9e1c5a2de301 | 127.0.0.1:9870 | 2
 3243029e0db04680a2653c6acc048813 | 127.0.0.1:9876 | 2
 324ef10666b14ab9bb61e775fa351ad6 | 127.0.0.1:9872 | 2
 a5c6f822f5cc4645bbb4a14874e311d4 | 127.0.0.1:9874 | 2

Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
---
M src/kudu/master/auto_rebalancer.cc
M src/kudu/rebalance/rebalance-test.cc
M src/kudu/rebalance/rebalance_algo-test.cc
M src/kudu/rebalance/rebalance_algo.cc
M src/kudu/rebalance/rebalance_algo.h
M src/kudu/rebalance/rebalancer.cc
M src/kudu/rebalance/rebalancer.h
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
M src/kudu/tools/tool_action_cluster.cc
12 files changed, 1,176 insertions(+), 325 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/18294/3
-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed a vote on this change.

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed a vote on this change.

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/rebalance/rebalance_algo-test.cc
File src/kudu/rebalance/rebalance_algo-test.cc:

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/rebalance/rebalance_algo-test.cc@998
PS4, Line 998: TEST(RebalanceAlgoUnitTest, FewMovesSameTableRanges) {
> nit: consider adding some cases for when there's a mix of both having some 
Done


http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/18294/4/src/kudu/tools/rebalancer_tool.cc@602
PS4, Line 602:           DataTable skew({ "UUID", "Server address", "Replica Count" });
> Actually I realize now that my suggestion doesn't apply, since the limitati
There is a table per range -- you can see how it looks in the description of this changelist.

Does that make sense, or did you mean adding the ranges to the table in some other way?



-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 02 Apr 2022 04:05:18 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 7: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18294/7/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/18294/7/src/kudu/tools/rebalancer_tool.cc@614
PS7, Line 614:         string prev_table_id;
Given the limitation that the rebalancer in this mode can only run on a single table, does the prev_table_id actually get used, or is this more future proofing?
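
For context, a variable like prev_table_id is commonly used to detect group boundaries while iterating over entries sorted by table. The sketch below only illustrates that general pattern under stated assumptions; it is not the code at line 614, and the Row type and FormatRows function are hypothetical names introduced here.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical row: (table id, range start key), assumed to be sorted
// by table id so that all rows of one table are adjacent.
using Row = std::pair<std::string, std::string>;

// Returns the report lines for the given rows.  A "Table: <id>" header
// is emitted only when the table id differs from the previous row's.
std::vector<std::string> FormatRows(const std::vector<Row>& rows) {
  std::vector<std::string> lines;
  std::string prev_table_id;  // empty until the first row is seen
  for (const auto& row : rows) {
    assert(!row.first.empty());  // this sketch assumes non-empty table ids
    if (row.first != prev_table_id) {
      prev_table_id = row.first;
      lines.push_back("Table: " + row.first);
    }
    lines.push_back("Range start key: '" + row.second + "'");
  }
  return lines;
}
```

With a single table, the header condition is true only for the first row, so in this sketch the variable mainly matters once more than one table is processed in a run.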



-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 04 Apr 2022 18:07:06 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] range rebalancing for 'kudu cluster rebalance'

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/18294 )

Change subject: [tools] range rebalancing for 'kudu cluster rebalance'
......................................................................


Patch Set 7: Verified+1

Unrelated failure in LocationAwareRebalancingParamTest.RebalanceWithIgnoredTServer/idx5_rf5_t4_l_3_2_3


-- 
To view, visit http://gerrit.cloudera.org:8080/18294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d2e19266e993f5e2ae13ba18d323c83db30eac1
Gerrit-Change-Number: 18294
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Alexey Serbin <al...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 02 Apr 2022 21:50:30 +0000
Gerrit-HasComments: No