You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2018/05/07 23:41:13 UTC

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10336


Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................

WIP [rebalancing] high-level rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and unit corresponding unit tests.

The algorithm operates using high-level entities such as table and
replica (no tablets exists in that view of the world).

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
WIP: just collecting feedback, need to add more unit tests.
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance-algo-test.cc
A src/kudu/tools/rebalance-algo.cc
A src/kudu/tools/rebalance-algo.h
4 files changed, 851 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [rebalancing] add a rebalancing algorithm

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] add a rebalancing algorithm
......................................................................


Patch Set 19:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h
File src/kudu/tools/rebalance_algo.h:

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@148
PS18, Line 148:   Status GetIntersection(
              :       ExtremumType extremum,
              :       const ServersByCountMap& servers_by_table_replica_count,
              :       const ServersByCountMap& servers_by_tota
> After some consideration, I decided to leave this as is for now.
Ack


http://gerrit.cloudera.org:8080/#/c/10336/19/src/kudu/tools/rebalance_algo.cc
File src/kudu/tools/rebalance_algo.cc:

http://gerrit.cloudera.org:8080/#/c/10336/19/src/kudu/tools/rebalance_algo.cc@88
PS19, Line 88: DCHECK(found_src && found_dst);
since you're handling this case now, maybe remove this?


http://gerrit.cloudera.org:8080/#/c/10336/19/src/kudu/tools/rebalance_algo.cc@117
PS19, Line 117: Value of '0' is a shortcut for 'the possible maximum'.
do we need this? if so, please doc this



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 19
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Jun 2018 02:44:28 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................


Patch Set 3:

(56 comments)

Lots of comments but it's a lot of nits. Overall looks very nice I think.

http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc
File src/kudu/tools/rebalance-algo-test.cc:

http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@194
PS3, Line 194: {
             :       { "0", "1", },
             :       {
             :         { "A", { 1, 1, } },
             :         { "B", { 1, 2, } },
             :       },
             :     },
Missing a comment on this one. It seems unnecessary given the test case immediately below.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@221
PS3, Line 221: and
nit: Extra word.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@222
PS3, Line 222: at
nit: on.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@262
PS3, Line 262: to make
nit: extra


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@262
PS3, Line 262:  
nit: ,


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@263
PS3, Line 263: achive
nit: achieve.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@263
PS3, Line 263: balanced state
nit: balance.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@264
PS3, Line 264: Tablet
nit: Table.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@272
PS3, Line 272: "B"
It'd be better if it weren't deterministic which table it uses for the move.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@282
PS3, Line 282: { { "D", "1", "0" }, }
Aside: we'll have to think about single-replica tables. Does 3-4-3 (1-2-1?) work in that case? 3-2-3 doesn't.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@321
PS3, Line 321: {
             :       { "0", "1", },
             :       {
             :         { "A", { 1, 2, } },
             :         { "B", { 2, 0, } },
             :         { "C", { 2, 1, } },
             :       },
             :       {
             :         { "B", "0", "1" },
             :       }
             :     },
0 has 5 total replicas; 1 has 3, so it doesn't fit in this test.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@483
PS3, Line 483: 
Coverage is good but we could also use a randomized test. I could add it in a separate patch.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h
File src/kudu/tools/rebalance-algo.h:

http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@33
PS3, Line 33: Information on t
nit: Howabout just "Balance information for a table."


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@34
PS3, Line 34: TableReplicasInfo
nit: I like TableBalanceInfo. It's parallel with ClusterBalanceInfo below.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@35
PS3, Line 35: // Table's unique identifier.
nit: I think this comment isn't needed.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@38
PS3, Line 38: Total number of tablet replicas of the table per tablet server.
nit: It's the reverse of this relationship.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@41
PS3, Line 41: size_t
nit: The GSG's rule is to use signed integer types for numbers. See https://google.github.io/styleguide/cppguide.html#Integer_Types. Might be best to change since there's no prospect of the replica count overflowing int32_t and it might save us from some bad arithmetic at some point.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@41
PS3, Line 41: table_replica_num_by_server
This name's confusing because it's the reverse of the actual relation, e.g. imagine seeing

    auto range = tri.table_replica_num_by_server.equal_range(6);

Howabout replica_count_to_servers or servers_by_replica_count, which is the correct direction for the relationship and captures that there could be multiples servers per replica count?


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@48
PS3, Line 48: replicas_info
The values in this map represent tables, so I think a name like table_balance_by_skew or table_info_by_skew is best.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@48
PS3, Line 48: size_t
nit: Same.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@50
PS3, Line 50: Total number of replicas of all tables per tablet server.
nit: Same as comment on L38.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@53
PS3, Line 53: size_t
nit: Same.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@53
PS3, Line 53: replica_num_by_server
nit: Same naming nit as L41.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@55
PS3, Line 55: ToString
Would you mind putting a comment here explaining what the string looks like? I'm not sure from the fields what I expect the string to be. Is it multiline? Is it a table?


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@58
PS3, Line 58: the
nit: Extra "the".


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@58
PS3, Line 58: information information
Repeated word.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@59
PS3, Line 59: The algorithm does not
            : // differentiate such entities as tablets
After our conversation yesterday I understand what you mean here, but I think the comment is confusingly because it sounds like rebalancing doesn't care about tablets, which can't be true because a tablet is a constraint on possible moves. Instead, I suggest saying that the algorithm represents a move for a table by the source and destination tablet servers, and, IIUC, the "move engine" translates that into an actual move that respects the constraint that no tablet server can have more than one replica of a given tablet.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@64
PS3, Line 64: // Unique identifier of the table.
I think this comment is redundant.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@65
PS3, Line 65: Unique identifier
nit: I'd say UUID instead of "unique identifier" because I "UUID" always means the canonical UUID, and not saying UUID made me think for a second that we were going to use some other sort of unique identifier.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@66
PS3, Line 66: Unique identifier
nit: Same as above.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@79
PS3, Line 79: The method puts the results into the 'moves' output parameter
How would a caller know when to stop calling GetNextMoves? Is there a guarantee about what happens if the cluster is balanced? Does the algo have a notion of balance built in or is it meant to be defined by subclasses?


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@79
PS3, Line 79: into the 'moves' output parameter
nit: Can you make it clear that 'moves' is cleared and filled with the results? I like the verb "populate" for those semantics, as opposed to "add" or "append" if the output parameter is appended to.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@81
PS3, Line 81: size_t
nit: Same as L41.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@90
PS3, Line 90:  
nit: Insert "an".


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc
File src/kudu/tools/rebalance-algo.cc:

http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@55
PS3, Line 55: replica_num_by_server
nit: Remember to change this if/when you change the name of the struct fields.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@57
PS3, Line 57: s << " " << elem.first << ":" << elem.second;
If this is for debug purposes only then it's fine however it's easiest for you to use. But it might be helpful to have a more user-friendly ToString or PrintTo method that produces a nice table showing the server -> count and skew relationships, for each table, or specific tables. I'm sure something like that will be worthwhile to show initial, final, and progress status in the tool.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@60
PS3, Line 60: replicas_info
Likewise.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@77
PS3, Line 77: ClusterBalanceInfo info(cluster_info);
nit: Can you add a comment explaining why we copy? IIUC, it's so we can generate the ClusterBalanceInfo for picking move n from the one for move n - 1 by apply move n - 1.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@95
PS3, Line 95: {
nit: Add a comment explaining the purpose of this block.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@101
PS3, Line 101: auto it = replica_num_by_server.begin()
Maybe we need to keep both the mapping ts uuid -> count and count -> ts uuid? I think we could write a small class to encapsulate keeping both mappings and keeping them synchronized, and it would simplify the code.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@115
PS3, Line 115: it = replica_num_by_server.erase(it);
This effectively advances the iterator, right?


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@133
PS3, Line 133: auto it = replicas_info.begin(); it != replicas_info.end()
Likewise I think we could use a "two-way mapping" class  to unify much of this code and the code below with the code above.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@175
PS3, Line 175: instert
nit: extra 't'.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@181
PS3, Line 181: DCHECK_GE(max_count, min_count)
Since table_info.table_replica_num_by_server is a multimap, isn't this just testing the STL multimap's ordering guarantee?


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@190
PS3, Line 190: TwoDimensionalGreedyAlgo::GetNextMove
Would you add a short comment on the different major parts of the algorithm, e.g. "Computing the intersection of the most loaded tablet servers for this table with the most loaded tablet server for the cluster, so, if possible, we can pick a move that helps balances the table and the cluster."


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@204
PS3, Line 204: 2
nit: I think this could be 1. 1 is typical in tools and this should only fire a few times per round, at most, right?


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@210
PS3, Line 210: 3
Could be 1 I think.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@211
PS3, Line 211: 3
Same.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@213
PS3, Line 213: // The distribution of replicas for this table is already balanced. However,
             :     // it's necessary to check for the opportunity to balance the replicas
             :     // cluster-wise.
I think the code is right here but the comment is wrong/stale. The table is balanced iff its skew is <= 1. If it's = 1 we have a potential opportunity to balance cluster-wise with table-skew-neutral moves. If it's = 0 we don't, so we return.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@224
PS3, Line 224: DCHECK(!max_loaded.empty());
Maybe we ought to check if cnt_by_server is empty earlier.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@227
PS3, Line 227: !max_loaded.empty()
This is unnecessary given the DCHECK.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@236
PS3, Line 236: sort(max_loaded.begin(), max_loaded.end());
             :     sort(max_loaded_cluster_wise.begin(), max_loaded_cluster_wise.end());
             :     set_intersection(
             :         max_loaded.begin(), max_loaded.end(),
             :         max_loaded_cluster_wise.begin(), max_loaded_cluster_wise.end(),
             :         back_inserter(max_loaded_intersection));
We can probably separate this out as a function or something since it's re-used on L261 below. I think it'd make the logic here clearer overall.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@249
PS3, Line 249: DCHECK(!min_loaded.empty());
Likewise seems unnecessary if we check cnt_by_server is nonempty.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@252
PS3, Line 252: !min_loaded.empty()
Redundant.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@277
PS3, Line 277: 3
1


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@280
PS3, Line 280: 3
1



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Tue, 08 May 2018 18:42:37 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#7).

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................

WIP [rebalancing] high-level rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests.

The algorithm operates using high-level entities such as table and
replica (no tablets exist in this view of the world).

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
WIP: just collecting feedback, need to add more unit tests.
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance-algo-test.cc
A src/kudu/tools/rebalance-algo.cc
A src/kudu/tools/rebalance-algo.h
4 files changed, 1,194 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/7
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Mike Percy, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#13).

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................

[rebalancing] Add a rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests. It contains both an abstraction for
rebalancing algorithms in general and an implementation of a
greedy algorithm that rebalances a cluster by trying to equalize
the number of replicas per tablet server for each table and for the
cluster as a whole.

The algorithm just identifies what moves should be done given an
arrangment of a cluster. The 'engine' that performs moves will be added
in a subsequent changelist.

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance_algo-test.cc
A src/kudu/tools/rebalance_algo.cc
A src/kudu/tools/rebalance_algo.h
4 files changed, 1,189 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/13
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 13
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] add a rebalancing algorithm
......................................................................

[rebalancing] add a rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests. It contains both an abstraction for
rebalancing algorithms in general and an implementation of a
greedy algorithm that rebalances a cluster by trying to equalize
the number of replicas per tablet server for each table and for the
cluster as a whole.

The algorithm just identifies what moves should be done given an
arrangement of a cluster. The 'engine' that performs moves
will be added in a subsequent changelist.

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Reviewed-on: http://gerrit.cloudera.org:8080/10336
Reviewed-by: Mike Percy <mp...@apache.org>
Reviewed-by: Will Berkeley <wd...@gmail.com>
Tested-by: Kudu Jenkins
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance_algo-test.cc
A src/kudu/tools/rebalance_algo.cc
A src/kudu/tools/rebalance_algo.h
4 files changed, 1,213 insertions(+), 0 deletions(-)

Approvals:
  Mike Percy: Looks good to me, but someone else must approve
  Will Berkeley: Looks good to me, approved
  Kudu Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 21
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] add a rebalancing algorithm

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] add a rebalancing algorithm
......................................................................


Patch Set 20: Code-Review+1

LGTM, not sure if Will wants to take another look or not.


-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 20
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Jun 2018 03:18:53 +0000
Gerrit-HasComments: No

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#4).

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................

WIP [rebalancing] high-level rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests.

The algorithm operates using high-level entities such as table and
replica (no tablets exist in this view of the world).

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
WIP: just collecting feedback, need to add more unit tests.
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance-algo-test.cc
A src/kudu/tools/rebalance-algo.cc
A src/kudu/tools/rebalance-algo.h
4 files changed, 1,007 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/4
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Mike Percy, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#19).

Change subject: [rebalancing] add a rebalancing algorithm
......................................................................

[rebalancing] add a rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests. It contains both an abstraction for
rebalancing algorithms in general and an implementation of a
greedy algorithm that rebalances a cluster by trying to equalize
the number of replicas per tablet server for each table and for the
cluster as a whole.

The algorithm just identifies what moves should be done given an
arrangement of a cluster. The 'engine' that performs moves
will be added in a subsequent changelist.

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance_algo-test.cc
A src/kudu/tools/rebalance_algo.cc
A src/kudu/tools/rebalance_algo.h
4 files changed, 1,213 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/19
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 19
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Mike Percy, Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#20).

Change subject: [rebalancing] add a rebalancing algorithm
......................................................................

[rebalancing] add a rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests. It contains both an abstraction for
rebalancing algorithms in general and an implementation of a
greedy algorithm that rebalances a cluster by trying to equalize
the number of replicas per tablet server for each table and for the
cluster as a whole.

The algorithm just identifies what moves should be done given an
arrangement of a cluster. The 'engine' that performs moves
will be added in a subsequent changelist.

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance_algo-test.cc
A src/kudu/tools/rebalance_algo.cc
A src/kudu/tools/rebalance_algo.h
4 files changed, 1,213 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/20
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 20
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/10336 )

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has uploaded a new patch set (#10) to the change originally created by Alexey Serbin. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................

[rebalancing] Add a rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests. It contains both an abstraction for
rebalancing algorithms in general and an implementation of a
greedy algorithm that rebalances a cluster by trying to equalize
the number of replicas per tablet server for each table and for the
cluster as a whole.

The algorithm just identifies what moves should be done given an
arrangment of a cluster. The 'engine' that performs moves will be added
in a subsequent changelist.

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance_algo-test.cc
A src/kudu/tools/rebalance_algo.cc
A src/kudu/tools/rebalance_algo.h
4 files changed, 1,162 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/10
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 10
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Patch Set 11: Code-Review+1

LGTM. For the record, the PICK_RANDOM option is exercised in the follow-up to add a randomized test (https://gerrit.cloudera.org/#/c/10453/3).


-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 11
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Tue, 22 May 2018 06:51:51 +0000
Gerrit-HasComments: No

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#5).

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................

WIP [rebalancing] high-level rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests.

The algorithm operates using high-level entities such as table and
replica (no tablets exist in this view of the world).

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
WIP: just collecting feedback, need to add more unit tests.
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance-algo-test.cc
A src/kudu/tools/rebalance-algo.cc
A src/kudu/tools/rebalance-algo.h
4 files changed, 1,008 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/5
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Patch Set 18:

(11 comments)

Looks good, I just have some small nits for feedback.

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc
File src/kudu/tools/rebalance_algo-test.cc:

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc@67
PS18, Line 67: per
nit: s/per/by/

add comment: // Number of replicas of this table on each server in the cluster


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc@78
PS18, Line 78: table_replicas.{front(),...,back()}.size
I'm confused by this syntax, maybe say something like: for t in table_replicas: assert(num_replicas_by_server.size() == tserver_uuids.size())


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc@550
PS18, Line 550:         { "B", "0", "2" },
just wondering, in this case, why do we move B before moving A again? They have equal skew here at step 2.


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc@626
PS18, Line 626: 
nit: extra line


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h
File src/kudu/tools/rebalance_algo.h:

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@30
PS18, Line 30:  
nit: extra space


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@91
PS18, Line 91: size_t
nit: prefer int per GSG, see https://google.github.io/styleguide/cppguide.html#Integer_Types


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@143
PS18, Line 143:   // None of the output parameters may be NULL.
nit: document how to interpret 'intersection' coming back empty


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@148
PS18, Line 148:       int32_t* replica_count_table,
              :       int32_t* replica_count_total,
              :       std::vector<std::string>* server_uuids,
              :       std::vector<std::string>* intersection);
nit: instead of 4 out-params, can we have a single struct out-param instead? It would be easier to keep track of, I think.


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.cc
File src/kudu/tools/rebalance_algo.cc:

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.cc@89
PS18, Line 89:     return Status::NotFound("no count for replica", to);
It looks like this could leave the map in an inconsistent state if only one was erased? Should we add an assertion for that? i.e. DCHECK_EQ(0, found_from ^ found_to) or something


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.cc@105
PS18, Line 105: size_t
nit: int


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.cc@330
PS18, Line 330: stable_sort
nit: just wondering, how is std::stable_sort available as stable_sort when there is no "using std::stable_sort" in this file?



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 18
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 06 Jun 2018 23:16:51 +0000
Gerrit-HasComments: Yes

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Mike Percy, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#12).

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................

[rebalancing] Add a rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests. It contains both an abstraction for
rebalancing algorithms in general and an implementation of a
greedy algorithm that rebalances a cluster by trying to equalize
the number of replicas per tablet server for each table and for the
cluster as a whole.

The algorithm just identifies what moves should be done given an
arrangment of a cluster. The 'engine' that performs moves will be added
in a subsequent changelist.

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance_algo-test.cc
A src/kudu/tools/rebalance_algo.cc
A src/kudu/tools/rebalance_algo.h
4 files changed, 1,191 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/12
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 12
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................


Patch Set 3:

(55 comments)

Thank you for the thorough review!

http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc
File src/kudu/tools/rebalance-algo-test.cc:

http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@194
PS3, Line 194: {
             :       { "0", "1", },
             :       {
             :         { "A", { 1, 1, } },
             :         { "B", { 1, 2, } },
             :       },
             :     },
> Missing a comment on this one. It seems unnecessary given the test case imm
The case below has 0 cluster-wise skew, actually.  I added a comment.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@221
PS3, Line 221: and
> nit: Extra word.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@222
PS3, Line 222: at
> nit: on.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@262
PS3, Line 262:  
> nit: ,
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@263
PS3, Line 263: achive
> nit: achieve.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@263
PS3, Line 263: balanced state
> nit: balance.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@264
PS3, Line 264: Tablet
> nit: Table.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@272
PS3, Line 272: "B"
> It'd be better if it weren't deterministic which table it uses for the move
Yeah, that's the next step, I think.  The idea is to specify the moves in some order, but during the verification phase make it loose in terms of order if some additional flag is set for TestClusterConfig instance.  For first stages I want to be sure I understand how the algorithm works, so I'd like to keep the order checked as well.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@282
PS3, Line 282: { { "D", "1", "0" }, }
> Aside: we'll have to think about single-replica tables. Does 3-4-3 (1-2-1?)
Good question: AFAIK the 3-4-3 scheme should work for tables of replication factor 1 while moving replicas around.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@321
PS3, Line 321: {
             :       { "0", "1", },
             :       {
             :         { "A", { 1, 2, } },
             :         { "B", { 2, 0, } },
             :         { "C", { 2, 1, } },
             :       },
             :       {
             :         { "B", "0", "1" },
             :       }
             :     },
> 0 has 5 total replicas; 1 has 3, so it doesn't fit in this test.
Good catch!  Moved it into the appropriate scenario and updated the typo in this one to make it fit in here.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@483
PS3, Line 483: 
> Coverage is good but we could also use a randomized test. I could add it in
For the randomized test it's a bit harder to specify the sequence of moves to be in the necessary order (basically, it's about writing 'reference' balancing algorithm).  However, I think it's easy to compute the reference number of move operations and making sure the algorithm converges, outputting with a balanced result.  I think I can add that.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h
File src/kudu/tools/rebalance-algo.h:

http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@33
PS3, Line 33: Information on t
> nit: Howabout just "Balance information for a table."
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@34
PS3, Line 34: TableReplicasInfo
> nit: I like TableBalanceInfo. It's parallel with ClusterBalanceInfo below.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@35
PS3, Line 35: // Table's unique identifier.
> nit: I think this comment isn't needed.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@38
PS3, Line 38: Total number of tablet replicas of the table per tablet server.
> nit: It's the reverse of this relationship.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@41
PS3, Line 41: table_replica_num_by_server
> This name's confusing because it's the reverse of the actual relation, e.g.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@41
PS3, Line 41: size_t
> nit: The GSG's rule is to use signed integer types for numbers. See https:/
Yeah, GSG has some strange conventions by my understanding -- it seems it was written by java people. :)

But since we stick to it, sure -- let's use those.

As a side note, I think that using signed integers here would result in necessity to add checks for the non-negativity of the values, so instead of checks for bad arithmetic you need to add checks for wrong values.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@48
PS3, Line 48: replicas_info
> The values in this map represent tables, so I think a name like table_balan
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@48
PS3, Line 48: size_t
> nit: Same.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@50
PS3, Line 50: Total number of replicas of all tables per tablet server.
> nit: Same as comment on L38.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@53
PS3, Line 53: replica_num_by_server
> nit: Same naming nit as L41.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@53
PS3, Line 53: size_t
> nit: Same.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@55
PS3, Line 55: ToString
> Would you mind putting a comment here explaining what the string looks like
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@58
PS3, Line 58: the
> nit: Extra "the".
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@58
PS3, Line 58: information information
> Repeated word.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@59
PS3, Line 59: The algorithm does not
            : // differentiate such entities as tablets
> After our conversation yesterday I understand what you mean here, but I thi
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@64
PS3, Line 64: // Unique identifier of the table.
> I think this comment is redundant.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@65
PS3, Line 65: Unique identifier
> nit: I'd say UUID instead of "unique identifier" because I "UUID" always me
Yeah, that's right -- the algorithm can consume any form of unique identifiers.  Whether it's UUID or some other type of string identifiers, it does not matter (i.e. stringified numbers will do as well).  That said, I'd better keep 'unique identifier' instead of UUID.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@66
PS3, Line 66: Unique identifier
> nit: Same as above.
same


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@79
PS3, Line 79: The method puts the results into the 'moves' output parameter
> How would a caller know when to stop calling GetNextMoves? Is there a guara
I think the notation on the already-balanced configuration should be the same for all algorithms.  I added corresponding comment about that.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@79
PS3, Line 79: into the 'moves' output parameter
> nit: Can you make it clear that 'moves' is cleared and filled with the resu
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@81
PS3, Line 81: size_t
> nit: Same as L41.
I'd like to keep it as size_t to reflect the type of the std::vector<>::size_type parameter.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@90
PS3, Line 90:  
> nit: Insert "an".
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc
File src/kudu/tools/rebalance-algo.cc:

http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@55
PS3, Line 55: replica_num_by_server
> nit: Remember to change this if/when you change the name of the struct fiel
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@57
PS3, Line 57: s << " " << elem.first << ":" << elem.second;
> If this is for debug purposes only then it's fine however it's easiest for 
Yep, I consider this as a draft approach to output information on the replica balance information for a cluster.  Sure, that should be updated to be more comprehensive and pretty-printed.    Maybe, I will work on that during the next phase (i.e. while implementing the move engine based on ksck info in the kudu CLI tool).


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@60
PS3, Line 60: replicas_info
> Likewise.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@77
PS3, Line 77: ClusterBalanceInfo info(cluster_info);
> nit: Can you add a comment explaining why we copy? IIUC, it's so we can gen
Yep -- I changed the signature to pass the cluster_info parameter by value and added the comment.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@95
PS3, Line 95: {
> nit: Add a comment explaining the purpose of this block.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@101
PS3, Line 101: auto it = replica_num_by_server.begin()
> Maybe we need to keep both the mapping ts uuid -> count and count -> ts uui
Yep, that would be a good alternative as well.  I think iterating over the container like this is not a big deal since  the number of tablet servers should not be huge.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@115
PS3, Line 115: it = replica_num_by_server.erase(it);
> This effectively advances the iterator, right?
Yes, this erases the element and updates the iterator to point to the next element.  I added comment.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@133
PS3, Line 133: auto it = replicas_info.begin(); it != replicas_info.end()
> Likewise I think we could use a "two-way mapping" class  to unify much of t
Good idea.  I think I can come up with an update in the next version :)


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@175
PS3, Line 175: instert
> nit: extra 't'.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@181
PS3, Line 181: DCHECK_GE(max_count, min_count)
> Since table_info.table_replica_num_by_server is a multimap, isn't this just
Yep, that and also testing for typos, if any.  It's also important to do so because the skew in that version was of unsigned type.  Besides, one day the ordering of the elements might be changed because of update on the comparison operator for the container.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@190
PS3, Line 190: TwoDimensionalGreedyAlgo::GetNextMove
> Would you add a short comment on the different major parts of the algorithm
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@204
PS3, Line 204: 2
> nit: I think this could be 1. 1 is typical in tools and this should only fi
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@210
PS3, Line 210: 3
> Could be 1 I think.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@211
PS3, Line 211: 3
> Same.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@213
PS3, Line 213: // The distribution of replicas for this table is already balanced. However,
             :     // it's necessary to check for the opportunity to balance the replicas
             :     // cluster-wise.
> I think the code is right here but the comment is wrong/stale. The table is
Good catch!  Indeed, the comment was stale.


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@224
PS3, Line 224: DCHECK(!max_loaded.empty());
> Maybe we ought to check if cnt_by_server is empty earlier.
It's a great idea!


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@227
PS3, Line 227: !max_loaded.empty()
> This is unnecessary given the DCHECK.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@236
PS3, Line 236: sort(max_loaded.begin(), max_loaded.end());
             :     sort(max_loaded_cluster_wise.begin(), max_loaded_cluster_wise.end());
             :     set_intersection(
             :         max_loaded.begin(), max_loaded.end(),
             :         max_loaded_cluster_wise.begin(), max_loaded_cluster_wise.end(),
             :         back_inserter(max_loaded_intersection));
> We can probably separate this out as a function or something since it's re-
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@249
PS3, Line 249: DCHECK(!min_loaded.empty());
> Likewise seems unnecessary if we check cnt_by_server is nonempty.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@252
PS3, Line 252: !min_loaded.empty()
> Redundant.
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@277
PS3, Line 277: 3
> 1
Done


http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@280
PS3, Line 280: 3
> 1
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 10 May 2018 03:27:28 +0000
Gerrit-HasComments: Yes

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Patch Set 9:

Ah, just noticed that to fit with the current convention we should use _ to separate words in source file names, except when separating words identifying the file as a test file:

rebalance_algo-test.cc
rebalance_algo.h
rebalance_algo.cc


-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 9
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 18 May 2018 18:54:09 +0000
Gerrit-HasComments: No

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has uploaded a new patch set (#9) to the change originally created by Alexey Serbin. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................

[rebalancing] Add a rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests. It contains both an abstraction for
rebalancing algorithms in general and an implementation of a
greedy algorithm that rebalances a cluster by trying to equalize
the number of replicas per tablet server for each table and for the
cluster as a whole.

The algorithm just identifies what moves should be done given an
arrangment of a cluster. The 'engine' that performs moves will be added
in a subsequent changelist.

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance-algo-test.cc
A src/kudu/tools/rebalance-algo.cc
A src/kudu/tools/rebalance-algo.h
4 files changed, 1,162 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/9
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 9
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#2).

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................

WIP [rebalancing] high-level rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests.

The algorithm operates using high-level entities such as table and
replica (no tablets exist in this view of the world).

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
WIP: just collecting feedback, need to add more unit tests.
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance-algo-test.cc
A src/kudu/tools/rebalance-algo.cc
A src/kudu/tools/rebalance-algo.h
4 files changed, 904 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/2
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................


Patch Set 3:

(2 comments)

FYI Alexey I am revising this patch while you work on the "engine" part, as we spoke about.

http://gerrit.cloudera.org:8080/#/c/10336/8/src/kudu/tools/rebalance-algo.cc
File src/kudu/tools/rebalance-algo.cc:

http://gerrit.cloudera.org:8080/#/c/10336/8/src/kudu/tools/rebalance-algo.cc@217
PS8, Line 217: }
This isn't the max server skew, it's the number of replicas on the most loaded server.


http://gerrit.cloudera.org:8080/#/c/10336/8/src/kudu/tools/rebalance-algo.cc@244
PS8, Line 244: auto min_range = cnt_by_server.equal_r
This is the same as checking if max_table_skew == 0 above.



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 18 May 2018 08:40:58 +0000
Gerrit-HasComments: Yes

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Patch Set 16: Code-Review+1

LGTM. Should have another pair of eyes look.


-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 16
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 01 Jun 2018 03:40:17 +0000
Gerrit-HasComments: No

[kudu-CR] [rebalancing] add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] add a rebalancing algorithm
......................................................................


Patch Set 19:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10336/19/src/kudu/tools/rebalance_algo.cc
File src/kudu/tools/rebalance_algo.cc:

http://gerrit.cloudera.org:8080/#/c/10336/19/src/kudu/tools/rebalance_algo.cc@88
PS19, Line 88: DCHECK(found_src && found_dst);
> since you're handling this case now, maybe remove this?
Done


http://gerrit.cloudera.org:8080/#/c/10336/19/src/kudu/tools/rebalance_algo.cc@117
PS19, Line 117: Value of '0' is a shortcut for 'the possible maximum'.
> do we need this? if so, please doc this
It's a handy shortcut to use in tests, etc.  I documented that in the header file rebalance_algo.h



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 19
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Jun 2018 03:05:31 +0000
Gerrit-HasComments: Yes

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Patch Set 14:

(11 comments)

Just some nits on comments and formatting. Sorry I'm kind of picky about wording. I think this is about good to go. We should get another set of eyes on it, though.

http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/CMakeLists.txt
File src/kudu/tools/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/CMakeLists.txt@153
PS14, Line 153: kudu_tools_util
              :   kudu_tools_test_util
              :   kudu_tools_rebalance
              :   itest_util
              :   mini_cluster
nit: Not sure how much we care about alphabetically sorting here, but this is all out of order.


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc
File src/kudu/tools/rebalance_algo-test.cc:

http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@78
PS14, Line 78: table_replicas.{front(),...,back()}.size
ultra-nit: You mean table_replicas.{front(),...,back()}.num_replicas_per_server.size(). That's pretty wordy, so maybe it'd be easier to say, "For each t in table_replicas, t.num_replicas_per_server.size() == tserver_uuids.size()."


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@117
PS14, Line 117:  tserver_idx < tcc.tserver_uuids.size()
nit: It's a little easier to read if each statement in the for clause is on its own line (1 + 1 + 1) instead of 2 + 1.


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@119
PS14, Line 119: at
on


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@152
PS14, Line 152: PICK_FIRST
nit: Mention why this option is being used.


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@372
PS14, Line 372: configuration
configurations


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@372
PS14, Line 372: to have just one one-move
that need one move


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@376
PS14, Line 376: that's why multiples of virtually same
that's why we test multiple virtually identical


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@437
PS14, Line 437:  
a


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@491
PS14, Line 491:  
a


http://gerrit.cloudera.org:8080/#/c/10336/14/src/kudu/tools/rebalance_algo-test.cc@597
PS14, Line 597: Unbalanced (both table- and cluster-wise) and simple enough configurations to
              : // make them balanced moving many replicas around.
This is a little hard to read. Maybe "Simple configurations that are both table- and cluster-wise unbalanced and require many replica moves to make them balanced."



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 14
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 30 May 2018 06:57:59 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................


Patch Set 5: Verified+1

Unrelated flake in org.apache.kudu.spark.tools.TestImportExportFiles


-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Tue, 15 May 2018 00:27:11 +0000
Gerrit-HasComments: No

[kudu-CR] WIP [rebalancing] high-level rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#3).

Change subject: WIP [rebalancing] high-level rebalancing algorithm
......................................................................

WIP [rebalancing] high-level rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests.

The algorithm operates using high-level entities such as table and
replica (no tablets exist in this view of the world).

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
WIP: just collecting feedback, need to add more unit tests.
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance-algo-test.cc
A src/kudu/tools/rebalance-algo.cc
A src/kudu/tools/rebalance-algo.h
4 files changed, 903 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/3
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 18
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 15
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Patch Set 15: Verified+1

Flake in unrelated test:

  OptionalSSL/TestRpc.TestClientConnectionMetrics/0


-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 15
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 31 May 2018 06:12:26 +0000
Gerrit-HasComments: No

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Will Berkeley, Mike Percy, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10336

to look at the new patch set (#11).

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................

[rebalancing] Add a rebalancing algorithm

This changelist introduces the high-level rebalancing algorithm
and corresponding unit tests. It contains both an abstraction for
rebalancing algorithms in general and an implementation of a
greedy algorithm that rebalances a cluster by trying to equalize
the number of replicas per tablet server for each table and for the
cluster as a whole.

The algorithm just identifies what moves should be done given an
arrangment of a cluster. The 'engine' that performs moves will be added
in a subsequent changelist.

Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
---
M src/kudu/tools/CMakeLists.txt
A src/kudu/tools/rebalance_algo-test.cc
A src/kudu/tools/rebalance_algo.cc
A src/kudu/tools/rebalance_algo.h
4 files changed, 1,190 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/10336/11
-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 11
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [rebalancing] add a rebalancing algorithm

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] add a rebalancing algorithm
......................................................................


Patch Set 20: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 20
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Jun 2018 03:20:35 +0000
Gerrit-HasComments: No

[kudu-CR] [rebalancing] add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] add a rebalancing algorithm
......................................................................


Patch Set 18:

(11 comments)

Thank you for the review!

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc
File src/kudu/tools/rebalance_algo-test.cc:

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc@67
PS18, Line 67: per
> nit: s/per/by/
Done


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc@78
PS18, Line 78: table_replicas.{front(),...,back()}.size
> I'm confused by this syntax, maybe say something like: for t in table_repli
Done


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc@550
PS18, Line 550:         { "B", "0", "2" },
> just wondering, in this case, why do we move B before moving A again? They 
I think that's just an implementation detail, where the order among the candidate moves with the same skew is ordered by table identifier.  In practice, we strive to choose among such candidates randomly: notice that here in tests we use EqualSkewOption::PICK_FIRST, but by default it EqualSkewOption::PICK_RANDOM.  These test scenarios use PICK_FIRST because they compare the result with the reference.


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo-test.cc@626
PS18, Line 626: 
> nit: extra line
Done


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h
File src/kudu/tools/rebalance_algo.h:

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@30
PS18, Line 30:  
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@91
PS18, Line 91: size_t
> nit: prefer int per GSG, see https://google.github.io/styleguide/cppguide.h
Done


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@143
PS18, Line 143:   // None of the output parameters may be NULL.
> nit: document how to interpret 'intersection' coming back empty
Done


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.h@148
PS18, Line 148:       int32_t* replica_count_table,
              :       int32_t* replica_count_total,
              :       std::vector<std::string>* server_uuids,
              :       std::vector<std::string>* intersection);
> nit: instead of 4 out-params, can we have a single struct out-param instead
After some consideration, I decided to leave this as is for now.


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.cc
File src/kudu/tools/rebalance_algo.cc:

http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.cc@89
PS18, Line 89:     return Status::NotFound("no count for replica", to);
> It looks like this could leave the map in an inconsistent state if only one
Good catch -- I added some code to restore the consistency of the container is such cases, and I added DCHECK for that as well.


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.cc@105
PS18, Line 105: size_t
> nit: int
Done


http://gerrit.cloudera.org:8080/#/c/10336/18/src/kudu/tools/rebalance_algo.cc@330
PS18, Line 330: stable_sort
> nit: just wondering, how is std::stable_sort available as stable_sort when 
Ah, that's a good question.  From the other side, we don't need std::stable_sort here, just std::sort() is enough since we don't expect UUIDs to be the same, and we don't need to preserve the order of same UUIDs anyway.



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 18
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Jun 2018 02:19:39 +0000
Gerrit-HasComments: Yes

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Patch Set 10: Code-Review+1

(1 comment)

Thank you for improving this patch!

http://gerrit.cloudera.org:8080/#/c/10336/8/src/kudu/tools/rebalance-algo.cc
File src/kudu/tools/rebalance-algo.cc:

http://gerrit.cloudera.org:8080/#/c/10336/8/src/kudu/tools/rebalance-algo.cc@313
PS8, Line 313: 
yep, that was a typo, indeed



-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 10
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Sat, 19 May 2018 05:50:44 +0000
Gerrit-HasComments: Yes

[kudu-CR] [rebalancing] Add a rebalancing algorithm

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )

Change subject: [rebalancing] Add a rebalancing algorithm
......................................................................


Patch Set 18: Verified+1

unrelated flake in org.apache.kudu.client.TestAsyncKuduSession


-- 
To view, visit http://gerrit.cloudera.org:8080/10336
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4
Gerrit-Change-Number: 10336
Gerrit-PatchSet: 18
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Wed, 06 Jun 2018 15:08:51 +0000
Gerrit-HasComments: No