You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Hannah Nguyen (Code Review)" <ge...@cloudera.org> on 2019/07/22 19:58:55 UTC

[kudu-CR] KUDU-2780: create thread for auto-rebalancing (1/n)

Hannah Nguyen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13891


Change subject: KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................

KUDU-2780: create thread for auto-rebalancing (1/n)

AutoRebalancerTask is a thread in the master with a configurable
polling interval. Every interval, the thread logs the current
skew of the cluster if auto-rebalancing is enabled.

Code that is modified/copied directly from the CLI rebalancing
tool in kudu/tools is marked.

The task currently calls the 'ksck' tool to determine skew, but
this may change in later patches when rebalancing is implemented
and the task retrieves information directly from the master.

Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
---
M src/kudu/benchmarks/tpch/tpch1.cc
M src/kudu/master/CMakeLists.txt
A src/kudu/master/autorebalancer-test.cc
A src/kudu/master/autorebalancer.cc
A src/kudu/master/autorebalancer.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
7 files changed, 1,769 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/13891/1
-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 1
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>

[kudu-CR] WIP KUDU-2780: create thread for auto-rebalancing (1/n)

Posted by "Hannah Nguyen (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13891

to look at the new patch set (#2).

Change subject: WIP KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................

WIP KUDU-2780: create thread for auto-rebalancing (1/n)

AutoRebalancerTask is a thread in the master with a configurable
polling interval. Every interval, the thread logs the current
skew of the cluster if auto-rebalancing is enabled.

Code that is modified/copied directly from the CLI rebalancing
tool in kudu/tools is marked.

The task currently calls the 'ksck' tool to determine skew, but
this will change in later patches when rebalancing is implemented
and the task retrieves information directly from the master.

Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
---
M src/kudu/benchmarks/tpch/tpch1.cc
M src/kudu/master/CMakeLists.txt
A src/kudu/master/autorebalancer-test.cc
A src/kudu/master/autorebalancer.cc
A src/kudu/master/autorebalancer.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
7 files changed, 1,753 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/13891/2
-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 2
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] WIP KUDU-2780: create thread for auto-rebalancing (1/n)

Posted by "Hannah Nguyen (Code Review)" <ge...@cloudera.org>.
Hannah Nguyen has abandoned this change. ( http://gerrit.cloudera.org:8080/13891 )

Change subject: WIP KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................


Abandoned
-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 4
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2780: create thread for auto-rebalancing (1/n)

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13891 )

Change subject: KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/13891/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13891/1//COMMIT_MSG@13
PS1, Line 13: Code that is modified/copied directly from the CLI rebalancing
            : tool in kudu/tools is marked.
Did you consider separating the rebalancing piece that doesn't depend on the ksck structures per se into a library that might be used from both master and the ksck CLI tool?  Another option would be moving the whole rebalancing code from the tools directory under master directory, and introducing an RPC end-point in master that the CLI tool would simply call.

I'm not sure it's a good idea to keep duplicate code around.


http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/CMakeLists.txt
File src/kudu/master/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/CMakeLists.txt@61
PS1, Line 61:   ksck
This introduce a cycling dependency which is a no-no:

12:59:16 CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle):
12:59:16   "master" of type SHARED_LIBRARY
12:59:16     depends on "ksck" (weak)
12:59:16   "ksck" of type SHARED_LIBRARY
12:59:16     depends on "master" (weak)

BTW, why does master need the ksck library as the link dependecney if the rebalancer is being run as a simple out-of-the box binary?


http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer-test.cc
File src/kudu/master/autorebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer-test.cc@55
PS1, Line 55: TEST_F(AutoRebalancerTest, NoReports) {
We have many tests which start master and does some work before shutting it down.  Given this scenario doesn't do much aside from that, I think we automatically get coverage for this simple functionality already, no?


http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer-test.cc@69
PS1, Line 69: TEST_F(AutoRebalancerTest, ViewReports) {
What does this test tries to verify?


http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer.h
File src/kudu/master/autorebalancer.h:

http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer.h@74
PS1, Line 74: 	
nit: we use spaces, not tabs in Kudu C++ for indentation.



-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 1
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 23 Jul 2019 00:56:44 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP KUDU-2780: create thread for auto-rebalancing (1/n)

Posted by "Hannah Nguyen (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13891

to look at the new patch set (#4).

Change subject: WIP KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................

WIP KUDU-2780: create thread for auto-rebalancing (1/n)

AutoRebalancerTask is a thread in the master with a configurable
polling interval. Every interval, the thread logs the current
skew of the cluster if auto-rebalancing is enabled.

Code that is modified/copied directly from the CLI rebalancing
tool in kudu/tools is marked.

The task currently calls the 'ksck' tool to determine skew, but
this will change in later patches when rebalancing is implemented
and the task retrieves information directly from the master.

Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
---
M src/kudu/benchmarks/tpch/tpch1.cc
M src/kudu/master/CMakeLists.txt
A src/kudu/master/autorebalancer-test.cc
A src/kudu/master/autorebalancer.cc
A src/kudu/master/autorebalancer.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
7 files changed, 1,753 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/13891/4
-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 4
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] WIP KUDU-2780: create thread for auto-rebalancing (1/n)

Posted by "Hannah Nguyen (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13891

to look at the new patch set (#3).

Change subject: WIP KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................

WIP KUDU-2780: create thread for auto-rebalancing (1/n)

AutoRebalancerTask is a thread in the master with a configurable
polling interval. Every interval, the thread logs the current
skew of the cluster if auto-rebalancing is enabled.

Code that is modified/copied directly from the CLI rebalancing
tool in kudu/tools is marked.

The task currently calls the 'ksck' tool to determine skew, but
this will change in later patches when rebalancing is implemented
and the task retrieves information directly from the master.

Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
---
M src/kudu/benchmarks/tpch/tpch1.cc
M src/kudu/master/CMakeLists.txt
A src/kudu/master/autorebalancer-test.cc
A src/kudu/master/autorebalancer.cc
A src/kudu/master/autorebalancer.h
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
7 files changed, 1,753 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/13891/3
-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 3
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] WIP KUDU-2780: create thread for auto-rebalancing (1/n)

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13891 )

Change subject: WIP KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer-test.cc
File src/kudu/master/autorebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer-test.cc@69
PS1, Line 69: TEST_F(AutoRebalancerTest, ViewReports) {
> This test keeps a cluster around for at least a couple polling intervals, a
For initial POC testing that approach might be OK.  However, in the long run the tests should provide coverage to automatically verify the required functionality: nobody is going to peer at the output of those tests unless something is reported to be broken.

Probably, we want to represent auto-relabancer stats as a set of metrics which are programmatically retrievable from master.  As for the latter, in Kudu there is a standardized way to do so: take a look at so-called metric counters and gauges: grep for METRIC_DEFINE_counter and METRIC_DEFINE_gauge_int64 in the code and see how it's used there and in corresponding tests (grep for METRIC_DECLARE_counter macros and FindOrCreateCounter).  From what I remember, I recently worked with those somewhere in src/kudu/util/ttl_cache.h and src/kudu/util/ttl_cache-test.cc

In addition to be able to retrieve the metrics from within a process, those are exposed via debug web server as well, so it's possible to retrieve them in JSON format via HTTP.



-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 1
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 23 Jul 2019 02:55:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP KUDU-2780: create thread for auto-rebalancing (1/n)

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13891 )

Change subject: WIP KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................


Patch Set 4:

(7 comments)

Left some high-level comments, and didn't look much at all into the copy-pasted functions since the intent is to refactor those away anyway.

http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer-test.cc
File src/kudu/master/autorebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer-test.cc@55
PS4, Line 55:   const int kNumTservers = 3;
            :   const int kNumTablets = 2;
            : 
It'd also be worth using a multi-master configuration to see what happens when multiple masters are running your code.


http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc
File src/kudu/master/autorebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@41
PS4, Line 41: #include "kudu/tools/ksck.h"
            : #include "kudu/tools/ksck_remote.h"
            : #include "kudu/tools/ksck_results.h"
We chatted about this offline; in terms of dependencies and whatnot, it probably makes sense to pull out some of the classes here into the /master or /common, and update /tools to use it, rather than depending on /tools code.


http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@81
PS4, Line 81: DEFINE_string(autorebalancing_tables, "",
            :               "Tables to include (comma-separated list of table names)"
            :               "If not specified, includes all tables.");;
            : 
            : // Renamed from "kudu/tools/tool_action_common.cc"
            : DEFINE_string(autorebalancing_format, "pretty",
            :               "Format to use for printing list output tables.\n"
            :               "Possible values: pretty, space, tsv, csv, and json");
            : 
            : DEFINE_uint32(autorebalancing_interval_seconds, 5,
            :              "How long to sleep in between each check for skew "
            :              "and if possible, performing a rebalancing move.");
            : TAG_FLAG(autorebalancing_interval_seconds, runtime);
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_string(autorebalancing_ignored_tservers, "",
            :               "UUIDs of tablet servers to ignore while rebalancing the cluster "
            :               "(comma-separated list). If specified, allow to run the rebalancing "
            :               "when some tablet servers in 'ignored_tservers' are unhealthy. "
            :               "If not specified, allow to run the rebalancing only when all tablet "
            :               "servers are healthy.");
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_uint32(autorebalancing_max_moves_per_server, 5,
            :               "Maximum number of replica moves to perform concurrently on one "
            :               "tablet server: 'move from' and 'move to' are counted "
            :               "as separate move operations.");
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_uint32(autorebalancing_max_staleness_interval_sec, 300,
            :               "Maximum duration of the 'staleness' interval, when the "
            :               "rebalancer cannot make any progress in scheduling new moves and "
            :               "no prior scheduled moves are left, even if re-synchronizing "
            :               "against the cluster's state again and again. Such a staleness "
            :               "usually happens in case of a persistent problem with the "
            :               "cluster or when some unexpected concurrent activity is "
            :               "present (such as automatic recovery of failed replicas, etc.)");
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_int64(autorebalancing_max_run_time_sec, 0,
            :              "Maximum time to run the rebalancing, in seconds. Specifying 0 "
            :              "means not imposing any limit on the rebalancing run time.");
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_string(autorebalancing_move_single_replicas, "auto",
            :               "Whether to move single replica tablets (i.e. replicas of tablets "
            :               "of replication factor 1). Acceptable values are: "
            :               "'auto', 'enabled', 'disabled'. The value of 'auto' means "
            :               "turn it on/off depending on the replica management scheme "
            :               "and Kudu version.");
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_bool(autorebalancing_output_replica_distribution_details, false,
            :             "Whether to output details on per-table and per-server "
            :             "replica distribution");
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_bool(autorebalancing_disable_policy_fixer, false,
            :             "In case of multi-location cluster, whether to detect and fix "
            :             "placement policy violations. Fixing placement policy violations "
            :             "involves moving tablet replicas across different locations "
            :             "of the cluster. "
            :             "This setting is applicable to multi-location clusters only.");
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_bool(autorebalancing_disable_cross_location_rebalancing, false,
            :             "In case of multi-location cluster, whether to move tablet "
            :             "replicas between locations in attempt to spread tablet replicas "
            :             "among location evenly (equalizing loads of locations throughout "
            :             "the cluster). "
            :             "This setting is applicable to multi-location clusters only.");
            : 
            : // Renamed from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_bool(autorebalancing_disable_intra_location_rebalancing, false,
            :             "In case of multi-location cluster, whether to rebalance tablet "
            :             "replica distribution within each location. "
            :             "This setting is applicable to multi-location clusters only.");
            : 
            : // Renamed and modified from "kudu/tools/tool_action_cluster.cc"
            : DEFINE_double(autorebalancing_load_imbalance_threshold, 1.0,
As a part of refactoring this, rather than exposing _all_ of these as flags, we should probably go through each of these, determine whether or not it's important for each to be present in the autorebalancer.


http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@212
PS4, Line 212:     wake_up_cv_(&lock_) {
This is used in the HMS log listener as a means of "waking up" the log listener thread to tell it there's work to do. AFAICT this isn't the case here. We might want to in the future (e.g. if we want a tool to indicate sooner than within a polling interval to start the next move), but for now, we can probably leave this out.

If you want to learn more about condition variables, check out the big comment at the top of util/condition_variable.h


http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@269
PS4, Line 269: void AutoRebalancerTask::RunLoop() {
It's probably worth looking into how the HMS log listener only does work when it's the leader. We'll probably want to do the same here. Specifically, I think the beginning of HmsNotificationLogListenerTask::Poll() would be interesting to peruse.


http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@285
PS4, Line 285:       Status cluster_status = PrintStats(LOG(INFO));
This is a decent start; moving forward, we should be mindful of what's still in need of implementation. You shared this with us, but I'll post here for posterity, in this loop, we'll want a means to the following:

* stop_flag: boolean flag (persisted) set by a stop trigger--rebalancing should stop. Cleared by whatever set it (eg. when re-replication finishes, or the number of locations changes so that the placement policy is satisfiable).
* restart_flag: boolean flag (persisted) set by a start trigger--rebalancing should start, if it hasn’t, and next move should be recalculated, if it has. Cleared by auto-rebalancer when cluster is balanced.
* cluster_balanced: boolean value (run-time), whether or not the cluster is balanced
* get_next_move(): function that returns the best replica move to reduce skew
* execute_next_move(): function that executes the best replica move


http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@791
PS4, Line 791: 
             : Status AutoRebalancerTask::Rebalancer::GetClusterRawInfo(const boost::optional<std::string>& location,
             :                                ClusterRawInfo* raw_info) {
             :   RETURN_NOT_OK(RefreshKsckResults());
             :   return KsckResultsToClusterRawInfo(location, ksck_->results(), raw_info);
             : }
I think this represents a good chunk of work, either refactoring, or doing some code shuffling, to get cluster info without having a ton of code duplication.



-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 4
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 23 Jul 2019 21:05:49 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2780: create thread for auto-rebalancing (1/n)

Posted by "Hannah Nguyen (Code Review)" <ge...@cloudera.org>.
Hannah Nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13891 )

Change subject: KUDU-2780: create thread for auto-rebalancing (1/n)
......................................................................


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/13891/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13891/1//COMMIT_MSG@13
PS1, Line 13: Code that is modified/copied directly from the CLI rebalancing
            : tool in kudu/tools is marked.
> Did you consider separating the rebalancing piece that doesn't depend on th
Hmmm, the point of this patch was mostly to lay the infrastructure (create the Task in the master) by implementing some code that would log the current skew of the cluster. 

I don't think this patch will be merge-able, but the next one (which will migrate any common code into the master directory, and retrieve information from the master and not call 'ksck' at all) should be.

Will clarify this in the message.


http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/CMakeLists.txt
File src/kudu/master/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/CMakeLists.txt@61
PS1, Line 61:   ksck
> This introduce a cycling dependency which is a no-no:
Will refactor the auto-rebalancer to not call 'ksck' at all. 
Then moving common code into the master should also be ok, since tools already depends on the master directory.


http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer-test.cc
File src/kudu/master/autorebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer-test.cc@55
PS1, Line 55: TEST_F(AutoRebalancerTest, NoReports) {
> We have many tests which start master and does some work before shutting it
Agreed, this test isn't necessary.


http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer-test.cc@69
PS1, Line 69: TEST_F(AutoRebalancerTest, ViewReports) {
> What does this test tries to verify?
This test keeps a cluster around for at least a couple polling intervals, and since the auto-rebalancer is currently just logging the skew of the cluster, running the test should print some cluster reports of skew. 

I just looked at the output on my terminal manually. Is there a better way to write the test to check that it appears?


http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer.h
File src/kudu/master/autorebalancer.h:

http://gerrit.cloudera.org:8080/#/c/13891/1/src/kudu/master/autorebalancer.h@74
PS1, Line 74: 	
> nit: we use spaces, not tabs in Kudu C++ for indentation.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13891
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c
Gerrit-Change-Number: 13891
Gerrit-PatchSet: 1
Gerrit-Owner: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Hannah Nguyen <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 23 Jul 2019 01:16:07 +0000
Gerrit-HasComments: Yes