You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "ZhangYao (Code Review)" <ge...@cloudera.org> on 2019/11/01 02:29:58 UTC

[kudu-CR] KUDU-2987 Intra location rebalance will crash in special case.

ZhangYao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14608


Change subject: KUDU-2987 Intra location rebalance will crash in special case.
......................................................................

KUDU-2987 Intra location rebalance will crash in special case.

Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
---
M src/kudu/tools/rebalancer_tool.cc
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/14608/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 1
Gerrit-Owner: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................

KUDU-2987 Intra location rebalance crashes in special case.

The crash manifested itself in cases where a Kudu cluster
had a location that didn't host even a single replica of
a tablet.

Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Reviewed-on: http://gerrit.cloudera.org:8080/14608
Tested-by: Alexey Serbin <as...@cloudera.com>
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
2 files changed, 120 insertions(+), 3 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 7
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc
File src/kudu/tools/rebalancer_tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc@1760
PS5, Line 1760:   // 2. Create table1 in { "/A", 3 }, { "/B", 3 }, { "/C", 3 }, {"/D", 1 }.
> nit: add a space
Done


http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc@1762
PS5, Line 1762:   ASSERT_OK(CreateTablesExcludingTserversInLocations({ {"/D", 1 } },
> nit: add a space
Done


http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc@1781
PS5, Line 1781:     // Intra location rebalancing in location "/D" should move one replica
              :     // to another tserver, so there is at least one move.
              :     ASSERT_STR_NOT_CONTAINS(out, "(moved 0 replicas)");
> Whoops, now this became confusing.  So, the rebalancer doesn't move even a 
It's that, because when we selecting tservers we use 2PC, so when using 'placement policy' it may have imbalance in other locations, so the movements should be greater or equal to 1 but we can't assert it to be equal to 1.  I mistakenly use  'ASSERT_STR_NOT_CONTAINS' as 'ASSERT_STR_CONTAINS' before so 'move 1 replicas' can pass(It may fail but we don't meet it).
Because we can make sure there is at least one move so ''ASSERT_STR_NOT_CONTAINS(out, "(moved 0 replicas)")' should pass all the time. It's not flaky.



-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Sun, 10 Nov 2019 11:13:52 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2987 Intra location rebalance will crash in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance will crash in special case.
......................................................................


Patch Set 2:

Add a test to simulate the crash situation. The new test will crash before this fix.


-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 2
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Sun, 03 Nov 2019 16:02:03 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14608

to look at the new patch set (#6).

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................

KUDU-2987 Intra location rebalance crashes in special case.

The crash manifested itself in cases where a Kudu cluster
had a location that didn't host even a single replica of
a tablet.

Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
---
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
2 files changed, 120 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/14608/6
-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14608

to look at the new patch set (#4).

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................

KUDU-2987 Intra location rebalance crashes in special case.

The crash manifested itself in cases where a Kudu cluster
had a location that didn't host even a single replica of
a tablet.

Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
---
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
2 files changed, 120 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/14608/4
-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 4
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2987 Intra location rebalance will crash in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14608

to look at the new patch set (#3).

Change subject: KUDU-2987 Intra location rebalance will crash in special case.
......................................................................

KUDU-2987 Intra location rebalance will crash in special case.

Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
---
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
2 files changed, 116 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/14608/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 3
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14608

to look at the new patch set (#5).

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................

KUDU-2987 Intra location rebalance crashes in special case.

The crash manifested itself in cases where a Kudu cluster
had a location that didn't host even a single replica of
a tablet.

Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
---
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
2 files changed, 120 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/14608/5
-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 5
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>

[kudu-CR] KUDU-2987 Intra location rebalance will crash in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14608

to look at the new patch set (#2).

Change subject: KUDU-2987 Intra location rebalance will crash in special case.
......................................................................

KUDU-2987 Intra location rebalance will crash in special case.

Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
---
M src/kudu/tools/rebalancer_tool-test.cc
M src/kudu/tools/rebalancer_tool.cc
2 files changed, 122 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/14608/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 2
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2987 Intra location rebalance will crash in special case.

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance will crash in special case.
......................................................................


Patch Set 3:

(23 comments)

Overall looks good, just some nits.

Thank you very much for fixing the bug!

http://gerrit.cloudera.org:8080/#/c/14608/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14608/3//COMMIT_MSG@7
PS3, Line 7: will crash
crashes


http://gerrit.cloudera.org:8080/#/c/14608/3//COMMIT_MSG@8
PS3, Line 8: 
Maybe, describe what were the conditions which lead to the crash scenario.  Something like the following, I guess:

  The crash manifested itself in cases where a Kudu cluster
  had a location that didn't host even a single replica
  of a tablet.


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc
File src/kudu/tools/rebalancer_tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@483
PS3, Line 483:   // by the set of the specified locations.
Document the newly introduced parameter 'table_name_pattern': it' not obvious from reading the parameter list and the doc what it's about.


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@487
PS3, Line 487:       const string& table_name_pattern = kTableNamePattern) {
code style: input parameters should come before output parameters in the function/methods signatures.  Move this new parameter before 'table_names'.


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@523
PS3, Line 523: not
elsewhere but not


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@524
PS3, Line 524:  This is to build the imbalance inside
             :   // the locations.
Drop.


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@526
PS3, Line 526: Tserver
Tservers


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@531
PS3, Line 531: excluded_tserver_uuid
nit: excluded_tserver_uuids


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@537
PS3, Line 537: excluded_tservers.count(ts->location)
Is it possible to use ContainsKey() here instead?


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@545
PS3, Line 545:     // Check the validation of excluded_tserver_by_location.
Drop: this sentence isn't adding any clarity and from the code it's obvious what's going on.


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@547
PS3, Line 547: elem.second, 0
nit: the expected value comes first (i.e. swap these two)


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1716
PS3, Line 1716:       /*rep_factor=*/ 3,
              :       /*num_tservers=*/ 11) {
              :   }
nit: fix the alignment of these two lines


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1725
PS3, Line 1725: Rebalance_KUDU_2987
nit: I think it's better to name this scenario to reflect the essence of what it does, but add the information that this is KUDU-2987 regression test into the comment for this TEST_F() scope.

I guess the name for this scenario might be, e.g., LocationsWithEmptyTabletServers


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1733
PS3, Line 1733:                                        { "/D", 2} };
nit: add space


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1746
PS3, Line 1746:   // Create special intra location imbalance in location "/D".
I think it's important to note that there are 3 tablets in each table, so total it's 9 tablet replicas per table.


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1747
PS3, Line 1747: replica
tablet replicas


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1749
PS3, Line 1749: table0 in
table0 with tablet replicas in locations /A, /B, and /C.


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1750
PS3, Line 1750: doesn't over 1
is not greater than 1


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1750
PS3, Line 1750: the
              :   // table0 won't be moved
no replica of table0 will be moved


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1764
PS3, Line 1764:   // The run of the location-aware rebalancing tool should report the cluster
              :   // as balanced.
Drop since this is a bit confusing -- the rebalancer actually moves one replica.

I thinks it's worth adding a sentence explaining that prior to the fix added in this changelist the rebalancer would simply crash while running.


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1774
PS3, Line 1774: rebalance
rebalancing


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1774
PS3, Line 1774: supposed move
should move


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1777
PS3, Line 1777:     // All the violations of the placement policy should be
              :     // corrected.
There were no placement policy violations ever prior to running the rebalancer tool, right?



-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 3
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Thu, 07 Nov 2019 19:22:30 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc
File src/kudu/tools/rebalancer_tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc@1760
PS5, Line 1760:   // 2. Create table1 in { "/A", 3 }, { "/B", 3 }, { "/C", 3 }, {"/D", 1}.
nit: add a space


http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc@1762
PS5, Line 1762:   ASSERT_OK(CreateTablesExcludingTserversInLocations({ {"/D", 1} },
nit: add a space


http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc@1781
PS5, Line 1781:     // Intra location rebalancing in location "/D" should move one replica
              :     // to another tserver, so there is at least one move.
              :     ASSERT_STR_NOT_CONTAINS(out, "(moved 0 replicas)");
Whoops, now this became confusing.  So, the rebalancer doesn't move even a single after all?  If so, then it's necessary to update the comment.  If it's moves one replica sometimes, but sometimes not a single one, then it's necessary to modify the scenario (maybe, have different layout of table replicas) to make the behavior of the test predictable and stable, otherwise there is a risk of introducing a test which is flaky by design.



-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 5
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Sun, 10 Nov 2019 06:19:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................


Patch Set 6: Verified+1

(1 comment)

Unrelated test failure:
  * DenseNodeTest.RunTest

http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc
File src/kudu/tools/rebalancer_tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14608/5/src/kudu/tools/rebalancer_tool-test.cc@1781
PS5, Line 1781:     // Intra location rebalancing in location "/D" should move one replica
              :     // to another tserver, so there is at least one move.
              :     ASSERT_STR_NOT_CONTAINS(out, "(moved 0 replicas)");
> It's that, because when we selecting tservers we use 2PC, so when using 'pl
Ah, I see.  It seems I missed the fact that the ASSERT was changed w.r.t. pattern match/no-match.



-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Sun, 10 Nov 2019 13:14:17 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................


Patch Set 6: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Sun, 10 Nov 2019 13:14:20 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................


Patch Set 6:

I think this test failure has nothing to do with my patch :)


-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 6
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Sun, 10 Nov 2019 11:58:43 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2987 Intra location rebalance crashes in special case.

Posted by "ZhangYao (Code Review)" <ge...@cloudera.org>.
ZhangYao has posted comments on this change. ( http://gerrit.cloudera.org:8080/14608 )

Change subject: KUDU-2987 Intra location rebalance crashes in special case.
......................................................................


Patch Set 4:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/14608/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14608/3//COMMIT_MSG@7
PS3, Line 7: crashes in
> crashes
Done


http://gerrit.cloudera.org:8080/#/c/14608/3//COMMIT_MSG@8
PS3, Line 8: 
> Maybe, describe what were the conditions which lead to the crash scenario. 
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc
File src/kudu/tools/rebalancer_tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@483
PS3, Line 483:   // CreateUnbalancedTables() but the set of tablet servers to avoid is defined
> Document the newly introduced parameter 'table_name_pattern': it' not obvio
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@487
PS3, Line 487:   // function. Each call to this function can create differen
> code style: input parameters should come before output parameters in the fu
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@523
PS3, Line 523: 
> elsewhere but not
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@524
PS3, Line 524: 
             :     return Status::
> Drop.
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@526
PS3, Line 526: 
> Tservers
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@531
PS3, Line 531: ed_map<string, int>& 
> nit: excluded_tserver_uuids
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@537
PS3, Line 537: cas would be hosted by those servers.
> Is it possible to use ContainsKey() here instead?
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@545
PS3, Line 545:           cluster_->tablet_server_by_uuid(ts->uuid())->Shutd
> Drop: this sentence isn't adding any clarity and from the code it's obvious
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@547
PS3, Line 547: 
> nit: the expected value comes first (i.e. swap these two)
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1716
PS3, Line 1716: class IntraLocationRebalancingBasicTest : public RebalancingTest {
              : public:
              :   I
> nit: fix the alignment of these two lines
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1725
PS3, Line 1725: 
> nit: I think it's better to name this scenario to reflect the essence of wh
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1733
PS3, Line 1733:   }
> nit: add space
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1746
PS3, Line 1746:   FLAGS_num_replicas = rep_factor_;
> I think it's important to note that there are 3 tablets in each table, so t
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1747
PS3, Line 1747: _, mast
> tablet replicas
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1749
PS3, Line 1749: 
> table0 with tablet replicas in locations /A, /B, and /C.
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1750
PS3, Line 1750: cial intra loc
> is not greater than 1
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1750
PS3, Line 1750: 
              :   // should only contain t
> no replica of table0 will be moved
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1764
PS3, Line 1764: 
              :   const vector<st
> Drop since this is a bit confusing -- the rebalancer actually moves one rep
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1774
PS3, Line 1774: 
> rebalancing
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1774
PS3, Line 1774: 
> should move
Done


http://gerrit.cloudera.org:8080/#/c/14608/3/src/kudu/tools/rebalancer_tool-test.cc@1777
PS3, Line 1777:     ASSERT_TRUE(s.ok()) << ToolRunInfo(s, out, err);
              :     ASSERT_STR_CO
> There were no placement policy violations ever prior to running the rebalan
Yeah. I moved this comment, it's confusing and not suitable here.



-- 
To view, visit http://gerrit.cloudera.org:8080/14608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea39472e55178fca688b390249432a0f7fefaaba
Gerrit-Change-Number: 14608
Gerrit-PatchSet: 4
Gerrit-Owner: ZhangYao <tr...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: ZhangYao <tr...@gmail.com>
Gerrit-Comment-Date: Fri, 08 Nov 2019 03:44:20 +0000
Gerrit-HasComments: Yes