You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2019/05/23 05:38:53 UTC

[kudu-CR] hms: have tools ignore other clusters

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13411


Change subject: hms: have tools ignore other clusters
......................................................................

hms: have tools ignore other clusters

It is possible for there to exist metadata for multiple Kudu clusters in
a single Hive Metastore instance. As such, tooling to check and fix on
Kudu's HMS integration shouldn't affect another's.

Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_hms.cc
2 files changed, 64 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/13411/1
-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Alexey Serbin, Kudu Jenkins, Grant Henke, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13411

to look at the new patch set (#4).

Change subject: hms: have tools ignore other clusters
......................................................................

hms: have tools ignore other clusters

Currently, the HMS tooling will check and fix the master addresses field
of HMS entries based on strict string comparison, assuming that the HMS
entry found for a Kudu table _should_ belong to the Kudu cluster being
operated on.

However, it is possible for there to exist metadata for multiple Kudu
clusters in a single Hive Metastore instance. As such, tooling to check
and fix a Kudu cluster's HMS integration shouldn't necessarily affect
another's.

This patch addresses this by imbuing the HMS tooling with the following
behavior:
- If any master address in a given HMS entry coincides with the expected
  Kudu masters, fix them (as was the behavior before); this is common if
  adding new masters, or replacing dead masters.
- If none of the master addresses in the HMS entry coincide with those
  expected, the HMS entry may belong to another Kudu cluster. By
  default, the tools will ignore such HMS entries. The user can specify
  --noignore-other-clusters to have the tools check/fix such master
  addresses.

Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_hms.cc
2 files changed, 141 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/13411/4
-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc@161
PS1, Line 161:   return Status::OK();
> LGTM for the first proposal. For the second proposal is it to cover migrati
Done


http://gerrit.cloudera.org:8080/#/c/13411/2/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/2/src/kudu/tools/tool_action_hms.cc@306
PS2, Line 306: du_tables_by_name.emp
> nit: I'm not sure unordered_set buys much in this context.  Comparing order
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Sun, 26 May 2019 09:22:17 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Hao Hao (Code Review)" <ge...@cloudera.org>.
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/kudu-tool-test.cc@3837
PS4, Line 3837: // The check tool will ignore the HMS metadata from the other cluster, and
              :   // view the Kudu table as orphaned.
What happens if the table id of the table entry from the other cluster matches with the Kudu table? Do you think we should handle that?


http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/tool_action_hms.cc@322
PS4, Line 322: if (FLAGS_ignore_other_clusters) {
nit: Can you add a short description here of how master addrs are compared?



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 28 May 2019 19:27:36 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc@161
PS1, Line 161: // Note: checking the master addresses should be done separately.
> Hrm.. that's a good point. WDYT about adding a --ignore_master_addrs (defau
I think for the table filtering below, any table which shares a single master with this Kudu cluster should not be filtered out. Then those should be included and checked here. 

For tables where no-masters match, --ignore_master_addrs would be nice so that in cases where all masters don't match, users can still migrate/drop those hms entries.



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 23 May 2019 18:37:32 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc@161
PS1, Line 161: // Note: checking the master addresses is done separately.
> nit: Should this say "is done separately"?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 23 May 2019 17:33:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 2:

Ah, this needs to be updated. One of the tests is unhappy.


-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 23 May 2019 17:53:21 +0000
Gerrit-HasComments: No

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/kudu-tool-test.cc@3837
PS4, Line 3837: ASSERT_OK(AlterHmsWithReplacedParam(&hms_client, "default", "table",
              :       HmsClient::kKuduMasterAddrsKey,
> I think it's more simple to keep these checks/concerns separate. Given we h
It's on another cluster, so I don't think it's a good idea to match IDs. UUIDs should be universally unique, but that scenario seems possible eg if for some reason multiple clusters have been brought up from the same physical copy.


http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/tool_action_hms.cc@322
PS4, Line 322: r (hive::Table& hms_table : hms_ta
> nit: Can you add a short description here of how master addrs are compared?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 29 May 2019 23:55:33 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has removed a vote on this change.

Change subject: hms: have tools ignore other clusters
......................................................................


Removed Code-Review+2 by Grant Henke <gr...@apache.org>
-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Alexey Serbin, Kudu Jenkins, Grant Henke, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13411

to look at the new patch set (#3).

Change subject: hms: have tools ignore other clusters
......................................................................

hms: have tools ignore other clusters

Currently, the HMS tooling will check and fix the master addresses field
of HMS entries based on strict string comparison, assuming that the HMS
entry found for a Kudu table _should_ belong to the Kudu cluster being
operated on.

However, it is possible for there to exist metadata for multiple Kudu
clusters in a single Hive Metastore instance. As such, tooling to check
and fix a Kudu cluster's HMS integration shouldn't necessarily affect
another's.

This patch addresses this by imbuing the HMS tooling with the following
behavior:
- If any master address in a given HMS entry coincides with the expected
  Kudu masters, fix them (as was the behavior before); this is common if
  adding new masters, or replacing dead masters.
- If none of the master addresses in the HMS entry coincide with those
  expected, the HMS entry may belong to another Kudu cluster. By
  default, the tools will ignore such HMS entries. The user can specify
  --noignore-other-clusters to have the tools check/fix such master
  addresses.

Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_hms.cc
2 files changed, 140 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/13411/3
-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] hms: have tools ignore other clusters

Posted by "Hao Hao (Code Review)" <ge...@cloudera.org>.
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/kudu-tool-test.cc@3837
PS4, Line 3837: ASSERT_OK(AlterHmsWithReplacedParam(&hms_client, "default", "table",
              :       HmsClient::kKuduMasterAddrsKey,
> It's on another cluster, so I don't think it's a good idea to match IDs. UU
Yeah, LGTM, I think this is not a normal case the tool should handle. Thanks!



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 30 May 2019 18:21:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Hao Hao (Code Review)" <ge...@cloudera.org>.
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc@161
PS1, Line 161: // Note: checking the master addresses is done separately.
> I think for the table filtering below, any table which shares a single mast
LGTM for the first proposal. For the second proposal is it to cover migration to a completely new Kudu cluster?



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 23 May 2019 18:50:01 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc@161
PS1, Line 161: // Note: checking the master addresses should be done separately.
nit: Should this say "is done separately"?



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 23 May 2019 13:54:15 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Hao Hao (Code Review)" <ge...@cloudera.org>.
Hao Hao has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................

hms: have tools ignore other clusters

Currently, the HMS tooling will check and fix the master addresses field
of HMS entries based on strict string comparison, assuming that the HMS
entry found for a Kudu table _should_ belong to the Kudu cluster being
operated on.

However, it is possible for there to exist metadata for multiple Kudu
clusters in a single Hive Metastore instance. As such, tooling to check
and fix a Kudu cluster's HMS integration shouldn't necessarily affect
another's.

This patch addresses this by imbuing the HMS tooling with the following
behavior:
- If any master address in a given HMS entry coincides with the expected
  Kudu masters, fix them (as was the behavior before); this is common if
  adding new masters, or replacing dead masters.
- If none of the master addresses in the HMS entry coincide with those
  expected, the HMS entry may belong to another Kudu cluster. By
  default, the tools will ignore such HMS entries. The user can specify
  --noignore-other-clusters to have the tools check/fix such master
  addresses.

Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Reviewed-on: http://gerrit.cloudera.org:8080/13411
Tested-by: Kudu Jenkins
Reviewed-by: Hao Hao <ha...@cloudera.com>
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_hms.cc
2 files changed, 143 insertions(+), 5 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Hao Hao: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] hms: have tools ignore other clusters

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 2: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 23 May 2019 17:52:39 +0000
Gerrit-HasComments: No

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13411

to look at the new patch set (#2).

Change subject: hms: have tools ignore other clusters
......................................................................

hms: have tools ignore other clusters

It is possible for there to exist metadata for multiple Kudu clusters in
a single Hive Metastore instance. As such, tooling to check and fix on
Kudu's HMS integration shouldn't affect another's.

Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_hms.cc
2 files changed, 64 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/13411/2
-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] hms: have tools ignore other clusters

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/13411/4/src/kudu/tools/kudu-tool-test.cc@3837
PS4, Line 3837: // The check tool will ignore the HMS metadata from the other cluster, and
              :   // view the Kudu table as orphaned.
> What happens if the table id of the table entry from the other cluster matc
I think it's more simple to keep these checks/concerns separate. Given we have the force option to ignore HMS things in the case the table really should be considered. 

Maybe the table ID could collide or be copied by some other process to a different cluster.



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 29 May 2019 15:49:14 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc@161
PS1, Line 161: // Note: checking the master addresses should be done separately.
> Done
We may still want to validate this here to account for master migrations.


http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc@316
PS1, Line 316:       if (hms_master_addrs != kudu_master_addrs) {
Should any single matching HMS master count for this cluster to account for master migrations?



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 23 May 2019 18:09:06 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 1:

What about cases where the masters were moved or migrated? I suppose all of them moving is super rare, but one or two moving might be more common.


-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 23 May 2019 18:08:10 +0000
Gerrit-HasComments: No

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Alexey Serbin, Kudu Jenkins, Grant Henke, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13411

to look at the new patch set (#5).

Change subject: hms: have tools ignore other clusters
......................................................................

hms: have tools ignore other clusters

Currently, the HMS tooling will check and fix the master addresses field
of HMS entries based on strict string comparison, assuming that the HMS
entry found for a Kudu table _should_ belong to the Kudu cluster being
operated on.

However, it is possible for there to exist metadata for multiple Kudu
clusters in a single Hive Metastore instance. As such, tooling to check
and fix a Kudu cluster's HMS integration shouldn't necessarily affect
another's.

This patch addresses this by imbuing the HMS tooling with the following
behavior:
- If any master address in a given HMS entry coincides with the expected
  Kudu masters, fix them (as was the behavior before); this is common if
  adding new masters, or replacing dead masters.
- If none of the master addresses in the HMS entry coincide with those
  expected, the HMS entry may belong to another Kudu cluster. By
  default, the tools will ignore such HMS entries. The user can specify
  --noignore-other-clusters to have the tools check/fix such master
  addresses.

Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
---
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_hms.cc
2 files changed, 143 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/11/13411/5
-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] hms: have tools ignore other clusters

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 2: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Fri, 24 May 2019 20:35:18 +0000
Gerrit-HasComments: No

[kudu-CR] hms: have tools ignore other clusters

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13411/2/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/2/src/kudu/tools/tool_action_hms.cc@306
PS2, Line 306: unordered_set<string>
nit: I'm not sure unordered_set buys much in this context.  Comparing ordered sets is O(N), when comparing unordered sets is O(N^2).



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Fri, 24 May 2019 20:35:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] hms: have tools ignore other clusters

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 1: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 23 May 2019 13:54:19 +0000
Gerrit-HasComments: No

[kudu-CR] hms: have tools ignore other clusters

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13411 )

Change subject: hms: have tools ignore other clusters
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/13411/1/src/kudu/tools/tool_action_hms.cc@161
PS1, Line 161: // Note: checking the master addresses is done separately.
> We may still want to validate this here to account for master migrations.
Hrm.. that's a good point. WDYT about adding a --ignore_master_addrs (default false since migrations seem rare) option to the check/fix tools?



-- 
To view, visit http://gerrit.cloudera.org:8080/13411
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4e2ad5835fd7fedd1e963d234b153c1df5f8766
Gerrit-Change-Number: 13411
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Thu, 23 May 2019 18:24:53 +0000
Gerrit-HasComments: Yes