You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2018/05/03 21:48:34 UTC

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Will Berkeley has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10300


Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................

[tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

The consensus info gathering parts of ksck have clearly become
central to its functionality, but occasionally administrators are
fooled by the results when they run ksck without kudu admin privileges.
This change surfaces unauthorized errors from the master and tablet
servers clearly.

This also removes the --consensus flag which could be used to turn off
gathering consensus info. It shouldn't really be used as it can give
misleading output where the cluster appears healthy but some tablets
are actually unable to make progress.

Change-Id: I644957103efefceeba4b82f4557376a603507423
---
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/tool_action_cluster.cc
3 files changed, 56 insertions(+), 12 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/10300/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10300

to look at the new patch set (#3).

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................

[tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

If ksck is not run as an admin user, then it cannot gather all of the
information it uses to do cluster checks. "Not authorized" errors are
easy to mistake for other errors. This patch handles these errors
specially to make it obvious to administrators when they have forgotten
to run as an administrative user.

The new errors at the end of the ksck output look like this:

==================
Errors:
==================
Not authorized: error fetching info from masters: failed to gather info from 1 of 1 masters due to lack of admin privileges
Not authorized: error fetching info from tablet servers: failed to gather info from 11 of 12 tablet servers due to lack of admin privileges

FAILED
Not authorized: please run ksck with administrator privileges

Change-Id: I644957103efefceeba4b82f4557376a603507423
---
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/ksck_results.h
M src/kudu/tools/tool.proto
5 files changed, 77 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/10300/3
-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc@146
PS3, Line 146: s.IsRemoteError
> OK, but there are also calls where a KuduClient would be used, and KuduClie
Yeah, this lil helper is meant strictly for this specific ksck context. It is not meant to applicable anywhere else.


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck_results.cc
File src/kudu/tools/ksck_results.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck_results.cc@171
PS3, Line 171:     case KsckServerHealth::UNAUTHORIZED:
             :       return 1;
             :     case KsckServerHealth::UNAVAILABLE:
             :       return 2;
             :     case KsckServerHealth::WRONG_SERVER_UUID:
> Maybe, that's just another reason to separate that different statuses like 
It's apples and oranges, ultimately, in that way. They are both apples in that they are the results of trying to fetch info from a tablet server as part of ksck. And as we need to sort things somehow, we have to pick some order. I think this one is reasonable for the reason I gave, but I agree yours is reasonable too. Consider which you'd want to notice and address first upon being given a ksck.


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/tool.proto
File src/kudu/tools/tool.proto:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/tool.proto@224
PS3, Line 224: UNAUTHORIZED = 3;
> Yes, I know.  My question was about whether it would be enough in that case
The specifics errors are very useful for categorizing and sorting. They require different responses from the administrator.



-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 21:03:36 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................

[tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

If ksck is not run as an admin user, then it cannot gather all of the
information it uses to do cluster checks. "Not authorized" errors are
easy to mistake for other errors. This patch handles these errors
specially to make it obvious to administrators when they have forgotten
to run as an administrative user.

The new errors at the end of the ksck output look like this:

==================
Errors:
==================
Not authorized: error fetching info from masters: failed to gather info from 1 of 1 masters due to lack of admin privileges
Not authorized: error fetching info from tablet servers: failed to gather info from 11 of 12 tablet servers due to lack of admin privileges

FAILED
Not authorized: re-run ksck with administrator privileges

Change-Id: I644957103efefceeba4b82f4557376a603507423
Reviewed-on: http://gerrit.cloudera.org:8080/10300
Reviewed-by: Attila Bukor <ab...@cloudera.com>
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/ksck_results.h
M src/kudu/tools/tool.proto
5 files changed, 77 insertions(+), 3 deletions(-)

Approvals:
  Attila Bukor: Looks good to me, but someone else must approve
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 6
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck-test.cc
File src/kudu/tools/ksck-test.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck-test.cc@769
PS3, Line 769: ASSERT_TRUE(s.IsNotAuthorized());
nit: maybe it's worth adding s.ToString():

ASSERT_TRUE(...) << s.ToString();


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc@146
PS3, Line 146: s.IsRemoteError
I thought some of the errors are returned back as NotAuthorized() in that case, no?


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc@399
PS3, Line 399: please run ksck
nit: maybe just 're-run ksck ...'


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck_results.cc
File src/kudu/tools/ksck_results.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck_results.cc@171
PS3, Line 171:     case KsckServerHealth::UNAUTHORIZED:
             :       return 1;
             :     case KsckServerHealth::UNAVAILABLE:
             :       return 2;
             :     case KsckServerHealth::WRONG_SERVER_UUID:
I would expect that to be in different order:

UNAUTHORIZED
WRONG_SERVER_UUID
UNAVAILABLE

Not sure it makes a big difference, but unavailable/dead server sounds like having worse health than server which is alive but has wrong UUID.  But that started with the previous revision, though.


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/tool.proto
File src/kudu/tools/tool.proto:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/tool.proto@224
PS3, Line 224: UNAUTHORIZED = 3;
It becoming less and less ServerHealth-related, but I cannot suggest a good alternative, though.  Maybe, just use UNKNOWN?

Does it help to have that status in addition to the error  summary in the end of the ksck output?



-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 19:06:22 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10300/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10300/2//COMMIT_MSG@13
PS2, Line 13: to run as an administrative user.
> can you add an example error message in the commit message?
Done


http://gerrit.cloudera.org:8080/#/c/10300/2/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/10300/2/src/kudu/tools/ksck.cc@147
PS2, Line 147:          s.ToString().find("Not authorized: unauthorized access to method") != string::npos;
> it feels weird to string match the error message, could we maybe return EPE
Backwards compat is an important part of this patch, unfortunately. While string matching isn't the ideal way, I don't think the message is likely to change anytime soon.



-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 17:14:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 5: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc@146
PS3, Line 146: s.IsRemoteError
> These particular errors come back as RemoteError.
OK, but there are also calls where a KuduClient would be used, and KuduClient converts RemoteError with specific PB error code into NotAuthorized.  My thinking was to make this function applicable for those cases as well.  However, I think that's OK in the YANGNI paradigm to postpone that up to the time when this function is about to be applied for statuses produced by KuduClient methods.


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck_results.cc
File src/kudu/tools/ksck_results.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck_results.cc@171
PS3, Line 171:     case KsckServerHealth::UNAUTHORIZED:
             :       return 1;
             :     case KsckServerHealth::UNAVAILABLE:
             :       return 2;
             :     case KsckServerHealth::WRONG_SERVER_UUID:
> I consider WRONG_SERVER_UUID to be worse because it means there's a tablet 
Maybe, that's just another reason to separate that different statuses like UNAVAILABLE and WRONG_SERVER_UUID :)

OK, if from server health perspective a dead server is better than alive server at wrong place, that's strange but acceptable.


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/tool.proto
File src/kudu/tools/tool.proto:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/tool.proto@224
PS3, Line 224: UNAUTHORIZED = 3;
> It's not UNKNOWN, it's UNAUTHORIZED. I agree ServerHealth has already grown
Yes, I know.  My question was about whether it would be enough in that case to report the server's health as UNKNOWN instead of proliferating detailed error cases.  It seems there is some value of having the exact error specified, though.



-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 5
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 20:38:23 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 1:

Probably isn't worth reviewing until KUDU-2426 (https://gerrit.cloudera.org/#/c/10293/) is fixed.


-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 1
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Thu, 03 May 2018 21:49:14 +0000
Gerrit-HasComments: No

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10300/2/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/10300/2/src/kudu/tools/ksck.cc@147
PS2, Line 147:          s.ToString().find("Not authorized: unauthorized access to method") != string::npos;
> Backwards compat is an important part of this patch, unfortunately. While s
I see. Would it make sense to still add EPERM/EACCES on the server side and use (string match OR error code match) here?



-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 3
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 17:27:29 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10300

to look at the new patch set (#4).

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................

[tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

If ksck is not run as an admin user, then it cannot gather all of the
information it uses to do cluster checks. "Not authorized" errors are
easy to mistake for other errors. This patch handles these errors
specially to make it obvious to administrators when they have forgotten
to run as an administrative user.

The new errors at the end of the ksck output look like this:

==================
Errors:
==================
Not authorized: error fetching info from masters: failed to gather info from 1 of 1 masters due to lack of admin privileges
Not authorized: error fetching info from tablet servers: failed to gather info from 11 of 12 tablet servers due to lack of admin privileges

FAILED
Not authorized: please run ksck with administrator privileges

Change-Id: I644957103efefceeba4b82f4557376a603507423
---
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/ksck_results.h
M src/kudu/tools/tool.proto
5 files changed, 77 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/10300/4
-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10300

to look at the new patch set (#5).

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................

[tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

If ksck is not run as an admin user, then it cannot gather all of the
information it uses to do cluster checks. "Not authorized" errors are
easy to mistake for other errors. This patch handles these errors
specially to make it obvious to administrators when they have forgotten
to run as an administrative user.

The new errors at the end of the ksck output look like this:

==================
Errors:
==================
Not authorized: error fetching info from masters: failed to gather info from 1 of 1 masters due to lack of admin privileges
Not authorized: error fetching info from tablet servers: failed to gather info from 11 of 12 tablet servers due to lack of admin privileges

FAILED
Not authorized: re-run ksck with administrator privileges

Change-Id: I644957103efefceeba4b82f4557376a603507423
---
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/ksck_results.h
M src/kudu/tools/tool.proto
5 files changed, 77 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/10300/5
-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 5
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10300

to look at the new patch set (#2).

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................

[tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

If ksck is not run as an admin user, then it cannot gather all of the
information it uses to do cluster checks. "Not authorized" errors are
easy to mistake for other errors. This patch handles these errors
specially to make it obvious to administrators when they have forgotten
to run as an administrative user.

Change-Id: I644957103efefceeba4b82f4557376a603507423
---
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_results.cc
M src/kudu/tools/ksck_results.h
M src/kudu/tools/tool.proto
5 files changed, 77 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/10300/2
-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 2: Verified+1

Unrelated flake.


-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 16:52:56 +0000
Gerrit-HasComments: No

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10300/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10300/2//COMMIT_MSG@13
PS2, Line 13: to run as an administrative user.
can you add an example error message in the commit message?


http://gerrit.cloudera.org:8080/#/c/10300/2/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/10300/2/src/kudu/tools/ksck.cc@147
PS2, Line 147:          s.ToString().find("Not authorized: unauthorized access to method") != string::npos;
it feels weird to string match the error message, could we maybe return EPERM or EACCES on the server side with the RemoteError and check on that instead? That would mean a newer ksck wouldn't be able to detect not authorized errors when accessing older servers, but would mean better forward compatibility.



-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 16:04:51 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has removed a vote on this change.

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck-test.cc
File src/kudu/tools/ksck-test.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck-test.cc@769
PS3, Line 769: ASSERT_TRUE(s.IsNotAuthorized()) 
> nit: maybe it's worth adding s.ToString():
Done


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc@146
PS3, Line 146: s.IsRemoteError
> I thought some of the errors are returned back as NotAuthorized() in that c
These particular errors come back as RemoteError.


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck.cc@399
PS3, Line 399: re-run ksck wit
> nit: maybe just 're-run ksck ...'
Done


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck_results.cc
File src/kudu/tools/ksck_results.cc:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/ksck_results.cc@171
PS3, Line 171:     case KsckServerHealth::UNAUTHORIZED:
             :       return 1;
             :     case KsckServerHealth::UNAVAILABLE:
             :       return 2;
             :     case KsckServerHealth::WRONG_SERVER_UUID:
> I would expect that to be in different order:
I consider WRONG_SERVER_UUID to be worse because it means there's a tablet server at address X that other servers did not expect to be there, which indicates some kind of disaster like the node was wiped out and restarted, or bad misconfiguration.

Is it worse if you go to your friend's house and knock on the door and they aren't home or if you knock on the door and some other dude answers? :)


http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/tool.proto
File src/kudu/tools/tool.proto:

http://gerrit.cloudera.org:8080/#/c/10300/3/src/kudu/tools/tool.proto@224
PS3, Line 224: UNAUTHORIZED = 3;
> It becoming less and less ServerHealth-related, but I cannot suggest a good
It's not UNKNOWN, it's UNAUTHORIZED. I agree ServerHealth has already grown beyond health. ServerStatus is kind of better except it wouldn't actually be of type Status.

We don't list all servers and their errors at the end of the ksck output. They are surfaced earlier, and there we do include both the status message and the ServerHealth.



-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 4
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 19:34:09 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 5: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 5
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 20:02:04 +0000
Gerrit-HasComments: No

[kudu-CR] [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/10300 )

Change subject: [tools] ksck imp. [8/n]: Easy-to-see errors when running w/o admin
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10300/2/src/kudu/tools/ksck.cc
File src/kudu/tools/ksck.cc:

http://gerrit.cloudera.org:8080/#/c/10300/2/src/kudu/tools/ksck.cc@147
PS2, Line 147:          s.ToString().find("Not authorized: unauthorized access to method") != string::npos;
> I see. Would it make sense to still add EPERM/EACCES on the server side and
I think we usually save the POSIX codes for statuses that result from syscalls. In any case, I really don't think it is worth it.



-- 
To view, visit http://gerrit.cloudera.org:8080/10300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I644957103efefceeba4b82f4557376a603507423
Gerrit-Change-Number: 10300
Gerrit-PatchSet: 2
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-Comment-Date: Fri, 25 May 2018 19:06:22 +0000
Gerrit-HasComments: Yes