You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Fredy Wijaya (Code Review)" <ge...@cloudera.org> on 2019/04/24 01:41:25 UTC

[Impala-ASF-CR] IMPMALA-8444: Fix performance regression when building privilege name

Fredy Wijaya has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13095


Change subject: IMPMALA-8444: Fix performance regression when building privilege name
......................................................................

IMPMALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.PrivilegeNameBenchmark.fast":
  ≈ 10⁻³ ms/op

Result "org.apache.impala.PrivilegeNameBenchmark.slow":
  0.001 ±(99.9%) 0.001 ms/op [Average]
  (min, avg, max) = (0.001, 0.001, 0.001), stdev = 0.001
  CI (99.9%): [0.001, 0.001] (assumes normal distribution)

Benchmark                    Mode  Cnt   Score    Error  Units
PrivilegeNameBenchmark.fast  avgt   25  ≈ 10⁻³           ms/op
PrivilegeNameBenchmark.slow  avgt   25   0.001 ±  0.001  ms/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in case insensitive way. This is true for
all privilege scopes, except URI. This patch fixes the issue by storing
URI privilege in a case sensitive manner.

This patch removes the synchronized keyword in
SentryAuthorizationPolicy.listPrivileges() in order to not cause the
operation to run in serial in a highly concurrent workload. This has a
trade off of stale authorization metadata, which may be acceptable to
get a better performance.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for URI privilege bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
6 files changed, 121 insertions(+), 78 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/1
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 6: Code-Review+1

(1 comment)

I'll let Csaba take another pass.

http://gerrit.cloudera.org:8080/#/c/13095/6/fe/src/main/java/org/apache/impala/catalog/Principal.java
File fe/src/main/java/org/apache/impala/catalog/Principal.java:

http://gerrit.cloudera.org:8080/#/c/13095/6/fe/src/main/java/org/apache/impala/catalog/Principal.java@82
PS6, Line 82:     return principalPrivileges_.keySet();
Actually thinking again, we are at a risk of exposing the underlying keySet() to the callers who can change it (same above). Probably safer to to return an ImmutableSet<> (and ImmutableList<> above). Anyway, they only do a shallow copy, so that should be fine I guess.



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 21:57:59 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13095/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13095/3//COMMIT_MSG@38
PS3, Line 38: This patch removes incorrect synchronization in
            : SentryAuthorizationPolicy.listPrivileges() that can cause the operation
            : to run in serial in a highly concurrent workload.
> How did you test this? I think this codepath is sensitive in the sense that
I meant to say remove incorrect synchronization. Updated the comment.


http://gerrit.cloudera.org:8080/#/c/13095/3/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/13095/3/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@90
PS3, Line 90: 
> Is there a way to avoid a copy to a HashSet()? Not sure what that is buying
Yeah this isn't needed since we're returning the catalog keys backed by ConcurrentHashMap, which should be thread-safe. Done.



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 19:33:46 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2873/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 02:18:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 7:

(4 comments)

I am still uncertain about the case sensitivity, once that is clarified, I can give +2.

http://gerrit.cloudera.org:8080/#/c/13095/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/13095/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@66
PS7, Line 66:         for (String privilegeName: role.getPrivilegeNames()) {
            :           privileges.add(privilegeName);
            :         }
I think that this could be further simplified to privileges.addAll(role.getPrivilegeNames()); It is also possible that the union is a bit faster than inserting one-by-one, especially when adding the first set to the initial empty set.


http://gerrit.cloudera.org:8080/#/c/13095/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@84
PS7, Line 84:         for (String privilegeName: user.getPrivilegeNames()) {
            :           privileges.add(privilegeName);
            :         }
Same as line 66.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java
File fe/src/main/java/org/apache/impala/catalog/Principal.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@63
PS2, Line 63:    */
            :   public boolean addPrivilege(PrincipalPrivilege privilege) {
            :     // URIs are case sensitive. Sentry treats everything in the URI 
> > if non-URI parts of the name are always lower case, then it
 > doesn't matter if we treat them as case insensitive or not
 > 
 > When calling getPrivilege(String privilegeName) for non URI, we
 > need to know the exact case that's stored in the cache. It's
 > simpler to special case the URI and make everything else to be case
 > insensitive.
 > 

This is still not clear to me. Keys for non-URIs are lowered twice: once in buildPrivilegeName() (or action and grant option can contain upper case latters?), and once again in CatalogObjectCache functions. If the goal is to find non-URI privileges in getPrivilege() even if there are upper/lower case differences, then this could be achieved by adding function to PrincipalPrivilege like string convertToCacheKey(string privilegeName) that check if the name is an URI, and call toLower() if it is not. I would prefer this because it would move the case handling issues to a single class (that contains buildPrivilegeName()).

+ see my comment for getPrivilege()

 > > if non-URI parts of the name can be upper case (including
 > 'server'), then this add() will also treat the 'server' part of
 > URIs in case sensitive way, which seems wrong
 > 
 > Yeah I agree, this is weird. This is actually a bug in Sentry.
 > Sentry treats everything in the URI as case sensitive including the
 > host name. We can fix it by always lower casing the host name
 > before sending it to Sentry. However if a privilege was granted
 > outside Impala that contains upper case host name, there's nothing
 > much that Impala can do. I'll add a comment in the code.

Ouch, thanks for the explanation.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@89
PS2, Line 89: 
Don't we need to handle URIs case sensitively in this function? principalPrivileges_.get(privilegeName) will call toLower() on privilegeName, so it will not match with URI keys if they contain upper case latter.



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 15:40:58 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2886/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 18:07:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.BuildPrivilegeNameBenchmark.fast":
  0.344 ±(99.9%) 0.004 us/op [Average]
  (min, avg, max) = (0.336, 0.344, 0.355), stdev = 0.005
  CI (99.9%): [0.339, 0.348] (assumes normal distribution)

Result "org.apache.impala.BuildPrivilegeNameBenchmark.slow":
  0.831 ±(99.9%) 0.011 us/op [Average]
  (min, avg, max) = (0.807, 0.831, 0.856), stdev = 0.015
  CI (99.9%): [0.820, 0.842] (assumes normal distribution)

Benchmark                         Mode  Cnt  Score   Error Units
BuildPrivilegeNameBenchmark.fast  avgt   25  0.344 ± 0.004 us/op
BuildPrivilegeNameBenchmark.slow  avgt   25  0.831 ± 0.011 us/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in a case insensitive way. This is true
for all privilege scopes, except URI. This patch fixes the issue by
making privilege name to be case sensitive instead.

This patch removes incorrect synchronization in
SentryAuthorizationPolicy.listPrivileges() that can cause the operation
to run in serial in a highly concurrent workload.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for privilege name case sensitivity bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
8 files changed, 141 insertions(+), 126 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/10
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 10
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2934/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 16:52:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 11: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 16:07:21 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 8:

Csaba, this is ready for review.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 20:28:19 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 9: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13095/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13095/9//COMMIT_MSG@45
PS9, Line 45: insensitive
nit: probably "case sensitivity" would be better



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 16:04:45 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4084/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 16:07:22 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.BuildPrivilegeNameBenchmark.fast":
  0.344 ±(99.9%) 0.004 us/op [Average]
  (min, avg, max) = (0.336, 0.344, 0.355), stdev = 0.005
  CI (99.9%): [0.339, 0.348] (assumes normal distribution)

Result "org.apache.impala.BuildPrivilegeNameBenchmark.slow":
  0.831 ±(99.9%) 0.011 us/op [Average]
  (min, avg, max) = (0.807, 0.831, 0.856), stdev = 0.015
  CI (99.9%): [0.820, 0.842] (assumes normal distribution)

Benchmark                         Mode  Cnt  Score   Error Units
BuildPrivilegeNameBenchmark.fast  avgt   25  0.344 ± 0.004 us/op
BuildPrivilegeNameBenchmark.slow  avgt   25  0.831 ± 0.011 us/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in case insensitive way. This is true for
all privilege scopes, except URI. This patch fixes the issue by storing
URI privilege in a case sensitive manner.

This patch removes the synchronized keyword in
SentryAuthorizationPolicy.listPrivileges() in order to not cause the
operation to run in serial in a highly concurrent workload. This has a
trade off of stale authorization metadata, which may be acceptable to
get a better performance.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for URI privilege bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
6 files changed, 153 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/4
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/13095/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13095/2//COMMIT_MSG@26
PS2, Line 26: 
            : Benchmark                         Mode  Cnt  Score   Error Uni
> The numbers seem weird, as 10?³ == 0.001, so to only difference seems to be
I used ms instead of us. I think for a number that's way too small, JMH doesn't show it correctly. I re-ran the benchmark with us instead.


http://gerrit.cloudera.org:8080/#/c/13095/2//COMMIT_MSG@36
PS2, Line 36: URI privilege in a case sensitive manner.
            : 
            : This patch removes the synchronized keyword in
            : SentryAuthorizationPolicy.listPrivileges() in order to not cause the
            : operation to run in seria
> Note that I do not have a solid understanding about integration with Sentry
As part of the refactoring AuthorizationPolicy to SentryAuthorizationPolicy, the concurrency model is no longer correct. In the past, it listPrivileges() used to be in AuthorizationPolicy. Here's what I mean by staleness.

List<Role> grantedRoles = authzPolicy_.getGrantedRoles(groupName); // yes, this operation is thread-safe
// by the time we reach here, another thread may be granting a new role into the group, hence we're now iterating on the stale granted roles
for (Role role: grantedRoles)

To avoid this issue, we can lock on authzPolicy_ in listPrivileges() but I also agree that this has a side effect of not allowing concurrent modifications.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@67
PS2, Line 67:           if (privilegeName == null) {
            :             if (LOG.isTraceEnabled()) {
            :               LOG.trace("Ignoring invalid privilege: " + privilegeName);
            :             }
            :             continue;
            :           }
> I think that we do not need to handle the null case, as the keys cannot be 
I missed this. Done.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@91
PS2, Line 91:           if (privilegeName == null) {
            :             if (LOG.isTraceEnabled()) {
            :               LOG.trace("Ignoring invalid privilege: " + privilegeName);
            :             }
            :             continue;
            :           }
> See comment at line 67.
Done


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
File fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java@67
PS2, Line 67:    * The case sensitivity of name will depend on the value of 'caseInsensitiveKeys'.
> nit: missing 'of'?
Done


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java
File fe/src/main/java/org/apache/impala/catalog/Principal.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@63
PS2, Line 63:     // URIs are case sensitive. Sentry treats everything in the URI including the host
            :     // name as case sensitive.
            :     return principalPrivileges_.add(privilege,
> if non-URI parts of the name are always lower case, then it doesn't matter if we treat them as case insensitive or not

When calling getPrivilege(String privilegeName) for non URI, we need to know the exact case that's stored in the cache. It's simpler to special case the URI and make everything else to be case insensitive.

> if non-URI parts of the name can be upper case (including 'server'), then this add() will also treat the 'server' part of URIs in case sensitive way, which seems wrong

Yeah I agree, this is weird. This is actually a bug in Sentry. Sentry treats everything in the URI as case sensitive including the host name. We can fix it by always lower casing the host name before sending it to Sentry. However if a privilege was granted outside Impala that contains upper case host name, there's nothing much that Impala can do. I'll add a comment in the code.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@97
PS2, Line 97:   public PrincipalPrivilege removePrivilege(String privilegeName) {
            :     // URIs are case sensitive. Sentry treats everything in the URI including the host
> See comment at line 63.
Same as above.


http://gerrit.cloudera.org:8080/#/c/13095/2/tests/authorization/test_grant_revoke.py
File tests/authorization/test_grant_revoke.py:

http://gerrit.cloudera.org:8080/#/c/13095/2/tests/authorization/test_grant_revoke.py@389
PS2, Line 389: test_uri_privilege_case_sensitive
> Is it possible to add two uri privileges that only differ in case sensitivi
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 16:40:23 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 7: Code-Review+1

(1 comment)

Carry Bharath's +1.

http://gerrit.cloudera.org:8080/#/c/13095/6/fe/src/main/java/org/apache/impala/catalog/Principal.java
File fe/src/main/java/org/apache/impala/catalog/Principal.java:

http://gerrit.cloudera.org:8080/#/c/13095/6/fe/src/main/java/org/apache/impala/catalog/Principal.java@82
PS6, Line 82:    */
> Actually thinking again, we are at a risk of exposing the underlying keySet
Done.



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 22:07:29 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 10: Code-Review+2

(1 comment)

Carry Csaba's +2

http://gerrit.cloudera.org:8080/#/c/13095/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13095/9//COMMIT_MSG@45
PS9, Line 45: sensitivity
> nit: probably "case sensitivity" would be better
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 10
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 16:06:58 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 10:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2935/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 10
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 17:00:50 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.BuildPrivilegeNameBenchmark.fast":
  0.344 ±(99.9%) 0.004 us/op [Average]
  (min, avg, max) = (0.336, 0.344, 0.355), stdev = 0.005
  CI (99.9%): [0.339, 0.348] (assumes normal distribution)

Result "org.apache.impala.BuildPrivilegeNameBenchmark.slow":
  0.831 ±(99.9%) 0.011 us/op [Average]
  (min, avg, max) = (0.807, 0.831, 0.856), stdev = 0.015
  CI (99.9%): [0.820, 0.842] (assumes normal distribution)

Benchmark                         Mode  Cnt  Score   Error Units
BuildPrivilegeNameBenchmark.fast  avgt   25  0.344 ± 0.004 us/op
BuildPrivilegeNameBenchmark.slow  avgt   25  0.831 ± 0.011 us/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in case insensitive way. This is true for
all privilege scopes, except URI. This patch fixes the issue by storing
URI privilege in a case sensitive manner.

This patch removes incorrect synchronization in
SentryAuthorizationPolicy.listPrivileges() that can cause the operation
to run in serial in a highly concurrent workload.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for URI privilege bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
6 files changed, 155 insertions(+), 123 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/6
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2899/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 22:51:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.PrivilegeNameBenchmark.fast":
  ≈ 10⁻³ ms/op

Result "org.apache.impala.PrivilegeNameBenchmark.slow":
  0.001 ±(99.9%) 0.001 ms/op [Average]
  (min, avg, max) = (0.001, 0.001, 0.001), stdev = 0.001
  CI (99.9%): [0.001, 0.001] (assumes normal distribution)

Benchmark                    Mode  Cnt   Score    Error  Units
PrivilegeNameBenchmark.fast  avgt   25  ≈ 10⁻³           ms/op
PrivilegeNameBenchmark.slow  avgt   25   0.001 ±  0.001  ms/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in case insensitive way. This is true for
all privilege scopes, except URI. This patch fixes the issue by storing
URI privilege in a case sensitive manner.

This patch removes the synchronized keyword in
SentryAuthorizationPolicy.listPrivileges() in order to not cause the
operation to run in serial in a highly concurrent workload. This has a
trade off of stale authorization metadata, which may be acceptable to
get a better performance.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for URI privilege bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
6 files changed, 121 insertions(+), 78 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/2
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2885/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 17:25:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.BuildPrivilegeNameBenchmark.fast":
  0.344 ±(99.9%) 0.004 us/op [Average]
  (min, avg, max) = (0.336, 0.344, 0.355), stdev = 0.005
  CI (99.9%): [0.339, 0.348] (assumes normal distribution)

Result "org.apache.impala.BuildPrivilegeNameBenchmark.slow":
  0.831 ±(99.9%) 0.011 us/op [Average]
  (min, avg, max) = (0.807, 0.831, 0.856), stdev = 0.015
  CI (99.9%): [0.820, 0.842] (assumes normal distribution)

Benchmark                         Mode  Cnt  Score   Error Units
BuildPrivilegeNameBenchmark.fast  avgt   25  0.344 ± 0.004 us/op
BuildPrivilegeNameBenchmark.slow  avgt   25  0.831 ± 0.011 us/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in a case insensitive way. This is true
for all privilege scopes, except URI. This patch fixes the issue by
making privilege name to be case sensitive instead.

This patch removes incorrect synchronization in
SentryAuthorizationPolicy.listPrivileges() that can cause the operation
to run in serial in a highly concurrent workload.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for privilege name case sensitivity bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Reviewed-on: http://gerrit.cloudera.org:8080/13095
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
8 files changed, 141 insertions(+), 126 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 12
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.BuildPrivilegeNameBenchmark.fast":
  0.344 ±(99.9%) 0.004 us/op [Average]
  (min, avg, max) = (0.336, 0.344, 0.355), stdev = 0.005
  CI (99.9%): [0.339, 0.348] (assumes normal distribution)

Result "org.apache.impala.BuildPrivilegeNameBenchmark.slow":
  0.831 ±(99.9%) 0.011 us/op [Average]
  (min, avg, max) = (0.807, 0.831, 0.856), stdev = 0.015
  CI (99.9%): [0.820, 0.842] (assumes normal distribution)

Benchmark                         Mode  Cnt  Score   Error Units
BuildPrivilegeNameBenchmark.fast  avgt   25  0.344 ± 0.004 us/op
BuildPrivilegeNameBenchmark.slow  avgt   25  0.831 ± 0.011 us/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in a case insensitive way. This is true
for all privilege scopes, except URI. This patch fixes the issue by
making privilege name to be case sensitive instead.

This patch removes incorrect synchronization in
SentryAuthorizationPolicy.listPrivileges() that can cause the operation
to run in serial in a highly concurrent workload.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for privilege name case insensitive bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
8 files changed, 141 insertions(+), 126 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/9
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2892/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 20:21:16 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.BuildPrivilegeNameBenchmark.fast":
  0.344 ±(99.9%) 0.004 us/op [Average]
  (min, avg, max) = (0.336, 0.344, 0.355), stdev = 0.005
  CI (99.9%): [0.339, 0.348] (assumes normal distribution)

Result "org.apache.impala.BuildPrivilegeNameBenchmark.slow":
  0.831 ±(99.9%) 0.011 us/op [Average]
  (min, avg, max) = (0.807, 0.831, 0.856), stdev = 0.015
  CI (99.9%): [0.820, 0.842] (assumes normal distribution)

Benchmark                         Mode  Cnt  Score   Error Units
BuildPrivilegeNameBenchmark.fast  avgt   25  0.344 ± 0.004 us/op
BuildPrivilegeNameBenchmark.slow  avgt   25  0.831 ± 0.011 us/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in case insensitive way. This is true for
all privilege scopes, except URI. This patch fixes the issue by storing
URI privilege in a case sensitive manner.

This patch removes incorrect synchronization in
SentryAuthorizationPolicy.listPrivileges() that can cause the operation
to run in serial in a highly concurrent workload.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for URI privilege bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
6 files changed, 157 insertions(+), 123 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/7
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2920/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 21:22:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/13095/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13095/2//COMMIT_MSG@26
PS2, Line 26: PrivilegeNameBenchmark.fast  avgt   25  ≈ 10⁻³           ms/op
            : PrivilegeNameBenchmark.slow  avgt   25   0.001 ±  0.001  ms/op
The numbers seem weird, as 10?³ == 0.001, so to only difference seems to be the error, while I expect the new implementation to be at least 2x faster.


http://gerrit.cloudera.org:8080/#/c/13095/2//COMMIT_MSG@36
PS2, Line 36: This patch removes the synchronized keyword in
            : SentryAuthorizationPolicy.listPrivileges() in order to not cause the
            : operation to run in serial in a highly concurrent workload. This has a
            : trade off of stale authorization metadata, which may be acceptable to
            : get a better performance.
Note that I do not have a solid understanding about integration with Sentry, so I may have basic misconceptions.

How can this make the metadata stale (er)? I didn't see any place where SentryAuthorizationPolicy was used as a lock with the exception of listPrivileges() functions, but these seem to be "read functions" without side effect, so I do not see the merit of locking. Locking on authzPolicy_ would be a different case, as AuthorizationPolicy is designed to be thread safe and locking on it will not allow concurrent modifications of the metadata.

I think that if the metadata can change during listPrivileges(), the results can be worse than staleness:

0. role R1 has no privilege P while role R2 has P, and group G has both R1 and R2 roles
1. Sentry: privilege P is granted for role R1
2. Sentry: privilege P is revoked for role R2
3. Impala: listPrivileges() starts for group G, R1's privileges are collected, and P is not included
4. the two changes on Sentry are synchronized with Impala in one refresh
5. Impala: listPrivileges() continues, R2's privileges are collected, and P is not included

So group G transiently doesn't have privilege P in Impala, while it always has it in Sentry. I do not know if it is a design goal to avoid such (transient) issues.

If we would like to see a snapshot of SentryAuthorizationPolicy during listPrivileges(), then we could move the logic of the functions to SentryAuthorizationPolicy, and use a read/write lock to ensure that concurrent readers do not block each other. A more sophisticated way would be to use version numbers to check if the relevant metadata changed during listPrivileges().


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@67
PS2, Line 67:           if (privilegeName == null) {
            :             if (LOG.isTraceEnabled()) {
            :               LOG.trace("Ignoring invalid privilege: " + privilegeName);
            :             }
            :             continue;
            :           }
I think that we do not need to handle the null case, as the keys cannot be null in the underlying map by design. PrincipalPrivilege.buildPrivilegeName() seems to handle it, but most call sites that used it expected non-null even before https://gerrit.cloudera.org/#/c/11509

If there are places where nullness should be handled, then those are the places where TPrivileges from external sources are added/removed, e.g. SentryProxy.refreshPrivilegesInCatalog().


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@91
PS2, Line 91:           if (privilegeName == null) {
            :             if (LOG.isTraceEnabled()) {
            :               LOG.trace("Ignoring invalid privilege: " + privilegeName);
            :             }
            :             continue;
            :           }
See comment at line 67.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
File fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java@67
PS2, Line 67:    * The case sensitivity name will depend on the value of 'caseInsensitiveKeys'.
nit: missing 'of'?


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java
File fe/src/main/java/org/apache/impala/catalog/Principal.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@63
PS2, Line 63:     // URIs are case sensitive.
            :     return principalPrivileges_.add(privilege,
            :         privilege.getPrivilege().getScope() != TPrivilegeScope.URI);
I would prefer to change the whole principalPrivileges_ to be case sensitive based on the following reasoning:
- if non-URI parts of the name are always lower case, then it doesn't matter if we treat them as case insensitive or not
- if non-URI parts of the name can be upper case (including 'server'), then this add() will also treat the 'server' part of URIs in case sensitive way, which seems wrong

Note that doing it case sensitive way is faster, as the toLower() calls are skipped.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@97
PS2, Line 97:     // URIs are case sensitive.
            :     return principalPrivileges_.remove(privilegeName, !privilegeName.contains("->uri="));
See comment at line 63.


http://gerrit.cloudera.org:8080/#/c/13095/2/tests/authorization/test_grant_revoke.py
File tests/authorization/test_grant_revoke.py:

http://gerrit.cloudera.org:8080/#/c/13095/2/tests/authorization/test_grant_revoke.py@389
PS2, Line 389: test_uri_privilege_case_sensitive
Is it possible to add two uri privileges that only differ in case sensitivity? If there are not tests for this case, then we could test it here.



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 10:29:02 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 11: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 21:11:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.BuildPrivilegeNameBenchmark.fast":
  0.344 ±(99.9%) 0.004 us/op [Average]
  (min, avg, max) = (0.336, 0.344, 0.355), stdev = 0.005
  CI (99.9%): [0.339, 0.348] (assumes normal distribution)

Result "org.apache.impala.BuildPrivilegeNameBenchmark.slow":
  0.831 ±(99.9%) 0.011 us/op [Average]
  (min, avg, max) = (0.807, 0.831, 0.856), stdev = 0.015
  CI (99.9%): [0.820, 0.842] (assumes normal distribution)

Benchmark                         Mode  Cnt  Score   Error Units
BuildPrivilegeNameBenchmark.fast  avgt   25  0.344 ± 0.004 us/op
BuildPrivilegeNameBenchmark.slow  avgt   25  0.831 ± 0.011 us/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in case insensitive way. This is true for
all privilege scopes, except URI. This patch fixes the issue by storing
URI privilege in a case sensitive manner.

This patch removes incorrect synchronization in
SentryAuthorizationPolicy.listPrivileges() that can cause the operation
to run in serial in a highly concurrent workload.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for URI privilege bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
8 files changed, 128 insertions(+), 126 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/8
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/2872/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 02:13:45 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 8:

(4 comments)

Thanks for the changes! Few more missing bits.

http://gerrit.cloudera.org:8080/#/c/13095/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13095/8//COMMIT_MSG@8
PS8, Line 8: 
The commit message could also reflect that there are also some fixes related to case sensitivity.


http://gerrit.cloudera.org:8080/#/c/13095/8//COMMIT_MSG@33
PS8, Line 33: While fixing this, I found a bug where Principal
            : stores the PrincipalPrivilege in case insensitive way. This is true for
            : all privilege scopes, except URI. This patch fixes the issue by storing
            : URI privilege in a case sensitive manner.
Is this still correct?


http://gerrit.cloudera.org:8080/#/c/13095/8/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
File fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java:

http://gerrit.cloudera.org:8080/#/c/13095/8/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java@158
PS8, Line 158: }
Is there some kind of whitespace change here?


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java
File fe/src/main/java/org/apache/impala/catalog/Principal.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@63
PS2, Line 63:    */
            :   public boolean addPrivilege(PrincipalPrivilege privilege) {
            :     return principalPrivileges_.add(privilege);
> I decided to change the whole thing to make the cache stores the privilege 
Can you add tests for the bugs that were fixed?



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 22:12:03 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/13095/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/13095/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@66
PS7, Line 66:         privileges.addAll(role.getPrivilegeNames());
            :       }
            :     }
> I think that this could be further simplified to privileges.addAll(role.get
Done


http://gerrit.cloudera.org:8080/#/c/13095/7/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@84
PS7, Line 84:     }
            :     return privileges;
            :   }
> Same as line 66.
Done


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java
File fe/src/main/java/org/apache/impala/catalog/Principal.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@63
PS2, Line 63:    */
            :   public boolean addPrivilege(PrincipalPrivilege privilege) {
            :     return principalPrivileges_.add(privilege);
> > > if non-URI parts of the name are always lower case, then it
I decided to change the whole thing to make the cache stores the privilege name in a case sensitive way as per your original suggestion and I do agree that this is actually a better solution. Interestingly I found various bugs while fixing this, which will be updated in the next patchset.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@89
PS2, Line 89: 
> Don't we need to handle URIs case sensitively in this function? principalPr
Done.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@97
PS2, Line 97:     return principalPrivileges_.remove(privilegeName);
            :   }
> Same as above.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 8
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Apr 2019 20:28:11 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 9:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/13095/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13095/8//COMMIT_MSG@8
PS8, Line 8: 
> The commit message could also reflect that there are also some fixes relate
Done


http://gerrit.cloudera.org:8080/#/c/13095/8//COMMIT_MSG@33
PS8, Line 33: While fixing this, I found a bug where Principal
            : stores the PrincipalPrivilege in a case insensitive way. This is true
            : for all privilege scopes, except URI. This patch fixes the issue by
            : making privilege name to be case sensitiv
> Is this still correct?
Comment updated. Done.


http://gerrit.cloudera.org:8080/#/c/13095/8/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
File fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java:

http://gerrit.cloudera.org:8080/#/c/13095/8/fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java@158
PS8, Line 158: }
> Is there some kind of whitespace change here?
I think this is a gerrit bug. I've seen this before. There should not be any change in this file.


http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java
File fe/src/main/java/org/apache/impala/catalog/Principal.java:

http://gerrit.cloudera.org:8080/#/c/13095/2/fe/src/main/java/org/apache/impala/catalog/Principal.java@63
PS2, Line 63:    */
            :   public boolean addPrivilege(PrincipalPrivilege privilege) {
            :     return principalPrivileges_.add(privilege);
> Can you add tests for the bugs that were fixed?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Fri, 26 Apr 2019 15:52:54 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Bharath Vissapragada (Code Review)" <ge...@cloudera.org>.
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................


Patch Set 2:

(2 comments)

Couple of minor comments, lgtm otherwise.

http://gerrit.cloudera.org:8080/#/c/13095/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13095/3//COMMIT_MSG@38
PS3, Line 38: operation to run in serial in a highly concurrent workload. This has a
            : trade off of stale authorization metadata, which may be acceptable to
            : get a better performance.
How did you test this? I think this codepath is sensitive in the sense that most prod users run with sentry turned on and any ConccurentModificationException in prod workloads might be a disaster.

Probably run a concurrency stress test of some sorts with sentry turned on?


http://gerrit.cloudera.org:8080/#/c/13095/3/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/13095/3/fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java@90
PS3, Line 90: getPrivilegeNames
Is there a way to avoid a copy to a HashSet()? Not sure what that is buying us.



-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Apr 2019 18:21:04 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8444: Fix performance regression when building privilege name

Posted by "Fredy Wijaya (Code Review)" <ge...@cloudera.org>.
Fredy Wijaya has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/13095 )

Change subject: IMPALA-8444: Fix performance regression when building privilege name
......................................................................

IMPALA-8444: Fix performance regression when building privilege name

This patch fixes the performance regression when building privilege name
by rewriting PrincipalPrivilege.buildPrivilegeName() with a simple
string concatentation instead of using a list that gets converted into a
string.

Below is the result of running a benchmark using JMH comparing the old
and new implementations:

Result "org.apache.impala.BuildPrivilegeNameBenchmark.fast":
  0.344 ±(99.9%) 0.004 us/op [Average]
  (min, avg, max) = (0.336, 0.344, 0.355), stdev = 0.005
  CI (99.9%): [0.339, 0.348] (assumes normal distribution)

Result "org.apache.impala.BuildPrivilegeNameBenchmark.slow":
  0.831 ±(99.9%) 0.011 us/op [Average]
  (min, avg, max) = (0.807, 0.831, 0.856), stdev = 0.015
  CI (99.9%): [0.820, 0.842] (assumes normal distribution)

Benchmark                         Mode  Cnt  Score   Error Units
BuildPrivilegeNameBenchmark.fast  avgt   25  0.344 ± 0.004 us/op
BuildPrivilegeNameBenchmark.slow  avgt   25  0.831 ± 0.011 us/op

This patch also updates SentryAuthorizationPolicy.listPrivileges() to
reuse the privilege names that have already been built instead of
building them again. While fixing this, I found a bug where Principal
stores the PrincipalPrivilege in case insensitive way. This is true for
all privilege scopes, except URI. This patch fixes the issue by storing
URI privilege in a case sensitive manner.

This patch removes the synchronized keyword in
SentryAuthorizationPolicy.listPrivileges() in order to not cause the
operation to run in serial in a highly concurrent workload. This has a
trade off of stale authorization metadata, which may be acceptable to
get a better performance.

Testing:
- Ran all FE tests
- Ran all E2E authorization tests
- Added E2E test for URI privilege bug

Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
---
M fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/Principal.java
M fe/src/main/java/org/apache/impala/catalog/PrincipalPrivilege.java
M tests/authorization/test_grant_revoke.py
6 files changed, 157 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/13095/3
-- 
To view, visit http://gerrit.cloudera.org:8080/13095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I942d9b55f07c8972f69e532567d9b7d80fceb6e5
Gerrit-Change-Number: 13095
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bh...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Fredy Wijaya <fw...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>