You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sentry.apache.org by Colin Ma <ju...@intel.com> on 2014/12/08 13:14:10 UTC
Review Request 28800: SENTRY-565:Improvement the performance when
Sentry filter the entity
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/
-----------------------------------------------------------
Review request for sentry.
Repository: sentry
Description
-------
Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
......
for (String tableName : queryResult) {
......
hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
outputHierarchy, providedPrivileges);
......
}
......
hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
Diffs
-----
sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 97ef3b8
sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingPreExecHook.java 813200a
sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java b4b69e1
sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/metastore/MetastoreAuthzBinding.java e7f96c1
sentry-binding/sentry-binding-hive/src/test/java/org/apache/sentry/binding/hive/TestHiveAuthzBindings.java d41f6cf
sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/AuthorizationProvider.java a88d2f8
sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/NoAuthorizationProvider.java a814527
sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 6449405
Diff: https://reviews.apache.org/r/28800/diff/
Testing
-------
Thanks,
Colin Ma
Re: Review Request 28800: SENTRY-565:Improvement the performance when
Sentry filter the entity
Posted by Lenni Kuff <ls...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review101398
-----------------------------------------------------------
sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java (line 668)
<https://reviews.apache.org/r/28800/#comment158808>
Can you add a brief comment on this method and explain the return value?
Maybe rename to getActivePrivilegesForUser()?
sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/AuthorizationProvider.java (line 67)
<https://reviews.apache.org/r/28800/#comment158810>
This is kind of a strange API to me. The caller of this class shouldn't really care about what the privileges are and why would they should rely on Sentry as the source of truth.
I don't know if we want to add this to our main AuthorizationProvider class.
One thing to also look at is the PrivilegeCache / SimpleCacheProviderBackend classes. They are not quite what you are looking for, but do allows you to inject privileges. In your case you could implement a version of that class that just calls getPrivilegesForUser()...
sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java (line 149)
<https://reviews.apache.org/r/28800/#comment158809>
I feel like the previous API was much more structured. You have just encoded this structure into strings, which is going to be harder to debug.
I also don't understand why you would call getPrivileges and pass set of privileges.
Think a bit more about the naming and see if you can find a way to avoid the convertion to strings .
- Lenni Kuff
On Dec. 8, 2014, 12:14 p.m., Colin Ma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
>
> (Updated Dec. 8, 2014, 12:14 p.m.)
>
>
> Review request for sentry.
>
>
> Repository: sentry
>
>
> Description
> -------
>
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
>
> ......
> for (String tableName : queryResult) {
> ......
> hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
> outputHierarchy, providedPrivileges);
> ......
> }
> ......
>
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
>
>
> Diffs
> -----
>
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 97ef3b8
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingPreExecHook.java 813200a
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java b4b69e1
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/metastore/MetastoreAuthzBinding.java e7f96c1
> sentry-binding/sentry-binding-hive/src/test/java/org/apache/sentry/binding/hive/TestHiveAuthzBindings.java d41f6cf
> sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/AuthorizationProvider.java a88d2f8
> sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/NoAuthorizationProvider.java a814527
> sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 6449405
>
> Diff: https://reviews.apache.org/r/28800/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Colin Ma
>
>
Re: Review Request 28800: SENTRY-565:Improvement the performance when
Sentry filter the entity
Posted by Colin Ma <ju...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/
-----------------------------------------------------------
(Updated Oct. 28, 2015, 7:42 a.m.)
Review request for sentry and Lenni Kuff.
Repository: sentry
Description
-------
Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
......
for (String tableName : queryResult) {
......
hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
outputHierarchy, providedPrivileges);
......
}
......
hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
Diffs (updated)
-----
sentry-provider/sentry-provider-cache/pom.xml c67f094
sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION
sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION
sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447
sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8
sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449
sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION
sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java fbb611e
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834
sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8
Diff: https://reviews.apache.org/r/28800/diff/
Testing
-------
Thanks,
Colin Ma
Re: Review Request 28800: SENTRY-565:Improvement the performance when
Sentry filter the entity
Posted by Colin Ma <ju...@intel.com>.
> On Oct. 21, 2015, 12:48 a.m., Sravya Tirukkovalur wrote:
> > Thanks for the update Colin! Will it be possible for you to add description on how you are solving the problem. It would be easy to follow along and also would help review the approach before delving into the code details.
>
> Colin Ma wrote:
> hi, Sravya, the following is the current idea for the cached privilege:
> 1. CachedPrivilegeManagement is responsible for getPrivileges, and it likes an agent for providerBackend. CachedPrivilegeManagement will be initialized in HiveAuthzBinding according to the configuration file.
> 2. All cached privileges are stored as HashMap in CachedPrivilegeManagement. The key for this map is username, and the value is wrapped in the CachedPrivilege including all privileges for this username and the timestamp when refresh the privilege.
> 3. When do the authorization, SimpleDBPolicyEngine.getPrivileges will be called, and CachedPrivilegeManagement.getPrivileges will be called. If CachedPrivilegeManagement is initialized and cached privilege enable in configuraion, CachedPrivilegeManagement.getPrivileges will get the privileges from cache first, if not exist or expire, get the privileges by backend and refresh the cache. The expired time can be configured and default is 5 mins.
>
> The core part for this problem is: HiveAuthzBinding, CachedPrivilege, CachedPrivilegeManagement, SimpleDBPolicyEngine, TestCachedPrivilege. The other code is just for the interface change.
> The disadvantage for this solution is if admin change the privilege for user, this will not be effect immediately. There will be delay according to the expire time setting in configuration file. I think this is accepted for the authorization management.
>
> Feel free for any comments, thanks for your review.
>
> Sravya Tirukkovalur wrote:
> Hi Colin,
>
> I think the main problem here is predicate push down. We should focus on what is the best way to be able to send the tables/ db names in the RPC when doing auth check instead of doing auth checks for each table/dbname. Also, we already have a cache provider, if we decide to improve performance of hive binding by moving to a cache provider, we should move everything to using a cache provider and also have a way to refresh this cache. Moving just one path to cache provider does not seem like a right approach as it can lead to behavior which is hard to reason.
Hi, Sravya,
Thanks for the comments, change the implementation to send table/db names in the RPC will be a lots of changes on the core code(eg, SentryStore), this can be a seperate task if necessary.
I think cache privilege is very useful for Sentry, and I'll implement this with the PrivilegeCache.
The main idea will be the same as my description above, maybe some changes on the data structure for cached privilege, using Map<group, cachedPrivilege> instead of Map<username, cachedPrivilege> to reduce the duplicated privileges.
- Colin
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review103333
-----------------------------------------------------------
On Oct. 16, 2015, 11:18 a.m., Colin Ma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
>
> (Updated Oct. 16, 2015, 11:18 a.m.)
>
>
> Review request for sentry and Lenni Kuff.
>
>
> Repository: sentry
>
>
> Description
> -------
>
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
>
> ......
> for (String tableName : queryResult) {
> ......
> hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
> outputHierarchy, providedPrivileges);
> ......
> }
> ......
>
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
>
>
> Diffs
> -----
>
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475
> sentry-policy/sentry-policy-common/src/main/java/org/apache/sentry/policy/common/PolicyEngine.java 38a5b65
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilege.java PRE-CREATION
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilegeManagement.java PRE-CREATION
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/SimpleDBPolicyEngine.java a03794e
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/AbstractTestSimplePolicyEngine.java d1151e3
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestPolicyParsingNegative.java 5f7c671
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestSimpleDBPolicyEngineDFS.java f8c36e2
> sentry-policy/sentry-policy-indexer/src/main/java/org/apache/sentry/policy/indexer/SimpleIndexerPolicyEngine.java 2f4bc1d
> sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/AbstractTestIndexerPolicyEngine.java d7d1ae2
> sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/TestIndexerPolicyNegative.java 0706560
> sentry-policy/sentry-policy-search/src/main/java/org/apache/sentry/policy/search/SimpleSearchPolicyEngine.java bc518fb
> sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/AbstractTestSearchPolicyEngine.java d1c415b
> sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/TestSearchPolicyNegative.java 2abe8f2
> sentry-policy/sentry-policy-sqoop/src/main/java/org/apache/sentry/policy/sqoop/SimpleSqoopPolicyEngine.java e8615a0
> sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/AbstractTestSqoopPolicyEngine.java 1389fca
> sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/TestSqoopPolicyNegative.java 406e53f
> sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7
> sentry-provider/sentry-provider-common/src/test/java/org/apache/sentry/provider/common/TestGetGroupMapping.java f57198a
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java bc35742
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestCachedPrivilege.java PRE-CREATION
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hive/AbstractTestWithStaticConfiguration.java cc5daef
>
> Diff: https://reviews.apache.org/r/28800/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Colin Ma
>
>
Re: Review Request 28800: SENTRY-565:Improvement the performance when
Sentry filter the entity
Posted by Colin Ma <ju...@intel.com>.
> On Oct. 21, 2015, 12:48 a.m., Sravya Tirukkovalur wrote:
> > Thanks for the update Colin! Will it be possible for you to add description on how you are solving the problem. It would be easy to follow along and also would help review the approach before delving into the code details.
hi, Sravya, the following is the current idea for the cached privilege:
1. CachedPrivilegeManagement is responsible for getPrivileges, and it likes an agent for providerBackend. CachedPrivilegeManagement will be initialized in HiveAuthzBinding according to the configuration file.
2. All cached privileges are stored as HashMap in CachedPrivilegeManagement. The key for this map is username, and the value is wrapped in the CachedPrivilege including all privileges for this username and the timestamp when refresh the privilege.
3. When do the authorization, SimpleDBPolicyEngine.getPrivileges will be called, and CachedPrivilegeManagement.getPrivileges will be called. If CachedPrivilegeManagement is initialized and cached privilege enable in configuraion, CachedPrivilegeManagement.getPrivileges will get the privileges from cache first, if not exist or expire, get the privileges by backend and refresh the cache. The expired time can be configured and default is 5 mins.
The core part for this problem is: HiveAuthzBinding, CachedPrivilege, CachedPrivilegeManagement, SimpleDBPolicyEngine, TestCachedPrivilege. The other code is just for the interface change.
The disadvantage for this solution is if admin change the privilege for user, this will not be effect immediately. There will be delay according to the expire time setting in configuration file. I think this is accepted for the authorization management.
Feel free for any comments, thanks for your review.
- Colin
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review103333
-----------------------------------------------------------
On Oct. 16, 2015, 11:18 a.m., Colin Ma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
>
> (Updated Oct. 16, 2015, 11:18 a.m.)
>
>
> Review request for sentry and Lenni Kuff.
>
>
> Repository: sentry
>
>
> Description
> -------
>
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
>
> ......
> for (String tableName : queryResult) {
> ......
> hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
> outputHierarchy, providedPrivileges);
> ......
> }
> ......
>
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
>
>
> Diffs
> -----
>
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475
> sentry-policy/sentry-policy-common/src/main/java/org/apache/sentry/policy/common/PolicyEngine.java 38a5b65
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilege.java PRE-CREATION
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilegeManagement.java PRE-CREATION
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/SimpleDBPolicyEngine.java a03794e
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/AbstractTestSimplePolicyEngine.java d1151e3
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestPolicyParsingNegative.java 5f7c671
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestSimpleDBPolicyEngineDFS.java f8c36e2
> sentry-policy/sentry-policy-indexer/src/main/java/org/apache/sentry/policy/indexer/SimpleIndexerPolicyEngine.java 2f4bc1d
> sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/AbstractTestIndexerPolicyEngine.java d7d1ae2
> sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/TestIndexerPolicyNegative.java 0706560
> sentry-policy/sentry-policy-search/src/main/java/org/apache/sentry/policy/search/SimpleSearchPolicyEngine.java bc518fb
> sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/AbstractTestSearchPolicyEngine.java d1c415b
> sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/TestSearchPolicyNegative.java 2abe8f2
> sentry-policy/sentry-policy-sqoop/src/main/java/org/apache/sentry/policy/sqoop/SimpleSqoopPolicyEngine.java e8615a0
> sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/AbstractTestSqoopPolicyEngine.java 1389fca
> sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/TestSqoopPolicyNegative.java 406e53f
> sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7
> sentry-provider/sentry-provider-common/src/test/java/org/apache/sentry/provider/common/TestGetGroupMapping.java f57198a
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java bc35742
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestCachedPrivilege.java PRE-CREATION
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hive/AbstractTestWithStaticConfiguration.java cc5daef
>
> Diff: https://reviews.apache.org/r/28800/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Colin Ma
>
>
Re: Review Request 28800: SENTRY-565:Improvement the performance when
Sentry filter the entity
Posted by Sravya Tirukkovalur <sr...@cloudera.com>.
> On Oct. 21, 2015, 12:48 a.m., Sravya Tirukkovalur wrote:
> > Thanks for the update Colin! Will it be possible for you to add description on how you are solving the problem. It would be easy to follow along and also would help review the approach before delving into the code details.
>
> Colin Ma wrote:
> hi, Sravya, the following is the current idea for the cached privilege:
> 1. CachedPrivilegeManagement is responsible for getPrivileges, and it likes an agent for providerBackend. CachedPrivilegeManagement will be initialized in HiveAuthzBinding according to the configuration file.
> 2. All cached privileges are stored as HashMap in CachedPrivilegeManagement. The key for this map is username, and the value is wrapped in the CachedPrivilege including all privileges for this username and the timestamp when refresh the privilege.
> 3. When do the authorization, SimpleDBPolicyEngine.getPrivileges will be called, and CachedPrivilegeManagement.getPrivileges will be called. If CachedPrivilegeManagement is initialized and cached privilege enable in configuraion, CachedPrivilegeManagement.getPrivileges will get the privileges from cache first, if not exist or expire, get the privileges by backend and refresh the cache. The expired time can be configured and default is 5 mins.
>
> The core part for this problem is: HiveAuthzBinding, CachedPrivilege, CachedPrivilegeManagement, SimpleDBPolicyEngine, TestCachedPrivilege. The other code is just for the interface change.
> The disadvantage for this solution is if admin change the privilege for user, this will not be effect immediately. There will be delay according to the expire time setting in configuration file. I think this is accepted for the authorization management.
>
> Feel free for any comments, thanks for your review.
Hi Colin,
I think the main problem here is predicate push down. We should focus on what is the best way to be able to send the tables/ db names in the RPC when doing auth check instead of doing auth checks for each table/dbname. Also, we already have a cache provider, if we decide to improve performance of hive binding by moving to a cache provider, we should move everything to using a cache provider and also have a way to refresh this cache. Moving just one path to cache provider does not seem like a right approach as it can lead to behavior which is hard to reason.
- Sravya
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review103333
-----------------------------------------------------------
On Oct. 16, 2015, 11:18 a.m., Colin Ma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
>
> (Updated Oct. 16, 2015, 11:18 a.m.)
>
>
> Review request for sentry and Lenni Kuff.
>
>
> Repository: sentry
>
>
> Description
> -------
>
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
>
> ......
> for (String tableName : queryResult) {
> ......
> hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
> outputHierarchy, providedPrivileges);
> ......
> }
> ......
>
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
>
>
> Diffs
> -----
>
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475
> sentry-policy/sentry-policy-common/src/main/java/org/apache/sentry/policy/common/PolicyEngine.java 38a5b65
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilege.java PRE-CREATION
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilegeManagement.java PRE-CREATION
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/SimpleDBPolicyEngine.java a03794e
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/AbstractTestSimplePolicyEngine.java d1151e3
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestPolicyParsingNegative.java 5f7c671
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestSimpleDBPolicyEngineDFS.java f8c36e2
> sentry-policy/sentry-policy-indexer/src/main/java/org/apache/sentry/policy/indexer/SimpleIndexerPolicyEngine.java 2f4bc1d
> sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/AbstractTestIndexerPolicyEngine.java d7d1ae2
> sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/TestIndexerPolicyNegative.java 0706560
> sentry-policy/sentry-policy-search/src/main/java/org/apache/sentry/policy/search/SimpleSearchPolicyEngine.java bc518fb
> sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/AbstractTestSearchPolicyEngine.java d1c415b
> sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/TestSearchPolicyNegative.java 2abe8f2
> sentry-policy/sentry-policy-sqoop/src/main/java/org/apache/sentry/policy/sqoop/SimpleSqoopPolicyEngine.java e8615a0
> sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/AbstractTestSqoopPolicyEngine.java 1389fca
> sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/TestSqoopPolicyNegative.java 406e53f
> sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7
> sentry-provider/sentry-provider-common/src/test/java/org/apache/sentry/provider/common/TestGetGroupMapping.java f57198a
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java bc35742
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestCachedPrivilege.java PRE-CREATION
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hive/AbstractTestWithStaticConfiguration.java cc5daef
>
> Diff: https://reviews.apache.org/r/28800/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Colin Ma
>
>
Re: Review Request 28800: SENTRY-565:Improvement the performance when
Sentry filter the entity
Posted by Sravya Tirukkovalur <sr...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review103333
-----------------------------------------------------------
Thanks for the update Colin! Will it be possible for you to add description on how you are solving the problem. It would be easy to follow along and also would help review the approach before delving into the code details.
- Sravya Tirukkovalur
On Oct. 16, 2015, 11:18 a.m., Colin Ma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
>
> (Updated Oct. 16, 2015, 11:18 a.m.)
>
>
> Review request for sentry and Lenni Kuff.
>
>
> Repository: sentry
>
>
> Description
> -------
>
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
>
> ......
> for (String tableName : queryResult) {
> ......
> hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
> outputHierarchy, providedPrivileges);
> ......
> }
> ......
>
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
>
>
> Diffs
> -----
>
> sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475
> sentry-policy/sentry-policy-common/src/main/java/org/apache/sentry/policy/common/PolicyEngine.java 38a5b65
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilege.java PRE-CREATION
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilegeManagement.java PRE-CREATION
> sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/SimpleDBPolicyEngine.java a03794e
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/AbstractTestSimplePolicyEngine.java d1151e3
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestPolicyParsingNegative.java 5f7c671
> sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestSimpleDBPolicyEngineDFS.java f8c36e2
> sentry-policy/sentry-policy-indexer/src/main/java/org/apache/sentry/policy/indexer/SimpleIndexerPolicyEngine.java 2f4bc1d
> sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/AbstractTestIndexerPolicyEngine.java d7d1ae2
> sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/TestIndexerPolicyNegative.java 0706560
> sentry-policy/sentry-policy-search/src/main/java/org/apache/sentry/policy/search/SimpleSearchPolicyEngine.java bc518fb
> sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/AbstractTestSearchPolicyEngine.java d1c415b
> sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/TestSearchPolicyNegative.java 2abe8f2
> sentry-policy/sentry-policy-sqoop/src/main/java/org/apache/sentry/policy/sqoop/SimpleSqoopPolicyEngine.java e8615a0
> sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/AbstractTestSqoopPolicyEngine.java 1389fca
> sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/TestSqoopPolicyNegative.java 406e53f
> sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7
> sentry-provider/sentry-provider-common/src/test/java/org/apache/sentry/provider/common/TestGetGroupMapping.java f57198a
> sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java bc35742
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestCachedPrivilege.java PRE-CREATION
> sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hive/AbstractTestWithStaticConfiguration.java cc5daef
>
> Diff: https://reviews.apache.org/r/28800/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Colin Ma
>
>
Re: Review Request 28800: SENTRY-565:Improvement the performance when
Sentry filter the entity
Posted by Colin Ma <ju...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/
-----------------------------------------------------------
(Updated Oct. 16, 2015, 11:18 a.m.)
Review request for sentry and Lenni Kuff.
Repository: sentry
Description
-------
Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
......
for (String tableName : queryResult) {
......
hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
outputHierarchy, providedPrivileges);
......
}
......
hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
Diffs (updated)
-----
sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475
sentry-policy/sentry-policy-common/src/main/java/org/apache/sentry/policy/common/PolicyEngine.java 38a5b65
sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilege.java PRE-CREATION
sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/CachedPrivilegeManagement.java PRE-CREATION
sentry-policy/sentry-policy-db/src/main/java/org/apache/sentry/policy/db/SimpleDBPolicyEngine.java a03794e
sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/AbstractTestSimplePolicyEngine.java d1151e3
sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestPolicyParsingNegative.java 5f7c671
sentry-policy/sentry-policy-db/src/test/java/org/apache/sentry/policy/db/TestSimpleDBPolicyEngineDFS.java f8c36e2
sentry-policy/sentry-policy-indexer/src/main/java/org/apache/sentry/policy/indexer/SimpleIndexerPolicyEngine.java 2f4bc1d
sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/AbstractTestIndexerPolicyEngine.java d7d1ae2
sentry-policy/sentry-policy-indexer/src/test/java/org/apache/sentry/policy/indexer/TestIndexerPolicyNegative.java 0706560
sentry-policy/sentry-policy-search/src/main/java/org/apache/sentry/policy/search/SimpleSearchPolicyEngine.java bc518fb
sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/AbstractTestSearchPolicyEngine.java d1c415b
sentry-policy/sentry-policy-search/src/test/java/org/apache/sentry/policy/search/TestSearchPolicyNegative.java 2abe8f2
sentry-policy/sentry-policy-sqoop/src/main/java/org/apache/sentry/policy/sqoop/SimpleSqoopPolicyEngine.java e8615a0
sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/AbstractTestSqoopPolicyEngine.java 1389fca
sentry-policy/sentry-policy-sqoop/src/test/java/org/apache/sentry/policy/sqoop/TestSqoopPolicyNegative.java 406e53f
sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7
sentry-provider/sentry-provider-common/src/test/java/org/apache/sentry/provider/common/TestGetGroupMapping.java f57198a
sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/ServiceConstants.java bc35742
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestCachedPrivilege.java PRE-CREATION
sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hive/AbstractTestWithStaticConfiguration.java cc5daef
Diff: https://reviews.apache.org/r/28800/diff/
Testing
-------
Thanks,
Colin Ma