You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sentry.apache.org by Colin Ma <ju...@intel.com> on 2015/11/13 05:10:14 UTC

Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/
-----------------------------------------------------------

(Updated Nov. 13, 2015, 4:10 a.m.)


Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.


Repository: sentry


Description
-------

Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:

......
for (String tableName : queryResult) {
  ......
  hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
            outputHierarchy, providedPrivileges);
  ......
}
......

hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.


Diffs (updated)
-----

  sentry-provider/sentry-provider-cache/pom.xml c67f094 
  sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
  sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
  sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
  sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
  sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
  sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
  sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
  sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
  sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 

Diff: https://reviews.apache.org/r/28800/diff/


Testing
-------


Thanks,

Colin Ma


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Colin Ma <ju...@intel.com>.

> On Nov. 16, 2015, 9:35 p.m., Li Li wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java, line 57
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124779#file1124779line57>
> >
> >     I am not sure if it is the best desige here to refresh cache when quering the cache. Because we may get outdated privilege info, and also it will affect the performance if refreshcache takes too much time.
> >     I think maybe we'd better:
> >     1. use a background thread to refresh the data
> >     2. refresh the cache whenever data is updated

Thanks for the comments, I need re-consider the implementation of this feature, performance, accuracy of data, cache retire, everything is important.


- Colin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106735
-----------------------------------------------------------


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Li Li <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106735
-----------------------------------------------------------



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java (line 57)
<https://reviews.apache.org/r/28800/#comment165476>

    I am not sure if it is the best desige here to refresh cache when quering the cache. Because we may get outdated privilege info, and also it will affect the performance if refreshcache takes too much time.
    I think maybe we'd better:
    1. use a background thread to refresh the data
    2. refresh the cache whenever data is updated



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java (line 91)
<https://reviews.apache.org/r/28800/#comment165468>

    I do not think it is good to use expire time to decide refresh the cache or not. Because for sentry, having the consistent privilege info is the most important thing. Sometimes we need sacrifice the speed in order to guarantee the accuracy of data.


- Li Li


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Colin Ma <ju...@intel.com>.

> On Nov. 21, 2015, 9:57 a.m., Lenni Kuff wrote:
> > Thanks for putting in the effort for this patch! I'm a little confused about what this is doing and how it is solving the problem. Is the problem that when you do a SHOW TABLES that we make a RPC to Sentry to check the same privielges for each one of the tables? My comments below assume that's the case:
> > 
> > 1) Why are we adding any new thrift interfaces here? It seems like this should be contained within the HMS plugin?
> > 2) Do we need to have a cache that expires? That seems to add complexity here.
> > 3) What does it even mean to have an RPC for a cached privilege set?
> > 
> > I would expect a solution that looked like (using psedo code):
> > 
> > 
> > List<Privilege> getPrivilegesForUse(String username);
> > 
> > 
> > 
> > List<String> filterTables() {
> > 
> >     
> >     List<Privilege> userPrivileges = getPrivilegesForUser(username);
> >     
> >     // Note, we create a new cache each time filterTables() is called. 
> >     PrivilegeCache cache = new PrivilegeCache(userPrivileges);
> >     AuthProvider provider = new AuthProvider(cache);
> >     
> >     for table in getAllTables() {
> >        if (privider.hasAccess(table) ) {
> >           // include this table;
> >        }
> >     }
> > }
> > 
> > It appears that your solution is trying to maintain a global cache, which I don't think is necessary. I also don't think that any thrift changes should be required to make this work. 
> > 
> > Does this make sense? Am I missing something?

Thanks for the suggestion, I think your solution make sense for the metadata filter problems. And I'll update another patch with this solution.


- Colin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review107488
-----------------------------------------------------------


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Lenni Kuff <ls...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review107488
-----------------------------------------------------------


Thanks for putting in the effort for this patch! I'm a little confused about what this is doing and how it is solving the problem. Is the problem that when you do a SHOW TABLES that we make a RPC to Sentry to check the same privielges for each one of the tables? My comments below assume that's the case:

1) Why are we adding any new thrift interfaces here? It seems like this should be contained within the HMS plugin?
2) Do we need to have a cache that expires? That seems to add complexity here.
3) What does it even mean to have an RPC for a cached privilege set?

I would expect a solution that looked like (using psedo code):


List<Privilege> getPrivilegesForUse(String username);



List<String> filterTables() {

    
    List<Privilege> userPrivileges = getPrivilegesForUser(username);
    
    // Note, we create a new cache each time filterTables() is called. 
    PrivilegeCache cache = new PrivilegeCache(userPrivileges);
    AuthProvider provider = new AuthProvider(cache);
    
    for table in getAllTables() {
       if (privider.hasAccess(table) ) {
          // include this table;
       }
    }
}

It appears that your solution is trying to maintain a global cache, which I don't think is necessary. I also don't think that any thrift changes should be required to make this work. 

Does this make sense? Am I missing something?

- Lenni Kuff


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Colin Ma <ju...@intel.com>.

> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java, line 34
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124779#file1124779line34>
> >
> >     maybe better to add some class comments for the newly added classes?

I'll update it.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java, line 71
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124779#file1124779line71>
> >
> >     maybe it is better to specify the exception type?

The try clause throw the Exception, and don't specify the type, so currently catch Exception.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java, line 36
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124780#file1124780line36>
> >
> >     make it final?

I'll update it.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java, line 35
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124780#file1124780line35>
> >
> >     make it final?
> >     make it public so that TestCacheProvider can use it in line 39?

I'll update it.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java, line 38
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124780#file1124780line38>
> >
> >     seems we can also remove this constructor method, since resourcePath is not used at all?

I think this constructor here is for the compatibility, if the backend of the CacheProvider get the privileges from file, the resourcePath should be path of that file.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java, line 42
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124780#file1124780line42>
> >
> >     Maybe it is better to specify the exception type, eg. ClassNotFoundException and etc.

I'll update it.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java, line 51
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124780#file1124780line51>
> >
> >     should we update the method comment since currently it is only initailized once, or just deleted this method?
> >     Besides, it is better to add an empty line between two methods.

The method inherit from the super class, and can't be delete, I'll add comments for it.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java, line 121
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124784#file1124784line121>
> >
> >     it is better to remove the trailing space here and line 123, 124, 126, 521, 529, 547, 556, and also in SentryPolicyService.java and TListSentryPrivilegesForCachedResponse.java.

Thanks for catch that, I'll update it.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java, line 2
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124785#file1124785line2>
> >
> >     since this file and the other java files under thrift dir are autogenrated by Thrift compiler, should we add some comments for those manual changes?

I think we don't need to add comments here, we can add the comments in the thrift defined file if necessary.
We also can get the thrift version by the autogenerated comments.


> On Nov. 13, 2015, 6:15 a.m., Li Li wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java, line 1266
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124786#file1124786line1266>
> >
> >     I guess you intent to add a comment for this method?

Thanks, I'll update it.


- Colin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106366
-----------------------------------------------------------


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Li Li <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106366
-----------------------------------------------------------



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java (line 34)
<https://reviews.apache.org/r/28800/#comment165111>

    maybe better to add some class comments for the newly added classes?



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java (line 71)
<https://reviews.apache.org/r/28800/#comment165103>

    maybe it is better to specify the exception type?



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java (line 35)
<https://reviews.apache.org/r/28800/#comment165104>

    make it final?
    make it public so that TestCacheProvider can use it in line 39?



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java (line 36)
<https://reviews.apache.org/r/28800/#comment165105>

    make it final?



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java (line 38)
<https://reviews.apache.org/r/28800/#comment165101>

    seems we can also remove this constructor method, since resourcePath is not used at all?



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java (line 42)
<https://reviews.apache.org/r/28800/#comment165102>

    Maybe it is better to specify the exception type, eg. ClassNotFoundException and etc.



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java (line 51)
<https://reviews.apache.org/r/28800/#comment165099>

    should we update the method comment since currently it is only initailized once, or just deleted this method?
    Besides, it is better to add an empty line between two methods.



sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java (line 121)
<https://reviews.apache.org/r/28800/#comment165100>

    it is better to remove the trailing space here and line 123, 124, 126, 521, 529, 547, 556, and also in SentryPolicyService.java and TListSentryPrivilegesForCachedResponse.java.



sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java (line 2)
<https://reviews.apache.org/r/28800/#comment165109>

    since this file and the other java files under thrift dir are autogenrated by Thrift compiler, should we add some comments for those manual changes?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java (line 1266)
<https://reviews.apache.org/r/28800/#comment165110>

    I guess you intent to add a comment for this method?


- Li Li


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Colin Ma <ju...@intel.com>.

> On Nov. 13, 2015, 8:34 a.m., Dapeng Sun wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java, line 36
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124780#file1124780line36>
> >
> >     DBPrivilegeCache.class.getName() is better?

I'll update it.


> On Nov. 13, 2015, 8:34 a.m., Dapeng Sun wrote:
> > sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java, line 44
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124782#file1124782line44>
> >
> >     I think **PrivilegeCacheTestImpl.class.getName()** is better. If others want to change the class name, there will be a error at compiling.

I'll update it.


- Colin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106372
-----------------------------------------------------------


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Dapeng Sun <da...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106372
-----------------------------------------------------------



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java (line 36)
<https://reviews.apache.org/r/28800/#comment165127>

    DBPrivilegeCache.class.getName() is better?



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java (line 38)
<https://reviews.apache.org/r/28800/#comment165120>

    I remembered some methods using reflection to create ProviderBackend instance, they will use the constructor.



sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java (line 40)
<https://reviews.apache.org/r/28800/#comment165121>

    I think **PrivilegeCacheTestImpl.class.getName()** is better. If others want to change the class name, there will be a error at compiling.


- Dapeng Sun


On 十一月 13, 2015, 12:10 p.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated 十一月 13, 2015, 12:10 p.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Li Li <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review107069
-----------------------------------------------------------



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java (line 40)
<https://reviews.apache.org/r/28800/#comment165979>

    how about adding the time unit in the variable name, eg. DEFAULT_SENTRY_PRIVILEGE_CACHE_EXPIRE_SEC



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java (line 48)
<https://reviews.apache.org/r/28800/#comment165980>

    I think the Concurrency issue that Anne asked is that HashMap is not thread safe.



sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java (line 2)
<https://reviews.apache.org/r/28800/#comment165996>

    I am a little confused. So we have not moified the auto generated file actually? If it is the case, I think we do not need add these auto gen files into review, but the original thrift file.


- Li Li


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Lenni Kuff <ls...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review107927
-----------------------------------------------------------

Ship it!


- Lenni Kuff


On Nov. 24, 2015, 5:59 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 24, 2015, 5:59 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-hive/pom.xml 6d57a58 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 18b8a8f 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java 3919de7 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/UserHiveMetadataPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/AuthorizationProvider.java a88d2f8 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/NoAuthorizationProvider.java a814527 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7 
>   sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestDbConnections.java ae790f0 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Lenni Kuff <ls...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review107730
-----------------------------------------------------------

Ship it!


Just a few comments. Overall looks great.

- Lenni Kuff


On Nov. 24, 2015, 5:59 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 24, 2015, 5:59 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-hive/pom.xml 6d57a58 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 18b8a8f 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java 3919de7 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/UserHiveMetadataPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/AuthorizationProvider.java a88d2f8 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/NoAuthorizationProvider.java a814527 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7 
>   sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestDbConnections.java ae790f0 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Lenni Kuff <ls...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review107728
-----------------------------------------------------------



sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java (line 946)
<https://reviews.apache.org/r/28800/#comment167004>

    typo - "privilege"



sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java (line 244)
<https://reviews.apache.org/r/28800/#comment167003>

    Can you just create a new instance of SimpleCacheProviderBackend directly?



sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java (line 62)
<https://reviews.apache.org/r/28800/#comment167002>

    Do we need to expose this config value? Seems like we can just instantiate the class directly since we always expect it to be the same.



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/UserHiveMetadataPrivilegeCache.java (line 28)
<https://reviews.apache.org/r/28800/#comment167001>

    The name makes me think this is caching Hive metadata. Consider renaming to something like "SimplePrivilegeCache"



sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/NoAuthorizationProvider.java (line 74)
<https://reviews.apache.org/r/28800/#comment166998>

    is null okay to return here? Comment on why you return null.


Looks much better now. Just to confirm - you are still getting the same performance improvements after making this change?

- Lenni Kuff


On Nov. 24, 2015, 5:59 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 24, 2015, 5:59 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-hive/pom.xml 6d57a58 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 18b8a8f 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475 
>   sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java 3919de7 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/UserHiveMetadataPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/AuthorizationProvider.java a88d2f8 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/NoAuthorizationProvider.java a814527 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7 
>   sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestDbConnections.java ae790f0 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Colin Ma <ju...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/
-----------------------------------------------------------

(Updated Nov. 24, 2015, 5:59 a.m.)


Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.


Repository: sentry


Description
-------

Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:

......
for (String tableName : queryResult) {
  ......
  hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
            outputHierarchy, providedPrivileges);
  ......
}
......

hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.


Diffs (updated)
-----

  sentry-binding/sentry-binding-hive/pom.xml 6d57a58 
  sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 18b8a8f 
  sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/authz/HiveAuthzBinding.java 3071475 
  sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java 3919de7 
  sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/UserHiveMetadataPrivilegeCache.java PRE-CREATION 
  sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/AuthorizationProvider.java a88d2f8 
  sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/NoAuthorizationProvider.java a814527 
  sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/ResourceAuthorizationProvider.java 06573b7 
  sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/dbprovider/TestDbConnections.java ae790f0 

Diff: https://reviews.apache.org/r/28800/diff/


Testing
-------


Thanks,

Colin Ma


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Colin Ma <ju...@intel.com>.

> On Nov. 16, 2015, 7:50 p.m., Anne Yu wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java, line 18
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124779#file1124779line18>
> >
> >     Since it's about cache and will impact performance, I have some general design questions (or it would be better you can put into a wiki or just javadoc this class):
> >     
> >     1. How to retire cached items? The general strateg. 
> >     If the load is very heavy, Expire time is not specified at a reasonable value, there would be too many cached items coming in in a very short time period so that the hash object will keep increasing fast and dramatically. 
> >     
> >     Do we have a machemism to retire oldest items when the object size is large enough? Or limit cache size.
> >     
> >     We've seen server has an issue which full GC keeps increasing and young GC keeps propogating data into full GC. This impact performance much.
> >     
> >     2. Concurrency.
> >     If there are multiple clients trying to make change about the same group, do we consider concurrency here?
> 
> Anne Yu wrote:
>     Client will be responsible for synchronize the cache; think of multiple concurrent workflows from different organizations. Client might not even realize to synchornize the data;
>     
>     If there is size limit, we count on machine has a lot of memory: It will be difficult to predict how big the cache will become along the time; So for user it's difficult to configure jvm heap size at the beginning; if user has to re-tune, he will have to restart service.

Right, current implementation will cause this problem. For this feature, I should attach a design, and it'll be easy for review and discuss.


- Colin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106716
-----------------------------------------------------------


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Anne Yu <an...@cloudera.com>.

> On Nov. 16, 2015, 7:50 p.m., Anne Yu wrote:
> > sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java, line 18
> > <https://reviews.apache.org/r/28800/diff/4/?file=1124779#file1124779line18>
> >
> >     Since it's about cache and will impact performance, I have some general design questions (or it would be better you can put into a wiki or just javadoc this class):
> >     
> >     1. How to retire cached items? The general strateg. 
> >     If the load is very heavy, Expire time is not specified at a reasonable value, there would be too many cached items coming in in a very short time period so that the hash object will keep increasing fast and dramatically. 
> >     
> >     Do we have a machemism to retire oldest items when the object size is large enough? Or limit cache size.
> >     
> >     We've seen server has an issue which full GC keeps increasing and young GC keeps propogating data into full GC. This impact performance much.
> >     
> >     2. Concurrency.
> >     If there are multiple clients trying to make change about the same group, do we consider concurrency here?

Client will be responsible for synchronize the cache; think of multiple concurrent workflows from different organizations. Client might not even realize to synchornize the data;

If there is size limit, we count on machine has a lot of memory: It will be difficult to predict how big the cache will become along the time; So for user it's difficult to configure jvm heap size at the beginning; if user has to re-tune, he will have to restart service.


- Anne


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106716
-----------------------------------------------------------


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>


Re: Review Request 28800: SENTRY-565:Improvement the performance when Sentry filter the entity

Posted by Anne Yu <an...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28800/#review106716
-----------------------------------------------------------



sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java (line 18)
<https://reviews.apache.org/r/28800/#comment165452>

    Since it's about cache and will impact performance, I have some general design questions (or it would be better you can put into a wiki or just javadoc this class):
    
    1. How to retire cached items? The general strateg. 
    If the load is very heavy, Expire time is not specified at a reasonable value, there would be too many cached items coming in in a very short time period so that the hash object will keep increasing fast and dramatically. 
    
    Do we have a machemism to retire oldest items when the object size is large enough? Or limit cache size.
    
    We've seen server has an issue which full GC keeps increasing and young GC keeps propogating data into full GC. This impact performance much.
    
    2. Concurrency.
    If there are multiple clients trying to make change about the same group, do we consider concurrency here?



sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java (line 12345)
<https://reviews.apache.org/r/28800/#comment165455>

    Minor issues: please take care of white spaces.


- Anne Yu


On Nov. 13, 2015, 4:10 a.m., Colin Ma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28800/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2015, 4:10 a.m.)
> 
> 
> Review request for sentry, Lenni Kuff and Sravya Tirukkovalur.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Currently, when get the metadata from hive, eg, "show tables", "show databases". Sentry will filter the result and output the authorized entities. There will be many RPC calls when filtering the result. The related code is in HiveAuthzBinding, for example, in filterShowTables:
> 
> ......
> for (String tableName : queryResult) {
>   ......
>   hiveAuthzBinding.authorize(operation, tableMetaDataPrivilege, subject, inputHierarchy,
>             outputHierarchy, providedPrivileges);
>   ......
> }
> ......
> 
> hiveAuthzBinding.authorize will get the privileges from sentry service, if there are many tables in the hive, the filtering process will spend much time. Considering sentry also need to filter the column, HiveAuthzBinding should be improved to reduce the number of rpc calls when doing the filter.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-cache/pom.xml c67f094 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/CachedPrivilegeWrap.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/DBPrivilegeCache.java PRE-CREATION 
>   sentry-provider/sentry-provider-cache/src/main/java/org/apache/sentry/provider/cache/SimpleCacheProviderBackend.java 4b98447 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/PrivilegeCacheTestImpl.java a7566e7 
>   sentry-provider/sentry-provider-cache/src/test/java/org/apache/sentry/provider/cache/TestCacheProvider.java e5b29b8 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/SentryPolicyService.java 0c24449 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedRequest.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/gen/thrift/gen-javabean/org/apache/sentry/provider/db/service/thrift/TListSentryPrivilegesForCachedResponse.java PRE-CREATION 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/persistent/SentryStore.java 8c9401c 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClient.java cbc0aaf 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyServiceClientDefaultImpl.java 74f379a 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/provider/db/service/thrift/SentryPolicyStoreProcessor.java 4f8c834 
>   sentry-provider/sentry-provider-db/src/main/resources/sentry_policy_service.thrift 40889e8 
> 
> Diff: https://reviews.apache.org/r/28800/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Colin Ma
> 
>