Posted to dev@sentry.apache.org by Dapeng Sun <da...@intel.com> on 2016/01/15 10:21:40 UTC

Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42344/
-----------------------------------------------------------

Review request for sentry.


Bugs: SENTRY-1007
    https://issues.apache.org/jira/browse/SENTRY-1007


Repository: sentry


Description
-------

Because the current solution performs a separate authorization check for every entity, a SQL statement such as **select col1,col2,col3,.....,colN from test_tb1** triggers one check for each queried column.

This patch reuses the CachedHiveBinding from SENTRY-565.
If the number of entities exceeds maxQueryNumber, it fetches all of the user's privileges to the local side and uses that local copy for authorization.
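
The threshold rule described above could be sketched roughly as follows. This is a minimal illustration, not the code in the patch: `BindingChooser`, `BindingKind`, and `choose` are hypothetical names, and only `maxQueryNumber` comes from the description.

```java
import java.util.List;

// Hypothetical sketch; the names here are illustrative and not taken from the patch.
class BindingChooser {
    enum BindingKind { PER_ENTITY, CACHED }

    // Use the cached binding only when the query touches more entities
    // than the configured threshold (maxQueryNumber in the description).
    static BindingKind choose(List<String> entities, int maxQueryNumber) {
        return entities.size() > maxQueryNumber
                ? BindingKind.CACHED
                : BindingKind.PER_ENTITY;
    }
}
```

With a threshold of 2, a query over three columns would take the cached path, while a query over one or two columns would keep the per-entity checks.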


Diffs
-----

  sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 57e4689 
  sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java e76fad1 

Diff: https://reviews.apache.org/r/42344/diff/


Testing
-------


Thanks,

Dapeng Sun


Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Dapeng Sun <da...@intel.com>.

(Updated Jan. 21, 2016, 3:54 p.m.)


Review request for sentry.


Changes
-------

Updated the patch according to Lenni's and Colin's feedback.


Bugs: SENTRY-1007
    https://issues.apache.org/jira/browse/SENTRY-1007


Repository: sentry


Description (updated)
-------

Because the current architecture performs one authorization check per entity, a SQL statement such as **select col1,col2,col3,.....,colN from test_tb1** triggers a check for each queried column.

This patch reuses the CachedHiveBinding from SENTRY-565: it fetches all of the user's privileges to the local side and uses that local copy for authorization.
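
To illustrate why a local copy helps for wide tables, here is a minimal sketch. It assumes privileges can be represented as flat strings, which is a simplification, since the thread refers to an authorizable hierarchy. `LocalPrivilegeCache`, `remoteFetch`, and `authorizeAll` are hypothetical names and are not the CachedHiveBinding API.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch of the idea only; this is not Sentry's CachedHiveBinding.
class LocalPrivilegeCache {
    // Stands in for a round trip to the privilege store (assumption).
    private final Supplier<Set<String>> remoteFetch;
    private Set<String> privileges; // filled lazily on first use
    private int remoteCalls = 0;

    LocalPrivilegeCache(Supplier<Set<String>> remoteFetch) {
        this.remoteFetch = remoteFetch;
    }

    // Check every column against the local copy; only the first call fetches.
    boolean authorizeAll(List<String> columns) {
        if (privileges == null) {
            privileges = new HashSet<>(remoteFetch.get());
            remoteCalls++;
        }
        return privileges.containsAll(columns);
    }

    int remoteCalls() {
        return remoteCalls;
    }
}
```

However many columns a query names, the sketch makes a single fetch and then answers each check from memory. The price is that the one fetch pulls every privilege the user holds.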


Diffs (updated)
-----

  sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 57e4689 

Diff: https://reviews.apache.org/r/42344/diff/


Testing
-------


Thanks,

Dapeng Sun


Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Dapeng Sun <da...@intel.com>.

> On Jan. 16, 2016, 4:15 p.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java, line 605
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198197#file1198197line605>
> >
> >     Why not just always get a cache binding? How much does this improve things versus the previous approach?
> 
> Colin Ma wrote:
>     I agree with Lenni, and we should run a performance test to check the impact of always getting a cache binding.

I don't think **always getting a cache binding** is necessarily the best solution, since **getting a cache binding** obtains all privileges of the current user once per session. This requires several hierarchical queries against the database: group->role->privilege. If the user has thousands of privileges in the database, even a command that only switches databases, such as **use database1**, will fetch those thousands of privileges to the local side.
If we make it configurable, users can balance the two approaches for their cluster.


> On Jan. 16, 2016, 4:15 p.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java, line 100
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198198#file1198198line100>
> >
> >     Add a comment on what this configuration does.

Good suggestion, I will fix it in the next update.


- Dapeng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42344/#review114850
-----------------------------------------------------------




Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Colin Ma <ju...@intel.com>.

> On Jan. 16, 2016, 8:15 a.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java, line 605
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198197#file1198197line605>
> >
> >     Why not just always get a cache binding? How much does this improve things versus the previous approach?

I agree with Lenni, and we should run a performance test to check the impact of always getting a cache binding.


- Colin




Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Dapeng Sun <da...@intel.com>.

> On Jan. 16, 2016, 4:15 p.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java, line 605
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198197#file1198197line605>
> >
> >     Why not just always get a cache binding? How much does this improve things versus the previous approach?
> 
> Colin Ma wrote:
>     I agree with Lenni, and we should run a performance test to check the impact of always getting a cache binding.
> 
> Dapeng Sun wrote:
>     I don't think **always getting a cache binding** is necessarily the best solution, since **getting a cache binding** obtains all privileges of the current user once per session. This requires several hierarchical queries against the database: group->role->privilege. If the user has thousands of privileges in the database, even a command that only switches databases, such as **use database1**, will fetch those thousands of privileges to the local side.
>     If we make it configurable, users can balance the two approaches for their cluster.
> 
> Lenni Kuff wrote:
>     What privileges are loaded without the cached binding? Is the subset of privileges loaded because we stop at the first positive or because we loaded only privileges for a specific object? The problem with having a separate configuration is that users are not going to have any idea what value to set max.query.num to and it makes configuration more complex.
> 
> Dapeng Sun wrote:
>     Hi Lenni and Colin,
>     
>     **What privileges are loaded without the cached binding?** It loads only the privileges related to the authorizable hierarchy for the user, not all of the user's privileges.
>     
>     I ran a performance test that executes **use database** 10 times; the results are below. The database is Derby. Skipping the cached binding gives about a 10%~40% improvement, but because the latency is small (less than 1s), I think it is also okay to remove the configuration, to keep things simple, and always use the cached binding. Do you have any thoughts?
>     
>     
>     Here are the test results.
>     number of privileges in database -> total time with cached binding -> total time without cached binding
>     10 -> 1021ms -> 919ms
>     100 -> 1803ms -> 1285ms
>     1000 -> 5590ms -> 3732ms
> 
> Lenni Kuff wrote:
>     Can we build the cache to only contain the objects in the authorizable hierarchy rather than building a cache for all privileges?

Because of a limitation in PolicyEngine, only one entity can be queried in listPrivilegesForProvider. I think that's why SENTRY-565 caches all of the user's privileges (under the type Server). A safe improvement would be to pass the common authorizable to the cache, but that doesn't suit the current case. I filed SENTRY-1019 to track it.


- Dapeng




Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Dapeng Sun <da...@intel.com>.

> On Jan. 16, 2016, 4:15 p.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java, line 605
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198197#file1198197line605>
> >
> >     Why not just always get a cache binding? How much does this improve things versus the previous approach?
> 
> Colin Ma wrote:
>     I agree with Lenni, and we should run a performance test to check the impact of always getting a cache binding.
> 
> Dapeng Sun wrote:
>     I don't think **always getting a cache binding** is necessarily the best solution, since **getting a cache binding** obtains all privileges of the current user once per session. This requires several hierarchical queries against the database: group->role->privilege. If the user has thousands of privileges in the database, even a command that only switches databases, such as **use database1**, will fetch those thousands of privileges to the local side.
>     If we make it configurable, users can balance the two approaches for their cluster.
> 
> Lenni Kuff wrote:
>     What privileges are loaded without the cached binding? Is the subset of privileges loaded because we stop at the first positive or because we loaded only privileges for a specific object? The problem with having a separate configuration is that users are not going to have any idea what value to set max.query.num to and it makes configuration more complex.

Hi Lenni and Colin,

**What privileges are loaded without the cached binding?** It loads only the privileges related to the authorizable hierarchy for the user, not all of the user's privileges.

I ran a performance test that executes **use database** 10 times; the results are below. The database is Derby. Skipping the cached binding gives about a 10%~40% improvement, but because the latency is small (less than 1s), I think it is also okay to remove the configuration, to keep things simple, and always use the cached binding. Do you have any thoughts?


Here are the test results.
number of privileges in database -> total time with cached binding -> total time without cached binding
10 -> 1021ms -> 919ms
100 -> 1803ms -> 1285ms
1000 -> 5590ms -> 3732ms
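
For reference, the arithmetic behind these figures can be reproduced with a short sketch. It assumes each total covers the 10 runs of **use database** mentioned above. `CachedBindingTimings`, `savingPercent`, and `perRunMs` are illustrative names only.

```java
// Sketch that restates the posted numbers; class and method names are illustrative.
class CachedBindingTimings {
    // Share of the cached-binding total that the non-cached path saves, in percent.
    static double savingPercent(long cachedMs, long uncachedMs) {
        return 100.0 * (cachedMs - uncachedMs) / cachedMs;
    }

    // Average latency per command, given a total over `runs` executions.
    static double perRunMs(long totalMs, int runs) {
        return (double) totalMs / runs;
    }
}
```

Relative to the cached totals, the non-cached path saves roughly 10%, 29%, and 33% for 10, 100, and 1000 privileges. Under the 10-run assumption, the cached path averages about 102 ms, 180 ms, and 559 ms per command, which fits the "less than 1s" remark.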


- Dapeng




Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Lenni Kuff <ls...@cloudera.com>.

> On Jan. 16, 2016, 8:15 a.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java, line 605
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198197#file1198197line605>
> >
> >     Why not just always get a cache binding? How much does this improve things versus the previous approach?
> 
> Colin Ma wrote:
>     I agree with Lenni, and we should run a performance test to check the impact of always getting a cache binding.
> 
> Dapeng Sun wrote:
>     I don't think **always getting a cache binding** is necessarily the best solution, since **getting a cache binding** obtains all privileges of the current user once per session. This requires several hierarchical queries against the database: group->role->privilege. If the user has thousands of privileges in the database, even a command that only switches databases, such as **use database1**, will fetch those thousands of privileges to the local side.
>     If we make it configurable, users can balance the two approaches for their cluster.
> 
> Lenni Kuff wrote:
>     What privileges are loaded without the cached binding? Is the subset of privileges loaded because we stop at the first positive or because we loaded only privileges for a specific object? The problem with having a separate configuration is that users are not going to have any idea what value to set max.query.num to and it makes configuration more complex.
> 
> Dapeng Sun wrote:
>     Hi Lenni and Colin,
>     
>     **What privileges are loaded without the cached binding?** It loads only the privileges related to the authorizable hierarchy for the user, not all of the user's privileges.
>     
>     I ran a performance test that executes **use database** 10 times; the results are below. The database is Derby. Skipping the cached binding gives about a 10%~40% improvement, but because the latency is small (less than 1s), I think it is also okay to remove the configuration, to keep things simple, and always use the cached binding. Do you have any thoughts?
>     
>     
>     Here are the test results.
>     number of privileges in database -> total time with cached binding -> total time without cached binding
>     10 -> 1021ms -> 919ms
>     100 -> 1803ms -> 1285ms
>     1000 -> 5590ms -> 3732ms

Can we build the cache to only contain the objects in the authorizable hierarchy rather than building a cache for all privileges?


- Lenni




Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Lenni Kuff <ls...@cloudera.com>.

> On Jan. 16, 2016, 8:15 a.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java, line 605
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198197#file1198197line605>
> >
> >     Why not just always get a cache binding? How much does this improve things versus the previous approach?
> 
> Colin Ma wrote:
>     I agree with Lenni, and we should run a performance test to check the impact of always getting a cache binding.
> 
> Dapeng Sun wrote:
>     I don't think **always getting a cache binding** is necessarily the best solution, since **getting a cache binding** obtains all privileges of the current user once per session. This requires several hierarchical queries against the database: group->role->privilege. If the user has thousands of privileges in the database, even a command that only switches databases, such as **use database1**, will fetch those thousands of privileges to the local side.
>     If we make it configurable, users can balance the two approaches for their cluster.

What privileges are loaded without the cached binding? Is the subset of privileges loaded because we stop at the first positive or because we loaded only privileges for a specific object? The problem with having a separate configuration is that users are not going to have any idea what value to set max.query.num to and it makes configuration more complex.


- Lenni




Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Dapeng Sun <da...@intel.com>.

> On Jan. 16, 2016, 4:15 p.m., Lenni Kuff wrote:
> > sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java, line 605
> > <https://reviews.apache.org/r/42344/diff/2/?file=1198197#file1198197line605>
> >
> >     Why not just always get a cache binding? How much does this improve things versus the previous approach?
> 
> Colin Ma wrote:
>     I agree with Lenni, and we should run a performance test to check the impact of always getting a cache binding.
> 
> Dapeng Sun wrote:
>     I don't think **always getting a cache binding** is necessarily the best solution, since **getting a cache binding** obtains all privileges of the current user once per session. This requires several hierarchical queries against the database: group->role->privilege. If the user has thousands of privileges in the database, even a command that only switches databases, such as **use database1**, will fetch those thousands of privileges to the local side.
>     If we make it configurable, users can balance the two approaches for their cluster.
> 
> Lenni Kuff wrote:
>     What privileges are loaded without the cached binding? Is the subset of privileges loaded because we stop at the first positive or because we loaded only privileges for a specific object? The problem with having a separate configuration is that users are not going to have any idea what value to set max.query.num to and it makes configuration more complex.
> 
> Dapeng Sun wrote:
>     Hi Lenni and Colin,
>     
>     **What privileges are loaded without the cached binding?** It loads only the privileges related to the authorizable hierarchy for the user, not all of the user's privileges.
>     
>     I ran a performance test that executes **use database** 10 times; the results are below. The database is Derby. Skipping the cached binding gives about a 10%~40% improvement, but because the latency is small (less than 1s), I think it is also okay to remove the configuration, to keep things simple, and always use the cached binding. Do you have any thoughts?
>     
>     
>     Here are the test results.
>     number of privileges in database -> total time with cached binding -> total time without cached binding
>     10 -> 1021ms -> 919ms
>     100 -> 1803ms -> 1285ms
>     1000 -> 5590ms -> 3732ms
> 
> Lenni Kuff wrote:
>     Can we build the cache to only contain the objects in the authorizable hierarchy rather than building a cache for all privileges?
> 
> Dapeng Sun wrote:
>     Because of a limitation in PolicyEngine, only one entity can be queried in listPrivilegesForProvider. I think that's why SENTRY-565 caches all of the user's privileges (under the type Server). A safe improvement would be to pass the common authorizable to the cache, but that doesn't suit the current case. I filed SENTRY-1019 to track it.

I will update the patch to **always get a cache binding**.


- Dapeng




Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Lenni Kuff <ls...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42344/#review114850
-----------------------------------------------------------



sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java (line 605)
<https://reviews.apache.org/r/42344/#comment175695>

    Why not just always get a cache binding? How much does this improve things versus the previous approach?



sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java (line 100)
<https://reviews.apache.org/r/42344/#comment175696>

    Add a comment on what this configuration does.


- Lenni Kuff




Re: Review Request 42344: SENTRY-1007: Sentry column-level performance for wide tables

Posted by Dapeng Sun <da...@intel.com>.

(Updated Jan. 15, 2016, 7:47 p.m.)


Review request for sentry.


Bugs: SENTRY-1007
    https://issues.apache.org/jira/browse/SENTRY-1007


Repository: sentry


Description (updated)
-------

Because the current architecture performs one authorization check per entity, a SQL statement such as **select col1,col2,col3,.....,colN from test_tb1** triggers a check for each queried column.

This patch reuses the CachedHiveBinding from SENTRY-565.
If the number of entities exceeds maxQueryNumber, it fetches all of the user's privileges to the local side and uses that local copy for authorization.


Diffs (updated)
-----

  sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/HiveAuthzBindingHook.java 57e4689 
  sentry-binding/sentry-binding-hive/src/main/java/org/apache/sentry/binding/hive/conf/HiveAuthzConf.java e76fad1 

Diff: https://reviews.apache.org/r/42344/diff/


Testing
-------


Thanks,

Dapeng Sun