You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sentry.apache.org by Gregory Chanan <gc...@cloudera.com> on 2015/04/04 03:11:03 UTC

Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/
-----------------------------------------------------------

Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.


Repository: sentry


Description
-------

Adds a unit test for using different group mappings that fails without the patch and passes with the patch.

The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.

So, this patch does two things:
1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.


Diffs
-----

  sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
  sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
  sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
  sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 

Diff: https://reviews.apache.org/r/32847/diff/


Testing
-------

Ran the changed unit test.  We'll see what the QA bot says.


Thanks,

Gregory Chanan


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by Prasad Mujumdar <pr...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/#review81279
-----------------------------------------------------------

Ship it!


- Prasad Mujumdar


On April 23, 2015, 1:51 a.m., Gregory Chanan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32847/
> -----------------------------------------------------------
> 
> (Updated April 23, 2015, 1:51 a.m.)
> 
> 
> Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Adds a unit test for using different group mappings that fails without the patch and passes with the patch.
> 
> The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.
> 
> So, this patch does two things:
> 1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
> 2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
>   sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 
> 
> Diff: https://reviews.apache.org/r/32847/diff/
> 
> 
> Testing
> -------
> 
> Ran the changed unit test.  We'll see what the QA bot says.
> 
> 
> Thanks,
> 
> Gregory Chanan
> 
>


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by Gregory Chanan <gc...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/
-----------------------------------------------------------

(Updated April 23, 2015, 1:51 a.m.)


Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.


Repository: sentry


Description
-------

Adds a unit test for using different group mappings that fails without the patch and passes with the patch.

The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.

So, this patch does two things:
1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.


Diffs (updated)
-----

  sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
  sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
  sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
  sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 

Diff: https://reviews.apache.org/r/32847/diff/


Testing
-------

Ran the changed unit test.  We'll see what the QA bot says.


Thanks,

Gregory Chanan


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by Vamsee Yarlagadda <va...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/#review79605
-----------------------------------------------------------

Ship it!


Ship It!

- Vamsee Yarlagadda


On April 4, 2015, 1:11 a.m., Gregory Chanan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32847/
> -----------------------------------------------------------
> 
> (Updated April 4, 2015, 1:11 a.m.)
> 
> 
> Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Adds a unit test for using different group mappings that fails without the patch and passes with the patch.
> 
> The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.
> 
> So, this patch does two things:
> 1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
> 2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
>   sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 
> 
> Diff: https://reviews.apache.org/r/32847/diff/
> 
> 
> Testing
> -------
> 
> Ran the changed unit test.  We'll see what the QA bot says.
> 
> 
> Thanks,
> 
> Gregory Chanan
> 
>


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by Gregory Chanan <gc...@cloudera.com>.

> On April 21, 2015, 5:47 a.m., Prasad Mujumdar wrote:
> > sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java, line 36
> > <https://reviews.apache.org/r/32847/diff/1/?file=915089#file915089line36>
> >
> >     This will make Sentry to intantiate a new group object  instead of using a static cached instance. My concern is the increased network and session requirement for non-default group mappings like Hadoop's LDAPGroupMapping. With this patch, we'll keep opening new connection to LDAP server every time.
> >     
> >     How about caching that group object in HadoopGroupResourceAuthorizationProvider instead of creating a new one every time ?

I'm not sure I understand the use case.  With solr, we only create the authorization provider once.  Is that not the case with the service?

What exactly would you propose caching?  The group object is dependent on the configuration.  Just caching one group object necessarily ignores the Configuration and gives you the same broken interface as the Hadoop groups; it creates the groups once, based on the first configuration passed in, and silently ignores subsequent attempts to have groups based on different Configurations.  Or would we have a hashmap of Configuration or Name to group mappings?


- Gregory


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/#review80912
-----------------------------------------------------------


On April 4, 2015, 1:11 a.m., Gregory Chanan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32847/
> -----------------------------------------------------------
> 
> (Updated April 4, 2015, 1:11 a.m.)
> 
> 
> Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Adds a unit test for using different group mappings that fails without the patch and passes with the patch.
> 
> The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.
> 
> So, this patch does two things:
> 1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
> 2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
>   sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 
> 
> Diff: https://reviews.apache.org/r/32847/diff/
> 
> 
> Testing
> -------
> 
> Ran the changed unit test.  We'll see what the QA bot says.
> 
> 
> Thanks,
> 
> Gregory Chanan
> 
>


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by shen guoquan <gu...@intel.com>.

> On 四月 21, 2015, 5:47 a.m., Prasad Mujumdar wrote:
> > sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java, line 36
> > <https://reviews.apache.org/r/32847/diff/1/?file=915089#file915089line36>
> >
> >     This will make Sentry to intantiate a new group object  instead of using a static cached instance. My concern is the increased network and session requirement for non-default group mappings like Hadoop's LDAPGroupMapping. With this patch, we'll keep opening new connection to LDAP server every time.
> >     
> >     How about caching that group object in HadoopGroupResourceAuthorizationProvider instead of creating a new one every time ?
> 
> Gregory Chanan wrote:
>     I'm not sure I understand the use case.  With solr, we only create the authorization provider once.  Is that not the case with the service?
>     
>     What exactly would you propose caching?  The group object is dependent on the configuration.  Just caching one group object necessarily ignores the Configuration and gives you the same broken interface as the Hadoop groups; it creates the groups once, based on the first configuration passed in, and silently ignores subsequent attempts to have groups based on different Configurations.  Or would we have a hashmap of Configuration or Name to group mappings?

hi Prasad, Chanan. I know Prasad's performance concern about groupMappingService like Hadoop LDAPGroupMapping. But as the Chanan said, the Solr binding authorzaiton isn't the same as hive. Every session in hive will generate a new groupMappingService, this may be exist performance problem. It is better to cache the groupMappingService. With Solr, when the Solr service start, the authorization provider was created only once. So I think it is not needed to cache the groupMappingService. That's all I know about the Solr authorization binding. If I make any mistake, please feel free to correct me Thanks.


- shen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/#review80912
-----------------------------------------------------------


On 四月 4, 2015, 1:11 a.m., Gregory Chanan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32847/
> -----------------------------------------------------------
> 
> (Updated 四月 4, 2015, 1:11 a.m.)
> 
> 
> Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Adds a unit test for using different group mappings that fails without the patch and passes with the patch.
> 
> The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.
> 
> So, this patch does two things:
> 1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
> 2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
>   sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 
> 
> Diff: https://reviews.apache.org/r/32847/diff/
> 
> 
> Testing
> -------
> 
> Ran the changed unit test.  We'll see what the QA bot says.
> 
> 
> Thanks,
> 
> Gregory Chanan
> 
>


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by Gregory Chanan <gc...@cloudera.com>.

> On April 21, 2015, 5:47 a.m., Prasad Mujumdar wrote:
> > sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java, line 36
> > <https://reviews.apache.org/r/32847/diff/1/?file=915089#file915089line36>
> >
> >     This will make Sentry to intantiate a new group object  instead of using a static cached instance. My concern is the increased network and session requirement for non-default group mappings like Hadoop's LDAPGroupMapping. With this patch, we'll keep opening new connection to LDAP server every time.
> >     
> >     How about caching that group object in HadoopGroupResourceAuthorizationProvider instead of creating a new one every time ?
> 
> Gregory Chanan wrote:
>     I'm not sure I understand the use case.  With solr, we only create the authorization provider once.  Is that not the case with the service?
>     
>     What exactly would you propose caching?  The group object is dependent on the configuration.  Just caching one group object necessarily ignores the Configuration and gives you the same broken interface as the Hadoop groups; it creates the groups once, based on the first configuration passed in, and silently ignores subsequent attempts to have groups based on different Configurations.  Or would we have a hashmap of Configuration or Name to group mappings?
> 
> shen guoquan wrote:
>     hi Prasad, Chanan. I know Prasad's performance concern about groupMappingService like Hadoop LDAPGroupMapping. But as the Chanan said, the Solr binding authorzaiton isn't the same as hive. Every session in hive will generate a new groupMappingService, this may be exist performance problem. It is better to cache the groupMappingService. With Solr, when the Solr service start, the authorization provider was created only once. So I think it is not needed to cache the groupMappingService. That's all I know about the Solr authorization binding. If I make any mistake, please feel free to correct me Thanks.
> 
> Gregory Chanan wrote:
>     Interesting.  Maybe the correct solution is to use the constructor that is currently marked @VisibleForTesting that takes a GroupMappingService?  Solr would create the group mapping service for its own purposes and the hive implementation would be unaffected.  We would turn off @VisibleForTesting as well.  Thoughts, Prasad?
> 
> Prasad Mujumdar wrote:
>     I was suggesting that we save a static group object that's created only once, similar to what Hadoop does in the Groups. That we it will get intialized correctly with the right config and still doesn't have to be instanciated for each hive access.

I don't want to do that for a couple of reasons:
1) It won't work with the tests, which try different configurations
2) We then have to deal with synchronization

Instead, I just added a config that says whether we should use new groups or not.  Solr will set that conf by default so we get the behavior we want and hive will be unaffected.


- Gregory


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/#review80912
-----------------------------------------------------------


On April 4, 2015, 1:11 a.m., Gregory Chanan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32847/
> -----------------------------------------------------------
> 
> (Updated April 4, 2015, 1:11 a.m.)
> 
> 
> Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Adds a unit test for using different group mappings that fails without the patch and passes with the patch.
> 
> The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.
> 
> So, this patch does two things:
> 1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
> 2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
>   sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 
> 
> Diff: https://reviews.apache.org/r/32847/diff/
> 
> 
> Testing
> -------
> 
> Ran the changed unit test.  We'll see what the QA bot says.
> 
> 
> Thanks,
> 
> Gregory Chanan
> 
>


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by Prasad Mujumdar <pr...@cloudera.com>.

> On April 21, 2015, 5:47 a.m., Prasad Mujumdar wrote:
> > sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java, line 36
> > <https://reviews.apache.org/r/32847/diff/1/?file=915089#file915089line36>
> >
> >     This will make Sentry to intantiate a new group object  instead of using a static cached instance. My concern is the increased network and session requirement for non-default group mappings like Hadoop's LDAPGroupMapping. With this patch, we'll keep opening new connection to LDAP server every time.
> >     
> >     How about caching that group object in HadoopGroupResourceAuthorizationProvider instead of creating a new one every time ?
> 
> Gregory Chanan wrote:
>     I'm not sure I understand the use case.  With solr, we only create the authorization provider once.  Is that not the case with the service?
>     
>     What exactly would you propose caching?  The group object is dependent on the configuration.  Just caching one group object necessarily ignores the Configuration and gives you the same broken interface as the Hadoop groups; it creates the groups once, based on the first configuration passed in, and silently ignores subsequent attempts to have groups based on different Configurations.  Or would we have a hashmap of Configuration or Name to group mappings?
> 
> shen guoquan wrote:
>     hi Prasad, Chanan. I know Prasad's performance concern about groupMappingService like Hadoop LDAPGroupMapping. But as the Chanan said, the Solr binding authorzaiton isn't the same as hive. Every session in hive will generate a new groupMappingService, this may be exist performance problem. It is better to cache the groupMappingService. With Solr, when the Solr service start, the authorization provider was created only once. So I think it is not needed to cache the groupMappingService. That's all I know about the Solr authorization binding. If I make any mistake, please feel free to correct me Thanks.
> 
> Gregory Chanan wrote:
>     Interesting.  Maybe the correct solution is to use the constructor that is currently marked @VisibleForTesting that takes a GroupMappingService?  Solr would create the group mapping service for its own purposes and the hive implementation would be unaffected.  We would turn off @VisibleForTesting as well.  Thoughts, Prasad?

I was suggesting that we save a static group object that's created only once, similar to what Hadoop does in the Groups. That we it will get intialized correctly with the right config and still doesn't have to be instanciated for each hive access.


- Prasad


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/#review80912
-----------------------------------------------------------


On April 4, 2015, 1:11 a.m., Gregory Chanan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32847/
> -----------------------------------------------------------
> 
> (Updated April 4, 2015, 1:11 a.m.)
> 
> 
> Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Adds a unit test for using different group mappings that fails without the patch and passes with the patch.
> 
> The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.
> 
> So, this patch does two things:
> 1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
> 2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
>   sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 
> 
> Diff: https://reviews.apache.org/r/32847/diff/
> 
> 
> Testing
> -------
> 
> Ran the changed unit test.  We'll see what the QA bot says.
> 
> 
> Thanks,
> 
> Gregory Chanan
> 
>


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by Gregory Chanan <gc...@cloudera.com>.

> On April 21, 2015, 5:47 a.m., Prasad Mujumdar wrote:
> > sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java, line 36
> > <https://reviews.apache.org/r/32847/diff/1/?file=915089#file915089line36>
> >
> >     This will make Sentry to intantiate a new group object  instead of using a static cached instance. My concern is the increased network and session requirement for non-default group mappings like Hadoop's LDAPGroupMapping. With this patch, we'll keep opening new connection to LDAP server every time.
> >     
> >     How about caching that group object in HadoopGroupResourceAuthorizationProvider instead of creating a new one every time ?
> 
> Gregory Chanan wrote:
>     I'm not sure I understand the use case.  With solr, we only create the authorization provider once.  Is that not the case with the service?
>     
>     What exactly would you propose caching?  The group object is dependent on the configuration.  Just caching one group object necessarily ignores the Configuration and gives you the same broken interface as the Hadoop groups; it creates the groups once, based on the first configuration passed in, and silently ignores subsequent attempts to have groups based on different Configurations.  Or would we have a hashmap of Configuration or Name to group mappings?
> 
> shen guoquan wrote:
>     hi Prasad, Chanan. I know Prasad's performance concern about groupMappingService like Hadoop LDAPGroupMapping. But as the Chanan said, the Solr binding authorzaiton isn't the same as hive. Every session in hive will generate a new groupMappingService, this may be exist performance problem. It is better to cache the groupMappingService. With Solr, when the Solr service start, the authorization provider was created only once. So I think it is not needed to cache the groupMappingService. That's all I know about the Solr authorization binding. If I make any mistake, please feel free to correct me Thanks.

Interesting.  Maybe the correct solution is to use the constructor that is currently marked @VisibleForTesting that takes a GroupMappingService?  Solr would create the group mapping service for its own purposes and the hive implementation would be unaffected.  We would turn off @VisibleForTesting as well.  Thoughts, Prasad?


- Gregory


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/#review80912
-----------------------------------------------------------


On April 4, 2015, 1:11 a.m., Gregory Chanan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32847/
> -----------------------------------------------------------
> 
> (Updated April 4, 2015, 1:11 a.m.)
> 
> 
> Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Adds a unit test for using different group mappings that fails without the patch and passes with the patch.
> 
> The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.
> 
> So, this patch does two things:
> 1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
> 2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
>   sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 
> 
> Diff: https://reviews.apache.org/r/32847/diff/
> 
> 
> Testing
> -------
> 
> Ran the changed unit test.  We'll see what the QA bot says.
> 
> 
> Thanks,
> 
> Gregory Chanan
> 
>


Re: Review Request 32847: SENTRY-678: Sentry-Solr Binding may not load group mapping service correctly

Posted by Prasad Mujumdar <pr...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32847/#review80912
-----------------------------------------------------------



sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java
<https://reviews.apache.org/r/32847/#comment131088>

    This will make Sentry to intantiate a new group object  instead of using a static cached instance. My concern is the increased network and session requirement for non-default group mappings like Hadoop's LDAPGroupMapping. With this patch, we'll keep opening new connection to LDAP server every time.
    
    How about caching that group object in HadoopGroupResourceAuthorizationProvider instead of creating a new one every time ?


- Prasad Mujumdar


On April 4, 2015, 1:11 a.m., Gregory Chanan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32847/
> -----------------------------------------------------------
> 
> (Updated April 4, 2015, 1:11 a.m.)
> 
> 
> Review request for sentry, shen guoquan, Prasad Mujumdar, and Vamsee Yarlagadda.
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> Adds a unit test for using different group mappings that fails without the patch and passes with the patch.
> 
> The basic issue is that in solr we don't have hadoop Configuration files on the classpath and the hadoop group inevitably get set to "new Configuration()" (the jira has an example where calling FileSystem.get(conf) sets up the group mappings to "new Configuration" -- not the conf you pass in.  But this is just one example, I've seen it all over the place in hadoop code.
> 
> So, this patch does two things:
> 1) sets the configuration earlier in the binding so we get the group mappings before the provider backend constructor (which reads from hdfs, so can call FileSystem.get(...)
> 2) changes the HadoopGroupResourceAuthorizationProvider from using Groups.getUserToGroupsMappingService(...) to using new Groups(...).  That is, the groups now use the actual configuration you passed in rather than whatever Groups were originally set up as (possibly mistakenly, as in the example above).  In the case where Groups.getUserToGroupMappingService(...) hasn't been called yet, the behavior is the same.  This is definitely an improvement in Solr, and I don't think it should affect hive because from my understanding Hive has the configuration files on the classpath, so the behavior should be identical.  It would be good if someone could confirm that though.
> 
> 
> Diffs
> -----
> 
>   sentry-binding/sentry-binding-solr/src/main/java/org/apache/sentry/binding/solr/authz/SolrAuthzBinding.java 373ee8c7fd0b3bc569de39dd664c8876d5597b30 
>   sentry-binding/sentry-binding-solr/src/test/java/org/apache/sentry/binding/solr/TestSolrAuthzBinding.java 1bc01a2d837f0afc78547c1667f701cdbd1f6193 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupMappingService.java 14e2d05c919ec540acf845056ed8972239a27768 
>   sentry-provider/sentry-provider-common/src/main/java/org/apache/sentry/provider/common/HadoopGroupResourceAuthorizationProvider.java 626fd909cac9630dc11a2386886423bde4fba190 
> 
> Diff: https://reviews.apache.org/r/32847/diff/
> 
> 
> Testing
> -------
> 
> Ran the changed unit test.  We'll see what the QA bot says.
> 
> 
> Thanks,
> 
> Gregory Chanan
> 
>