You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sentry.apache.org by Sravya Tirukkovalur <sr...@cloudera.com> on 2015/12/01 02:44:54 UTC

Review Request 40807: SENTRY-953: External Partitions which are referenced by more than one table can cause some unexpected behavior with Sentry HDFS sync

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40807/
-----------------------------------------------------------

Review request for sentry.


Bugs: SENTRY-953
    https://issues.apache.org/jira/browse/SENTRY-953


Repository: sentry


Description
-------

In the current design we assume a path can be associated with only one hive object. But it is possible where a path can be associated with multiple hive objects: tables/partitions. I removed the thrift generated code form the review to avoid noise.


Diffs
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/AuthzPaths.java ba16f4ab09df3c997b0ec87c8187b5ded001376b 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPaths.java d52e3617a9d793e6df141d495f7badf3aa754c56 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPathsDumper.java 8f7bb0f61cadffa2390d6915f25cab3a0b406e6d 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPaths.java b74f9541fc4f48b9e94aa8161eb0a4d2466a468b 
  sentry-hdfs/sentry-hdfs-common/src/main/resources/sentry_hdfs_service.thrift fb60855741e9bff9b841d7e8cb86e6823c4e315f 
  sentry-hdfs/sentry-hdfs-common/src/test/java/org/apache/sentry/hdfs/TestHMSPaths.java 29868ae26d512f78b1c8500eabd544f2c4e30e77 
  sentry-hdfs/sentry-hdfs-common/src/test/java/org/apache/sentry/hdfs/TestUpdateableAuthzPaths.java 4b8a058138f42688512b8d5a9860da90a16d6265 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java c9accc116213ce48625ffdd220e00ba634a00d1d 
  sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegration.java 208c93b77b8d46d87170b2665c5b7e2557812dc7 

Diff: https://reviews.apache.org/r/40807/diff/


Testing
-------

Added a simple test, but need more test coverage. Following is the test plan:
1.1. Two partitions of different tables pointing to same location with different grants => ACLS should have union (no duplicates) of both rules.
1.2. Drop first table => should still have second table permissions
1.3. Drop second table => should still have first table permissions
1.4. Do 1.2 but drop partition instead
1.5. Do 1.3 but drop partition instead
2.1. Two partitions of same table pointing to same location => ACLS should not be repeated.
2.2. Drop first partition => Should still have acls
2.3. Same as 2.2, but drop second partition
3.1. Two tables pointing to same location => union of rules.
3.2. Drop first table
3.3. Drop second table
One thing I cannot test on pseudo cluster is initialization is happening correctly when there are multiple objects pointing to the same path as there is no way to persist meta store and restart HMS. I will try to mock is some how.


Thanks,

Sravya Tirukkovalur


Re: Review Request 40807: SENTRY-953: External Partitions which are referenced by more than one table can cause some unexpected behavior with Sentry HDFS sync

Posted by Li Li <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40807/#review108769
-----------------------------------------------------------



sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPaths.java (line 122)
<https://reviews.apache.org/r/40807/#comment168223>

    do we also need to exclude the empty string here?



sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPaths.java (line 123)
<https://reviews.apache.org/r/40807/#comment168224>

    how about move (authzObj != null) to addAuthzObjs function and just call addAuthzObjs(authzObj)?



sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPaths.java (line 133)
<https://reviews.apache.org/r/40807/#comment168226>

    how about move the verification logic to addAuthzObjs function and just call addAuthzObjs(authzObjs)?



sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPaths.java (line 272)
<https://reviews.apache.org/r/40807/#comment168240>

    Is it safe to return the original set in public method here? Should we return a copy of authzObjs?



sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPaths.java (line 315)
<https://reviews.apache.org/r/40807/#comment168242>

    is it possible that authzObjs is null here and also in line 490?
    eg. clearAuthzObjs() had been called



sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPathsDumper.java (line 73)
<https://reviews.apache.org/r/40807/#comment168244>

    how about add a public method for the size of authzObjs in Entry?



sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegration.java (line 1178)
<https://reviews.apache.org/r/40807/#comment168249>

    should we delete the hdfs dir and tbls/db after this test?


- Li Li


On Dec. 1, 2015, 1:44 a.m., Sravya Tirukkovalur wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40807/
> -----------------------------------------------------------
> 
> (Updated Dec. 1, 2015, 1:44 a.m.)
> 
> 
> Review request for sentry.
> 
> 
> Bugs: SENTRY-953
>     https://issues.apache.org/jira/browse/SENTRY-953
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> In the current design we assume a path can be associated with only one hive object. But it is possible where a path can be associated with multiple hive objects: tables/partitions. I removed the thrift generated code form the review to avoid noise.
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/AuthzPaths.java ba16f4ab09df3c997b0ec87c8187b5ded001376b 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPaths.java d52e3617a9d793e6df141d495f7badf3aa754c56 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/HMSPathsDumper.java 8f7bb0f61cadffa2390d6915f25cab3a0b406e6d 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPaths.java b74f9541fc4f48b9e94aa8161eb0a4d2466a468b 
>   sentry-hdfs/sentry-hdfs-common/src/main/resources/sentry_hdfs_service.thrift fb60855741e9bff9b841d7e8cb86e6823c4e315f 
>   sentry-hdfs/sentry-hdfs-common/src/test/java/org/apache/sentry/hdfs/TestHMSPaths.java 29868ae26d512f78b1c8500eabd544f2c4e30e77 
>   sentry-hdfs/sentry-hdfs-common/src/test/java/org/apache/sentry/hdfs/TestUpdateableAuthzPaths.java 4b8a058138f42688512b8d5a9860da90a16d6265 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java c9accc116213ce48625ffdd220e00ba634a00d1d 
>   sentry-tests/sentry-tests-hive/src/test/java/org/apache/sentry/tests/e2e/hdfs/TestHDFSIntegration.java 208c93b77b8d46d87170b2665c5b7e2557812dc7 
> 
> Diff: https://reviews.apache.org/r/40807/diff/
> 
> 
> Testing
> -------
> 
> Added a simple test, but need more test coverage. Following is the test plan:
> 1.1. Two partitions of different tables pointing to same location with different grants => ACLS should have union (no duplicates) of both rules.
> 1.2. Drop first table => should still have second table permissions
> 1.3. Drop second table => should still have first table permissions
> 1.4. Do 1.2 but drop partition instead
> 1.5. Do 1.3 but drop partition instead
> 2.1. Two partitions of same table pointing to same location => ACLS should not be repeated.
> 2.2. Drop first partition => Should still have acls
> 2.3. Same as 2.2, but drop second partition
> 3.1. Two tables pointing to same location => union of rules.
> 3.2. Drop first table
> 3.3. Drop second table
> One thing I cannot test on pseudo cluster is initialization is happening correctly when there are multiple objects pointing to the same path as there is no way to persist meta store and restart HMS. I will try to mock is some how.
> 
> 
> Thanks,
> 
> Sravya Tirukkovalur
> 
>