You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sentry.apache.org by Na Li <li...@cloudera.com> on 2017/06/01 02:56:55 UTC

Re: Review Request 59566: SENTRY-1784: DBUpdateForwarder returns empty update list to HDFS instead of full update

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59566/
-----------------------------------------------------------

(Updated June 1, 2017, 2:56 a.m.)


Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Sergio Pena, and Vamsee Yarlagadda.


Bugs: SENTRY-1784
    https://issues.apache.org/jira/browse/SENTRY-1784


Repository: sentry


Description
-------

HDFS starts requesting changeID = 0 instead of 1. DBUpdateForwarder returns full update if the request changeID <= 0.  After first full update, the request changeID = 1, so only delta update is sent unless clean up removes changes that are not sent to HDFS. This fixes both issues in this Jira.


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 0741ebc 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPaths.java ad7f8c9 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java 90ba721 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryUpdater.java 34caa0e 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPermissions.java 431c7fe 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java b8542b3 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java 9beb07b 


Diff: https://reviews.apache.org/r/59566/diff/3/

Changes: https://reviews.apache.org/r/59566/diff/2-3/


Testing
-------

TestHDFSIntegrationEnd2End


Thanks,

Na Li


Re: Review Request 59566: SENTRY-1784: DBUpdateForwarder returns empty update list to HDFS instead of full update

Posted by Na Li <li...@cloudera.com>.

> On June 4, 2017, 3:53 a.m., Alexander Kolbasov wrote:
> > sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java
> > Lines 39 (patched)
> > <https://reviews.apache.org/r/59566/diff/4/?file=1739146#file1739146line39>
> >
> >     The only changes in this file are comments and extra logging - right?

Yes.


- Na


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59566/#review176876
-----------------------------------------------------------


On June 1, 2017, 10:51 p.m., Na Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59566/
> -----------------------------------------------------------
> 
> (Updated June 1, 2017, 10:51 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Sergio Pena, and Vamsee Yarlagadda.
> 
> 
> Bugs: SENTRY-1784
>     https://issues.apache.org/jira/browse/SENTRY-1784
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> HDFS starts requesting changeID = 0 instead of 1. DBUpdateForwarder returns full update if the request changeID <= 0.  After first full update, the request changeID = 1, so only delta update is sent unless clean up removes changes that are not sent to HDFS. This fixes both issues in this Jira.
> 
> 1) What happens when NN just starts
> NN sends requestedId = 0 to sentry server. Sentry server sends full update back. The latest changeID for perm and path are returned to NN. NN saves the latest changeID for perm and path.
> NN initial changeID for permission and path is -1. The requestedId is NN changeID + 1. So initial requestedId from NN to sentry server is 0.
> 
> 2) What happens once NN gets a full snapshot
> Once NN gets a full snapshot, it creats new UpdateableAuthzPermissions for full perm update; it creates new UpdateableAuthzPaths for full path update;
> 
> It saves the latest changeID for perm and path (referred as latestChangeID_path and latestChangeID_perm). Next interval, NN sends requestedId (latestChangeID_path + 1, latestChangeID_perm + 1) to Sentry server. Since the lowest latestChangeID is 0, this requestedId >= 1.
> 
> If Sentry server has changes equal or newer than requestedId, it sends back delta changes to NN. Otherwise, it sends back empty list to NN. The exceptions are: 1) the requestedId does not exist in SentryStore (maybe it is cleaned up) or 2) the returned delta change list from SentryStore is empty (maybe the change table is corrupted). If exception happens, full snapshot is sent back to NN.
> 
> 3) What happens when a delta is received from HMS.
> NN updates the privilege and path based on the delta changes. It update save the latest changeID for perm and path.
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 0741ebc 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPaths.java ad7f8c9 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java 90ba721 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryUpdater.java 34caa0e 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPermissions.java 431c7fe 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java b8542b3 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java 9beb07b 
> 
> 
> Diff: https://reviews.apache.org/r/59566/diff/4/
> 
> 
> Testing
> -------
> 
> TestHDFSIntegrationEnd2End
> 
> 
> Thanks,
> 
> Na Li
> 
>


Re: Review Request 59566: SENTRY-1784: DBUpdateForwarder returns empty update list to HDFS instead of full update

Posted by Alexander Kolbasov <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59566/#review176876
-----------------------------------------------------------




sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java
Lines 39 (patched)
<https://reviews.apache.org/r/59566/#comment250370>

    The only changes in this file are comments and extra logging - right?



sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java
Line 129 (original), 133 (patched)
<https://reviews.apache.org/r/59566/#comment250365>

    The code would be simpler if you add
    
    if (updates == null)) {
      return
    }
    
    ...



sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java
Line 136 (original), 140 (patched)
<https://reviews.apache.org/r/59566/#comment250367>

    Please add clarifying parenteses:
    
    if ((newAuthzPaths != authzPaths) || (newAuthzPerms != authzPermissions)) {



sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java
Lines 141 (patched)
<https://reviews.apache.org/r/59566/#comment250368>

    Use static constants for these strings.
    
    Also these only should be defined if debug logging is enabled - you can move these closer to the use and add condition for debug loggin enabled.



sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java
Lines 142 (patched)
<https://reviews.apache.org/r/59566/#comment250369>

    you can use 
    
    String permUpdateType = newAuthzPaths == authzPaths ? "delta" : "full"



sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryUpdater.java
Line 47 (original), 47 (patched)
<https://reviews.apache.org/r/59566/#comment250372>

    The only changes in this file are extra log messages - right?



sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryUpdater.java
Lines 52 (patched)
<https://reviews.apache.org/r/59566/#comment250371>

    You don't need to put log in the try-block



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 46 (patched)
<https://reviews.apache.org/r/59566/#comment250373>

    Please add a comment explaining what this is



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Line 63 (original), 65 (patched)
<https://reviews.apache.org/r/59566/#comment250374>

    What is changeID? the code uses requestId
    
    What do you mean by "instead of 1"? You can just say that the initial request is for requestId zero.
    Can you use negative request IDs? (you mention <= 0)
    If it starts with 0, ho can it request negative values?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Line 64 (original), 66 (patched)
<https://reviews.apache.org/r/59566/#comment250375>

    This is confusing. 
    Do you mean that subsequent changeID are always incremented? 
    You may say that the Sentry server may send either the delta update if it has deltas or the full update.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 70 (patched)
<https://reviews.apache.org/r/59566/#comment250376>

    You start from zero - how can you get requestId -1?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 80 (patched)
<https://reviews.apache.org/r/59566/#comment250377>

    Can you remove these useless ### things from the log?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 81 (patched)
<https://reviews.apache.org/r/59566/#comment250378>

    This is a special debugging code - It doesn't belong here - it is Ok to put it in some private branch and test with it, but it shouldn't be part of production code.
    
    Also note that if you exit from this function via exception you may never decrement it and get misleading niformation about concurrent requests.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 86 (patched)
<https://reviews.apache.org/r/59566/#comment250379>

    If it starts with 0 - how can it be less then zero?
    I guess what you want to say here is 
    
    if (requestedId == 0) {
    ...
    }
    
    Or it isn't the case?
    
    Also This is the code in HDFS client - it shouldn't link with SentryStore since SentryStore is the server code. Please use constant from client package instead.
    
    Side note - is SentrySTore linked to the client?
    
    Also, why wouldn't the SentryServer ust send the full update when the requested sequence number iz sero? Why do we need to have this special code here?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 89 (patched)
<https://reviews.apache.org/r/59566/#comment250380>

    Should it be strictly greater then the empty change ID?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 92 (patched)
<https://reviews.apache.org/r/59566/#comment250381>

    Can we use 
    return new LinkedList<>(mageRetriever.retrieveFullImage()) instead?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 94 (patched)
<https://reviews.apache.org/r/59566/#comment250382>

    The seqnum here is exactly zero, right?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Line 75 (original), 111 (patched)
<https://reviews.apache.org/r/59566/#comment250385>

    Are all the changes below related or it is somerhing else?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Line 76 (original), 112 (patched)
<https://reviews.apache.org/r/59566/#comment250383>

    We don't need this - we can just inline isEmpty)_



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 116 (patched)
<https://reviews.apache.org/r/59566/#comment250384>

    Here - we canjust use if (!deltas.isEmpty())


- Alexander Kolbasov


On June 1, 2017, 10:51 p.m., Na Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59566/
> -----------------------------------------------------------
> 
> (Updated June 1, 2017, 10:51 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Sergio Pena, and Vamsee Yarlagadda.
> 
> 
> Bugs: SENTRY-1784
>     https://issues.apache.org/jira/browse/SENTRY-1784
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> HDFS starts requesting changeID = 0 instead of 1. DBUpdateForwarder returns full update if the request changeID <= 0.  After first full update, the request changeID = 1, so only delta update is sent unless clean up removes changes that are not sent to HDFS. This fixes both issues in this Jira.
> 
> 1) What happens when NN just starts
> NN sends requestedId = 0 to sentry server. Sentry server sends full update back. The latest changeID for perm and path are returned to NN. NN saves the latest changeID for perm and path.
> NN initial changeID for permission and path is -1. The requestedId is NN changeID + 1. So initial requestedId from NN to sentry server is 0.
> 
> 2) What happens once NN gets a full snapshot
> Once NN gets a full snapshot, it creats new UpdateableAuthzPermissions for full perm update; it creates new UpdateableAuthzPaths for full path update;
> 
> It saves the latest changeID for perm and path (referred as latestChangeID_path and latestChangeID_perm). Next interval, NN sends requestedId (latestChangeID_path + 1, latestChangeID_perm + 1) to Sentry server. Since the lowest latestChangeID is 0, this requestedId >= 1.
> 
> If Sentry server has changes equal or newer than requestedId, it sends back delta changes to NN. Otherwise, it sends back empty list to NN. The exceptions are: 1) the requestedId does not exist in SentryStore (maybe it is cleaned up) or 2) the returned delta change list from SentryStore is empty (maybe the change table is corrupted). If exception happens, full snapshot is sent back to NN.
> 
> 3) What happens when a delta is received from HMS.
> NN updates the privilege and path based on the delta changes. It update save the latest changeID for perm and path.
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 0741ebc 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPaths.java ad7f8c9 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java 90ba721 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryUpdater.java 34caa0e 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPermissions.java 431c7fe 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java b8542b3 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java 9beb07b 
> 
> 
> Diff: https://reviews.apache.org/r/59566/diff/4/
> 
> 
> Testing
> -------
> 
> TestHDFSIntegrationEnd2End
> 
> 
> Thanks,
> 
> Na Li
> 
>


Re: Review Request 59566: SENTRY-1784: DBUpdateForwarder returns empty update list to HDFS instead of full update

Posted by Na Li <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59566/
-----------------------------------------------------------

(Updated June 1, 2017, 10:51 p.m.)


Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Sergio Pena, and Vamsee Yarlagadda.


Bugs: SENTRY-1784
    https://issues.apache.org/jira/browse/SENTRY-1784


Repository: sentry


Description
-------

HDFS starts requesting changeID = 0 instead of 1. DBUpdateForwarder returns full update if the request changeID <= 0.  After first full update, the request changeID = 1, so only delta update is sent unless clean up removes changes that are not sent to HDFS. This fixes both issues in this Jira.

1) What happens when NN just starts
NN sends requestedId = 0 to sentry server. Sentry server sends full update back. The latest changeID for perm and path are returned to NN. NN saves the latest changeID for perm and path.
NN initial changeID for permission and path is -1. The requestedId is NN changeID + 1. So initial requestedId from NN to sentry server is 0.

2) What happens once NN gets a full snapshot
Once NN gets a full snapshot, it creats new UpdateableAuthzPermissions for full perm update; it creates new UpdateableAuthzPaths for full path update;

It saves the latest changeID for perm and path (referred as latestChangeID_path and latestChangeID_perm). Next interval, NN sends requestedId (latestChangeID_path + 1, latestChangeID_perm + 1) to Sentry server. Since the lowest latestChangeID is 0, this requestedId >= 1.

If Sentry server has changes equal or newer than requestedId, it sends back delta changes to NN. Otherwise, it sends back empty list to NN. The exceptions are: 1) the requestedId does not exist in SentryStore (maybe it is cleaned up) or 2) the returned delta change list from SentryStore is empty (maybe the change table is corrupted). If exception happens, full snapshot is sent back to NN.

3) What happens when a delta is received from HMS.
NN updates the privilege and path based on the delta changes. It update save the latest changeID for perm and path.


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 0741ebc 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPaths.java ad7f8c9 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java 90ba721 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryUpdater.java 34caa0e 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPermissions.java 431c7fe 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java b8542b3 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java 9beb07b 


Diff: https://reviews.apache.org/r/59566/diff/4/

Changes: https://reviews.apache.org/r/59566/diff/3-4/


Testing
-------

TestHDFSIntegrationEnd2End


Thanks,

Na Li


Re: Review Request 59566: SENTRY-1784: DBUpdateForwarder returns empty update list to HDFS instead of full update

Posted by Alexander Kolbasov <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59566/#review176591
-----------------------------------------------------------



You have great comments here, why don't you put these comments in the code?

- Alexander Kolbasov


On June 1, 2017, 3:17 a.m., Na Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59566/
> -----------------------------------------------------------
> 
> (Updated June 1, 2017, 3:17 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Sergio Pena, and Vamsee Yarlagadda.
> 
> 
> Bugs: SENTRY-1784
>     https://issues.apache.org/jira/browse/SENTRY-1784
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> HDFS starts requesting changeID = 0 instead of 1. DBUpdateForwarder returns full update if the request changeID <= 0.  After first full update, the request changeID = 1, so only delta update is sent unless clean up removes changes that are not sent to HDFS. This fixes both issues in this Jira.
> 
> 1) What happens when NN just starts
> NN sends requestedId = 0 to sentry server. Sentry server sends full update back. The latest changeID for perm and path are returned to NN. NN saves the latest changeID for perm and path.
> NN initial changeID for permission and path is -1. The requestedId is NN changeID + 1. So initial requestedId from NN to sentry server is 0.
> 
> 2) What happens once NN gets a full snapshot
> Once NN gets a full snapshot, it creats new UpdateableAuthzPermissions for full perm update; it creates new UpdateableAuthzPaths for full path update;
> 
> It saves the latest changeID for perm and path (referred as latestChangeID_path and latestChangeID_perm). Next interval, NN sends requestedId (latestChangeID_path + 1, latestChangeID_perm + 1) to Sentry server. Since the lowest latestChangeID is 0, this requestedId >= 1.
> 
> If Sentry server has changes equal or newer than requestedId, it sends back delta changes to NN. Otherwise, it sends back empty list to NN. The exceptions are: 1) the requestedId does not exist in SentryStore (maybe it is cleaned up) or 2) the returned delta change list from SentryStore is empty (maybe the change table is corrupted). If exception happens, full snapshot is sent back to NN.
> 
> 3) What happens when a delta is received from HMS.
> NN updates the privilege and path based on the delta changes. It update save the latest changeID for perm and path.
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 0741ebc 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPaths.java ad7f8c9 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java 90ba721 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryUpdater.java 34caa0e 
>   sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPermissions.java 431c7fe 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java b8542b3 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java 9beb07b 
> 
> 
> Diff: https://reviews.apache.org/r/59566/diff/3/
> 
> 
> Testing
> -------
> 
> TestHDFSIntegrationEnd2End
> 
> 
> Thanks,
> 
> Na Li
> 
>


Re: Review Request 59566: SENTRY-1784: DBUpdateForwarder returns empty update list to HDFS instead of full update

Posted by Na Li <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59566/
-----------------------------------------------------------

(Updated June 1, 2017, 3:17 a.m.)


Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, Sergio Pena, and Vamsee Yarlagadda.


Bugs: SENTRY-1784
    https://issues.apache.org/jira/browse/SENTRY-1784


Repository: sentry


Description (updated)
-------

HDFS starts requesting changeID = 0 instead of 1. DBUpdateForwarder returns full update if the request changeID <= 0.  After first full update, the request changeID = 1, so only delta update is sent unless clean up removes changes that are not sent to HDFS. This fixes both issues in this Jira.

1) What happens when NN just starts
NN sends requestedId = 0 to sentry server. Sentry server sends full update back. The latest changeID for perm and path are returned to NN. NN saves the latest changeID for perm and path.
NN initial changeID for permission and path is -1. The requestedId is NN changeID + 1. So initial requestedId from NN to sentry server is 0.

2) What happens once NN gets a full snapshot
Once NN gets a full snapshot, it creats new UpdateableAuthzPermissions for full perm update; it creates new UpdateableAuthzPaths for full path update;

It saves the latest changeID for perm and path (referred as latestChangeID_path and latestChangeID_perm). Next interval, NN sends requestedId (latestChangeID_path + 1, latestChangeID_perm + 1) to Sentry server. Since the lowest latestChangeID is 0, this requestedId >= 1.

If Sentry server has changes equal or newer than requestedId, it sends back delta changes to NN. Otherwise, it sends back empty list to NN. The exceptions are: 1) the requestedId does not exist in SentryStore (maybe it is cleaned up) or 2) the returned delta change list from SentryStore is empty (maybe the change table is corrupted). If exception happens, full snapshot is sent back to NN.

3) What happens when a delta is received from HMS.
NN updates the privilege and path based on the delta changes. It update save the latest changeID for perm and path.


Diffs
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 0741ebc 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPaths.java ad7f8c9 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryAuthorizationInfo.java 90ba721 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/SentryUpdater.java 34caa0e 
  sentry-hdfs/sentry-hdfs-namenode-plugin/src/main/java/org/apache/sentry/hdfs/UpdateableAuthzPermissions.java 431c7fe 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java b8542b3 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryService.java 9beb07b 


Diff: https://reviews.apache.org/r/59566/diff/3/


Testing
-------

TestHDFSIntegrationEnd2End


Thanks,

Na Li