You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sentry.apache.org by Arjun Mishra via Review Board <no...@reviews.apache.org> on 2018/09/11 18:00:49 UTC

Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Sept. 11, 2018, 6 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Changes
-------

Added logic to invalidate cache


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 


Diff: https://reviews.apache.org/r/68547/diff/3/

Changes: https://reviews.apache.org/r/68547/diff/2-3/


Testing
-------


Thanks,

Arjun Mishra


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Oct. 3, 2018, 6:55 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java
> > Lines 97-107 (patched)
> > <https://reviews.apache.org/r/68547/diff/9/?file=2093712#file2093712line97>
> >
> >     Why do we need the sequence# and image# here? Those are already checked in the DbUpdateForward class. I think the cache invalidation here should be done by looking at the current snapshot ID on the SentryStore and comparing it with the one store in the cache.

We are not invalidating cache here. We are just checking if the cache is current. If it isn't, we fetch a fresh full update from Sentry Db, and update the cache


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209184
-----------------------------------------------------------


On Oct. 2, 2018, 9:57 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 2, 2018, 9:57 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageRetriever.java 11b75411d 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java 3532ef33d 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermImageRetriever.java 443434127 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java fb42b279a 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/9/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Sergio Pena via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209184
-----------------------------------------------------------



The same PathImageRetriever feedback applies to PermissionImageRetriever.

Btw, while looking at the code completely, I think it makes more sense to keep the cache in the DBUpdateForwarder class. That class handles the seqNum and imgNum that can be used to make your logic of whether fetch a new snapshot or just return the cache. Also the code will be re-used for paths and permissions.


sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageRetriever.java
Lines 65 (patched)
<https://reviews.apache.org/r/68547/#comment293510>

    Seems this method is used only internally on each class. Is it needed to be part of the interface and public?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Line 131 (original), 131 (patched)
<https://reviews.apache.org/r/68547/#comment293509>

    Does this 'Clear cache if present' help for supportability? How do you know the cache was free or not?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 132 (patched)
<https://reviews.apache.org/r/68547/#comment293508>

    Leave a comment on this call why you're releasing the cache from memory, and why at this point of the call.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java
Lines 44 (patched)
<https://reviews.apache.org/r/68547/#comment293511>

    Code convention: Need a space between the variable type and name.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java
Lines 62 (patched)
<https://reviews.apache.org/r/68547/#comment293512>

    Code convention: if () instead of if()



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java
Lines 65 (patched)
<https://reviews.apache.org/r/68547/#comment293522>

    Could you call clearCache() before creating another SoftReference? Just to make sure the cache is cleared and can be claimed by GC. We want to make sure there are not hard references leaked that causes a memory leak.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java
Lines 79 (patched)
<https://reviews.apache.org/r/68547/#comment293520>

    Code convention: If () instead of if()



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java
Lines 80 (patched)
<https://reviews.apache.org/r/68547/#comment293519>

    The SoftReference object has a clear(). I think we should call it too before setting the variable to null.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java
Lines 97-107 (patched)
<https://reviews.apache.org/r/68547/#comment293521>

    Why do we need the sequence# and image# here? Those are already checked in the DbUpdateForward class. I think the cache invalidation here should be done by looking at the current snapshot ID on the SentryStore and comparing it with the one store in the cache.


- Sergio Pena


On Oct. 2, 2018, 9:57 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 2, 2018, 9:57 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageRetriever.java 11b75411d 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java 3532ef33d 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermImageRetriever.java 443434127 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java fb42b279a 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/9/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Oct. 8, 2018, 4:18 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Lines 222 (patched)
> > <https://reviews.apache.org/r/68547/diff/11/?file=2094968#file2094968line222>
> >
> >     Just check if they're different. It is a simple case.  A cache is not more valid if the seq num is different (higher or lower, it does not matter). This logic is assumming HDFS will never request a lower value. We should not assume that.

cache.seqNum needs to be strictly less than for it to request a fresh full update. If cache.seqNum is greater than requestedSeqNum, we can simply return that cache and the subsequent request NN will ask for deltas  greater than cache.seqNum so it will be able to catch up. If that sequence number doesn't not exist, we will do a fresh fetch from Sentry db


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209315
-----------------------------------------------------------


On Oct. 5, 2018, 7:11 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2018, 7:11 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Oct. 8, 2018, 4:18 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Lines 139 (patched)
> > <https://reviews.apache.org/r/68547/diff/11/?file=2094968#file2094968line139>
> >
> >     If imageCache.get() is not null, then it might be null after the if condition. The reason is GC could remove it from memory if it is not a hard-reference.
> 
> Arjun Mishra wrote:
>     Will change it

Actually I just realized an issue with GC cleaning up the cache. The problem is we have variables cacheImageNum, and cacheSequenceNum that need to be reset everytime the cache is cleared. So if GC clears the cache, we will be in a state where cacheImageNum, and cacheSequenceNum are holding invalid values.


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209315
-----------------------------------------------------------


On Oct. 5, 2018, 7:11 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2018, 7:11 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Oct. 8, 2018, 4:18 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Lines 140-145 (patched)
> > <https://reviews.apache.org/r/68547/diff/11/?file=2094968#file2094968line140>
> >
> >     Invalidate or cleanUp? I like to separate the verbs. Invalidate means the current cache is not valid and it will be replaced (but not necessary removed from memory). Clean is what is says, clean from memory.
> >     
> >     Also, I think we should see the future of HDFS Federation. We can get a benefit if we avoid cleaning the cache when multiple NN request images from Sentry. This way Sentry will return the cache image.

I am fine with that. Some of the team members were against using too much heap


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209315
-----------------------------------------------------------


On Oct. 5, 2018, 7:11 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2018, 7:11 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Oct. 8, 2018, 4:18 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Lines 139 (patched)
> > <https://reviews.apache.org/r/68547/diff/11/?file=2094968#file2094968line139>
> >
> >     If imageCache.get() is not null, then it might be null after the if condition. The reason is GC could remove it from memory if it is not a hard-reference.

Will change it


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209315
-----------------------------------------------------------


On Oct. 5, 2018, 7:11 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2018, 7:11 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Oct. 8, 2018, 4:18 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Lines 248-266 (patched)
> > <https://reviews.apache.org/r/68547/diff/11/?file=2094968#file2094968line248>
> >
> >     Is there a way to mock the cache objects instead of having new methods to get its internal value? Or mock the imageRetriever and verify it was called when a cache is not valid.

I would then have to encapsulate these attributes cacheImgNum, cacheSequenceNum into their own class. Then they can be mocked. In your previous code review, you were against created unnecessary classes so it wasn't done


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209315
-----------------------------------------------------------


On Oct. 5, 2018, 7:11 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2018, 7:11 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Oct. 8, 2018, 4:18 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Line 159 (original), 216-220 (patched)
> > <https://reviews.apache.org/r/68547/diff/11/?file=2094968#file2094968line216>
> >
> >     In the if() above, if imageNum != than the cache, then that will return null. When is this if going to be valid? imageNum will be teh same than cache.

There is some confusion. I think you are talking about this condition "if (updateCache.getImgNum() != UNUSED_PATH_UPDATE_IMG_NUM && updateCache.getImgNum() != latestImageNum) {". Here "latestImageNum" is the value of MAX(SNAPSHOT_ID) in the sentry database. The two condititions are checking if cache.imageNum != latestImageNum, then return null, else if cache.imageNum is less than the requestedImgNum then return null.


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209315
-----------------------------------------------------------


On Oct. 5, 2018, 7:11 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2018, 7:11 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Sergio Pena via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209315
-----------------------------------------------------------



Did you look at the Guava cache?

Read this part of using soft values in the Guava cache (btw, it has invalidate vs cleanUp methods which means different things):
http://blog.jessitron.com/2011/10/keeping-your-cache-down-to-size.html


I was looking at the Guava cache and it will be valuable resource for using it in this cache instead of doing our own thing. It is much simpler and less prone to errors. Really useful.


sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 139 (patched)
<https://reviews.apache.org/r/68547/#comment293627>

    If imageCache.get() is not null, then it might be null after the if condition. The reason is GC could remove it from memory if it is not a hard-reference.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 140-145 (patched)
<https://reviews.apache.org/r/68547/#comment293628>

    Invalidate or cleanUp? I like to separate the verbs. Invalidate means the current cache is not valid and it will be replaced (but not necessary removed from memory). Clean is what is says, clean from memory.
    
    Also, I think we should see the future of HDFS Federation. We can get a benefit if we avoid cleaning the cache when multiple NN request images from Sentry. This way Sentry will return the cache image.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Line 159 (original), 216-220 (patched)
<https://reviews.apache.org/r/68547/#comment293630>

    In the if() above, if imageNum != than the cache, then that will return null. When is this if going to be valid? imageNum will be teh same than cache.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 222 (patched)
<https://reviews.apache.org/r/68547/#comment293631>

    Just check if they're different. It is a simple case.  A cache is not more valid if the seq num is different (higher or lower, it does not matter). This logic is assumming HDFS will never request a lower value. We should not assume that.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 248-266 (patched)
<https://reviews.apache.org/r/68547/#comment293629>

    Is there a way to mock the cache objects instead of having new methods to get its internal value? Or mock the imageRetriever and verify it was called when a cache is not valid.


- Sergio Pena


On Oct. 5, 2018, 7:11 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2018, 7:11 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Oct. 5, 2018, 7:11 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 


Diff: https://reviews.apache.org/r/68547/diff/11/

Changes: https://reviews.apache.org/r/68547/diff/10-11/


Testing
-------


Thanks,

Arjun Mishra


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Oct. 5, 2018, 7:08 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 


Diff: https://reviews.apache.org/r/68547/diff/10/

Changes: https://reviews.apache.org/r/68547/diff/9-10/


Testing
-------


Thanks,

Arjun Mishra


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Oct. 2, 2018, 9:57 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageRetriever.java 11b75411d 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java 3532ef33d 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermImageRetriever.java 443434127 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java fb42b279a 


Diff: https://reviews.apache.org/r/68547/diff/9/

Changes: https://reviews.apache.org/r/68547/diff/8-9/


Testing
-------


Thanks,

Arjun Mishra


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Oct. 2, 2018, 9:47 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Changes
-------

Post Sergio's feedback


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageRetriever.java 11b75411d 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathImageRetriever.java 3532ef33d 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermImageRetriever.java 443434127 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestImageRetriever.java fb42b279a 


Diff: https://reviews.apache.org/r/68547/diff/8/

Changes: https://reviews.apache.org/r/68547/diff/7-8/


Testing
-------


Thanks,

Arjun Mishra


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Sept. 28, 2018, 7:56 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Changes
-------

Removed configuration and accommodated Kalyan's suggestions


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 


Diff: https://reviews.apache.org/r/68547/diff/7/

Changes: https://reviews.apache.org/r/68547/diff/6-7/


Testing
-------


Thanks,

Arjun Mishra


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by kalyan kumar kalvagadda via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209100
-----------------------------------------------------------


Fix it, then Ship it!




Fix it


sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 191-194 (patched)
<https://reviews.apache.org/r/68547/#comment293359>

    nit. There is no need for invalidateCache API in DBUpdateForwarder.


- kalyan kumar kalvagadda


On Sept. 27, 2018, 10:29 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 27, 2018, 10:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/6/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > What if NN does not request full images anymore? How can the cache be released from memory to avoid unused cache for the rest of the process life?
> 
> Arjun Mishra wrote:
>     Sergio, the cache is releases as soon as Deltas are being sent to NN
> 
> Sergio Pena wrote:
>     what if there are no Deltas available to send in the next NN requests? The cache will live on memory until then.
>     Look for Java soft or weak references (and the use of SoftReferences or WeakReferences). These objects might help on letting the GC to clean your cache if it is not needed anymore.

Deltas don't have to be sent they only have to be requested by NN. As soon as NN gets a full update, it will start requesting delta updates and just the trigger for requesting delta updates will invalidate cache
I will research soft references


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------


On Sept. 28, 2018, 7:56 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 28, 2018, 7:56 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/7/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Sergio Pena via Review Board <no...@reviews.apache.org>.

> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > What if NN does not request full images anymore? How can the cache be released from memory to avoid unused cache for the rest of the process life?
> 
> Arjun Mishra wrote:
>     Sergio, the cache is releases as soon as Deltas are being sent to NN

what if there are no Deltas available to send in the next NN requests? The cache will live on memory until then.
Look for Java soft or weak references (and the use of SoftReferences or WeakReferences). These objects might help on letting the GC to clean your cache if it is not needed anymore.


> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Line 158 (original), 168-172 (patched)
> > <https://reviews.apache.org/r/68547/diff/6/?file=2092470#file2092470line168>
> >
> >     I don't think this logic whether use a cache or not should be in this clas. The retriever should call the cache instead (or call the SentryStore if it needs to invalidate the current cache image).
> 
> Arjun Mishra wrote:
>     This is because the logic to invalidate cache is dependent on whether delta updates are being sent or not. Since the decisision to send delta updates is done in DBUpdateForwarder this logic sits in this piece of code

Do you need the delta number to invalidate metadata for a snapshot? The PathImageRetriever can keep a cache of the current snapshot and invalidate its cache if the next snapshot ID is different, otherwise return the full path image from the cache (read a SoftReference example on how help clean the cache if it is not used).


> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java
> > Lines 136-137 (patched)
> > <https://reviews.apache.org/r/68547/diff/6/?file=2092473#file2092473line136>
> >
> >     Why is the cache needed to be instantiated and passed as aparameter instead of instatiating the cache inside the PathImageRetriever and PermImageRetriever instead?
> >     
> >     A retriever returns the paths, but if they're cache, then it returns the ones from the cache.
> 
> Arjun Mishra wrote:
>     Sergio we haev 2 path classes PathImageRetriever, PathDeltaRetriever. Even though we need cache to be in PathImageRetriever, we need the cache to be visible to PathDeltaRetriever so it can be invalidated.

Is a cache needed for deltas as well? I think the cache should be handled internally on each retriever.


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------


On Sept. 28, 2018, 7:56 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 28, 2018, 7:56 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/7/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java
> > Lines 136-137 (patched)
> > <https://reviews.apache.org/r/68547/diff/6/?file=2092473#file2092473line136>
> >
> >     Why is the cache needed to be instantiated and passed as aparameter instead of instatiating the cache inside the PathImageRetriever and PermImageRetriever instead?
> >     
> >     A retriever returns the paths, but if they're cache, then it returns the ones from the cache.
> 
> Arjun Mishra wrote:
>     Sergio we haev 2 path classes PathImageRetriever, PathDeltaRetriever. Even though we need cache to be in PathImageRetriever, we need the cache to be visible to PathDeltaRetriever so it can be invalidated.
> 
> Sergio Pena wrote:
>     Is a cache needed for deltas as well? I think the cache should be handled internally on each retriever.

Yes but that is a different ticket


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------


On Oct. 5, 2018, 7:11 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2018, 7:11 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/11/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
> > Lines 53-55 (patched)
> > <https://reviews.apache.org/r/68547/diff/6/?file=2092469#file2092469line53>
> >
> >     Why is a configuration needed to whether invalidate the cache or not?
> >     
> >     Cache invalidation should be done by a certain condition that happens on the server. For instance, if the snapshot ID read from the DB is different from the cache, then it needs to invalidate the whole snapshot in the cache, or if the latest notification in the DB is newer than the cache, then it invalidates.
> >     
> >     I prefer not to add a new configuration for this.

Sure I can remove this configuration


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------


On Sept. 27, 2018, 10:29 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 27, 2018, 10:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/6/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Line 158 (original), 168-172 (patched)
> > <https://reviews.apache.org/r/68547/diff/6/?file=2092470#file2092470line168>
> >
> >     I don't think this logic whether use a cache or not should be in this clas. The retriever should call the cache instead (or call the SentryStore if it needs to invalidate the current cache image).
> 
> Arjun Mishra wrote:
>     This is because the logic to invalidate cache is dependent on whether delta updates are being sent or not. Since the decisision to send delta updates is done in DBUpdateForwarder this logic sits in this piece of code
> 
> Sergio Pena wrote:
>     Do you need the delta number to invalidate metadata for a snapshot? The PathImageRetriever can keep a cache of the current snapshot and invalidate its cache if the next snapshot ID is different, otherwise return the full path image from the cache (read a SoftReference example on how help clean the cache if it is not used).

Yes we need delta numbers. If the image id is the same but the sequence number requested is greater than cache sequence number by 1 we need to request a a full update for sequence numbers that don't exist in sentry db


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------


On Sept. 28, 2018, 7:56 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 28, 2018, 7:56 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/7/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > What if NN does not request full images anymore? How can the cache be released from memory to avoid unused cache for the rest of the process life?

Sergio, the cache is releases as soon as Deltas are being sent to NN


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------


On Sept. 27, 2018, 10:29 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 27, 2018, 10:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/6/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java
> > Lines 136-137 (patched)
> > <https://reviews.apache.org/r/68547/diff/6/?file=2092473#file2092473line136>
> >
> >     Why is the cache needed to be instantiated and passed as aparameter instead of instatiating the cache inside the PathImageRetriever and PermImageRetriever instead?
> >     
> >     A retriever returns the paths, but if they're cache, then it returns the ones from the cache.

Sergio we haev 2 path classes PathImageRetriever, PathDeltaRetriever. Even though we need cache to be in PathImageRetriever, we need the cache to be visible to PathDeltaRetriever so it can be invalidated.


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------


On Sept. 27, 2018, 10:29 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 27, 2018, 10:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/6/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Sept. 28, 2018, 4:56 p.m., Sergio Pena wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Line 158 (original), 168-172 (patched)
> > <https://reviews.apache.org/r/68547/diff/6/?file=2092470#file2092470line168>
> >
> >     I don't think this logic whether use a cache or not should be in this clas. The retriever should call the cache instead (or call the SentryStore if it needs to invalidate the current cache image).

This is because the logic to invalidate cache is dependent on whether delta updates are being sent or not. Since the decisision to send delta updates is done in DBUpdateForwarder this logic sits in this piece of code


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------


On Sept. 27, 2018, 10:29 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 27, 2018, 10:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/6/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Sergio Pena via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review209098
-----------------------------------------------------------



What if NN does not request full images anymore? How can the cache be released from memory to avoid unused cache for the rest of the process life?


sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java
Lines 53-55 (patched)
<https://reviews.apache.org/r/68547/#comment293355>

    Why is a configuration needed to whether invalidate the cache or not?
    
    Cache invalidation should be done by a certain condition that happens on the server. For instance, if the snapshot ID read from the DB is different from the cache, then it needs to invalidate the whole snapshot in the cache, or if the latest notification in the DB is newer than the cache, then it invalidates.
    
    I prefer not to add a new configuration for this.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Line 158 (original), 168-172 (patched)
<https://reviews.apache.org/r/68547/#comment293360>

    I don't think this logic whether use a cache or not should be in this clas. The retriever should call the cache instead (or call the SentryStore if it needs to invalidate the current cache image).



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java
Lines 136-137 (patched)
<https://reviews.apache.org/r/68547/#comment293358>

    Why is the cache needed to be instantiated and passed as aparameter instead of instatiating the cache inside the PathImageRetriever and PermImageRetriever instead?
    
    A retriever returns the paths, but if they're cache, then it returns the ones from the cache.


- Sergio Pena


On Sept. 27, 2018, 10:29 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 27, 2018, 10:29 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/6/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Sept. 27, 2018, 10:29 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Changes
-------

Post feedback


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 


Diff: https://reviews.apache.org/r/68547/diff/6/

Changes: https://reviews.apache.org/r/68547/diff/5-6/


Testing
-------


Thanks,

Arjun Mishra


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by kalyan kumar kalvagadda via Review Board <no...@reviews.apache.org>.

> On Sept. 12, 2018, 8:18 p.m., Na Li wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
> > Lines 137 (patched)
> > <https://reviews.apache.org/r/68547/diff/5/?file=2088592#file2088592line137>
> >
> >     This condition can be changed to
> >     if(invalidateCacheOnSendingDeltas && imageCache.hasCache())
> >     
> >     as long as imageCache has cache, we should invalidate it. It has nothing to do with if the cache is current or not.

+1 on lina's comment.

This logic can be moved above. As soon as we know that fullsnapshot need not be sent, cache can be invalidated, right.


- kalyan kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review208573
-----------------------------------------------------------


On Sept. 12, 2018, 6:41 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 12, 2018, 6:41 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/5/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.

> On Sept. 12, 2018, 8:18 p.m., Na Li wrote:
> > sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java
> > Lines 56 (patched)
> > <https://reviews.apache.org/r/68547/diff/5/?file=2088593#file2088593line56>
> >
> >     this should be
> >     if(pathsUpdateImageCache.getSeqNum() < sequenceNum)
> >     
> >     When system starts, NN will ask with sequenceNum = 0, and full snapshot will have pathsUpdateImageCache.getSeqNum() >= 0. 
> >     
> >     Have you tested that the cache mechanism does work as expected? Will isImageCurrent() always return false even when cache is up-to-date?

When NN restarts, its best to get a fresh full update since new notifications may have been executed.


- Arjun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review208573
-----------------------------------------------------------


On Sept. 12, 2018, 6:41 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 12, 2018, 6:41 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/5/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Na Li via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/#review208573
-----------------------------------------------------------




sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java
Lines 137 (patched)
<https://reviews.apache.org/r/68547/#comment292628>

    This condition can be changed to
    if(invalidateCacheOnSendingDeltas && imageCache.hasCache())
    
    as long as imageCache has cache, we should invalidate it. It has nothing to do with if the cache is current or not.



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java
Lines 24 (patched)
<https://reviews.apache.org/r/68547/#comment292624>

    this should be
    
    PathsUpdateImageCache implements ImageCache<PathsUpdate>



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java
Lines 56 (patched)
<https://reviews.apache.org/r/68547/#comment292625>

    this should be
    if(pathsUpdateImageCache.getSeqNum() < sequenceNum)
    
    When system starts, NN will ask with sequenceNum = 0, and full snapshot will have pathsUpdateImageCache.getSeqNum() >= 0. 
    
    Have you tested that the cache mechanism does work as expected? Will isImageCurrent() always return false even when cache is up-to-date?



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java
Lines 24 (patched)
<https://reviews.apache.org/r/68547/#comment292626>

    same problem



sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java
Lines 50 (patched)
<https://reviews.apache.org/r/68547/#comment292627>

    same issue


- Na Li


On Sept. 12, 2018, 6:41 p.m., Arjun Mishra wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68547/
> -----------------------------------------------------------
> 
> (Updated Sept. 12, 2018, 6:41 p.m.)
> 
> 
> Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.
> 
> 
> Bugs: SENTRY-2370
>     https://issues.apache.org/jira/browse/SENTRY-2370
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN
> 
> 
> Diffs
> -----
> 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
>   sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
>   sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 
> 
> 
> Diff: https://reviews.apache.org/r/68547/diff/5/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Arjun Mishra
> 
>


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Sept. 12, 2018, 6:41 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 


Diff: https://reviews.apache.org/r/68547/diff/5/

Changes: https://reviews.apache.org/r/68547/diff/4-5/


Testing
-------


Thanks,

Arjun Mishra


Re: Review Request 68547: SENTRY-2370: Create a cache of PathUpdates to send to NN

Posted by Arjun Mishra via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68547/
-----------------------------------------------------------

(Updated Sept. 11, 2018, 6:04 p.m.)


Review request for sentry, kalyan kumar kalvagadda, Na Li, and Sergio Pena.


Bugs: SENTRY-2370
    https://issues.apache.org/jira/browse/SENTRY-2370


Repository: sentry


Description
-------

When NN requests path updates from sentry and if it exceeds the time threshold, on consecutive attempts sentry will attempt to fetch the full update from scratch. Instead it should cache it and update the cache before sending it to NN


Diffs (updated)
-----

  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-common/src/main/java/org/apache/sentry/hdfs/ServiceConstants.java 2d21411e2 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/DBUpdateForwarder.java 08b16a4df 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PathsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/PermissionsUpdateImageCache.java PRE-CREATION 
  sentry-hdfs/sentry-hdfs-service/src/main/java/org/apache/sentry/hdfs/SentryPlugin.java b8f5ce7db 
  sentry-hdfs/sentry-hdfs-service/src/test/java/org/apache/sentry/hdfs/TestDBUpdateForwarder.java f86ce6f83 


Diff: https://reviews.apache.org/r/68547/diff/4/

Changes: https://reviews.apache.org/r/68547/diff/3-4/


Testing
-------


Thanks,

Arjun Mishra