You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sentry.apache.org by Sergio Pena <se...@cloudera.com> on 2017/07/17 22:10:55 UTC

Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/
-----------------------------------------------------------

Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.


Bugs: sentry-1760
    https://issues.apache.org/jira/browse/sentry-1760


Repository: sentry


Description
-------

The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:

* List of notifications received are different than expected. 
  This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
  and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
* Latest Sentry notification ID processed is bigger than current HMS notification ID.
  This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.

New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.


Diffs
-----

  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HmsFollower.java a9d05b1f24823c79ab6d37a07b60a7e3b30f2b3c 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHmsClient.java 29a85d7181b52937e16028baf71ee5528bbd48d1 
  sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHmsFollower.java 2095469ae16ad918347a31d76bc5f22e12b8f4c3 


Diff: https://reviews.apache.org/r/60926/diff/1/


Testing
-------

Added test cases on TestHmsFollower


Thanks,

Sergio Pena


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Brian Towles <bt...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180850
-----------------------------------------------------------


Ship it!




Other then LiNas stuff ship it

- Brian Towles


On July 18, 2017, 10:20 a.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 18, 2017, 10:20 a.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/2/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Na Li <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180856
-----------------------------------------------------------




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 168 (patched)
<https://reviews.apache.org/r/60926/#comment256158>

    You can add check to see if the current notificationID is equal to lastProcessedNotificationId. if so, do nothing.
    
    long currentNotificationId = client.getCurrentNotificationId();
    
    if (currentNotificationId == notificationId) {
          return;
        }


- Na Li


On July 18, 2017, 7:31 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 18, 2017, 7:31 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/3/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Alexander Kolbasov <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review181048
-----------------------------------------------------------


Ship it!




Ship It!

- Alexander Kolbasov


On July 19, 2017, 8:13 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 19, 2017, 8:13 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/5/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Sergio Pena <se...@cloudera.com>.

> On July 19, 2017, 9:27 p.m., kalyan kumar kalvagadda wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 169 (patched)
> > <https://reviews.apache.org/r/60926/diff/5/?file=1780058#file1780058line173>
> >
> >     Currently, HMSFollower looks for the latest notifications and processes them after persisting snapshot. 
> >     
> >     You have changed it. Why dod you want to change that behavior?

To make the code cleaner. Btw, why do we have to get notifications if a snapshot is already done? Why not checking that in the next cycle?

There was a comment from Sasha to call createFullSnapshot() immediatly instead of waiting for the next cycle, and I would have had to add 2 more conditions to check if createFullSnapshot() returns EMPTY or not, and continue to next notifications or not. I prefer to keep this simple buy doing one thing on each cycle.


> On July 19, 2017, 9:27 p.m., kalyan kumar kalvagadda wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 193-214 (patched)
> > <https://reviews.apache.org/r/60926/diff/5/?file=1780058#file1780058line197>
> >
> >     With your change, full snapshot is taken again in below situations.
> >     
> >     1) The current notification Id on the HMS is less than the latest processed by Sentry.
> >     2) If the HMS and Sentry processed notifications are out-of-sync.
> >     
> >     In both these situations, I do not see a reason to retain old snapshot. Why don't we delete it before persisting new snapshot?

That's a good question to Sasha.


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180972
-----------------------------------------------------------


On July 19, 2017, 8:13 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 19, 2017, 8:13 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/5/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Sergio Pena <se...@cloudera.com>.

> On July 19, 2017, 9:27 p.m., kalyan kumar kalvagadda wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 193-214 (patched)
> > <https://reviews.apache.org/r/60926/diff/5/?file=1780058#file1780058line197>
> >
> >     With your change, full snapshot is taken again in below situations.
> >     
> >     1) The current notification Id on the HMS is less than the latest processed by Sentry.
> >     2) If the HMS and Sentry processed notifications are out-of-sync.
> >     
> >     In both these situations, I do not see a reason to retain old snapshot. Why don't we delete it before persisting new snapshot?
> 
> Sergio Pena wrote:
>     That's a good question to Sasha.

[5:09 PM] Alexander Kolbasov: a) We don't need to do it immediately
[5:09 PM] Alexander Kolbasov: b) This may cause longer snapshot creation which is critical
[5:10 PM] Alexander Kolbasov: It is better to clean them up later

We can create a follow-up jira to have a clean-up thread.


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180972
-----------------------------------------------------------


On July 19, 2017, 8:13 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 19, 2017, 8:13 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/5/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by kalyan kumar kalvagadda <kk...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180972
-----------------------------------------------------------



Sergio, I might have some more comments.


sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 169 (patched)
<https://reviews.apache.org/r/60926/#comment256354>

    Currently, HMSFollower looks for the latest notifications and processes them after persisting snapshot. 
    
    You have changed it. Why dod you want to change that behavior?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 193-214 (patched)
<https://reviews.apache.org/r/60926/#comment256365>

    With your change, full snapshot is taken again in below situations.
    
    1) The current notification Id on the HMS is less than the latest processed by Sentry.
    2) If the HMS and Sentry processed notifications are out-of-sync.
    
    In both these situations, I do not see a reason to retain old snapshot. Why don't we delete it before persisting new snapshot?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 206-214 (patched)
<https://reviews.apache.org/r/60926/#comment256366>

    we have same check in getNotifications method in SentryHMSClient class. We need to remove one of them.



sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java
Lines 69 (patched)
<https://reviews.apache.org/r/60926/#comment256362>

    method name testPersistAFullSnapshotWhenNoSnapshotISProcessedYet is better.


- kalyan kumar kalvagadda


On July 19, 2017, 8:13 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 19, 2017, 8:13 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/5/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/
-----------------------------------------------------------

(Updated July 19, 2017, 8:13 p.m.)


Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.


Changes
-------

Addressed Sasha and Kalyan's changes

- Change the HMSFollower.run() method to either create a full snapshot or process notifications. This makes the code cleaner.
- Create a method that checks if a snapshot is required
- Create a method that checks if notifications are out-of-sync and a full snapshot is required
- Add more comments


Bugs: sentry-1760
    https://issues.apache.org/jira/browse/sentry-1760


Repository: sentry


Description
-------

The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:

* List of notifications received are different than expected. 
  This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
  and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
* Latest Sentry notification ID processed is bigger than current HMS notification ID.
  This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.

New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.


Diffs (updated)
-----

  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
  sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 


Diff: https://reviews.apache.org/r/60926/diff/5/

Changes: https://reviews.apache.org/r/60926/diff/4-5/


Testing
-------

Added test cases on TestHmsFollower


Thanks,

Sergio Pena


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Sergio Pena <se...@cloudera.com>.

> On July 19, 2017, 5:20 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 97 (patched)
> > <https://reviews.apache.org/r/60926/diff/4/?file=1778739#file1778739line97>
> >
> >     Can you achieve the same purpose by providing client in a constructor? This may be a bit cleaner.

The constructor parameters are increasing. We have currently 6, and now we will provide 2 more. I think is cleaner to have setters and avoid large # of method parameters, don't you think?


> On July 19, 2017, 5:20 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Line 152 (original), 159 (patched)
> > <https://reviews.apache.org/r/60926/diff/4/?file=1778739#file1778739line159>
> >
> >     What if we have an old snapshot so it isn't empty (and in fact we keep some number of old snapshots around) so this will only be the case the first time. How is the decision made afterwords?

I created a isFullSnapshotRequired() method that have 2 conditions to check for full snapshots. It has better explanation there.


> On July 19, 2017, 5:20 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 165 (patched)
> > <https://reviews.apache.org/r/60926/diff/4/?file=1778739#file1778739line165>
> >
> >     Why do we need to keep in in the class field?
> >     
> >     This is rather weird - we detect that we need a full snapshot, and instead of just getting it, record this piece of information and retrieve it during the next cycle. What is the point in doing it this way?

I did some changes on the run() method to make this cleaner. I think we should either create a snapshot or process notifications per run to reduce the conditions and make the code cleaner. If a snapshot is done, why should we try to fetch notifications again? Better to do either one per run I think.


> On July 19, 2017, 5:20 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 169 (patched)
> > <https://reviews.apache.org/r/60926/diff/4/?file=1778739#file1778739line169>
> >
> >     Suppose that we have two Sentry servers. The first one got the current notification ID, but later looses leadership and another one processes a few requests by the time we get here, so we can run into this situation. So we should be a bit more conservative and may be check a few things first:
> >     
> >     1) Check that we are still the leader
> >     2) Re-read currentNotificationId and make sure it didn't advanced forward.
> >     
> >     Once we detect the condition we should just get the snapshot rather then defer it to the next iteration.

What about checking for leadership at the moment of persisting the snapshot? Otherwise, I feel we'll have to have this check evertime we do something on the code making it ugly with too many leadershipt checks, and perhaps future developers won't know that we would need this check.


> On July 19, 2017, 5:20 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 185 (patched)
> > <https://reviews.apache.org/r/60926/diff/4/?file=1778739#file1778739line185>
> >
> >     Does HMS provide a guarantee that notifications have no gaps?

HMS events ID are incremented by the application and each notification is persisted in the same transaction as the DDL operation executed, so this should guarantee we have consecutive identifiers and no gaps happened. The gap I was talking about here is a gap when HMS clean-up older notifications if Sentry was shutdown for a long period of time.

Anyway, I redo this part, and made it clear to differentiate between gaps and out-of-sync situations.


> On July 19, 2017, 5:20 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 210 (patched)
> > <https://reviews.apache.org/r/60926/diff/4/?file=1778739#file1778739line210>
> >
> >     Please document this function.

This does not longer exists.


> On July 19, 2017, 5:20 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
> > Lines 217 (patched)
> > <https://reviews.apache.org/r/60926/diff/4/?file=1778739#file1778739line217>
> >
> >     This is a strange way of doing it. Why can't we check the first notification instead? Why are we checking the size instead? What is it that you are actually checking?
> >     
> >     What if HMS notifications contain duplicate IDs (and we know that this is possible) - this will mess up the logic.

Right. I didn't thinkg about it.
I did the change.


> On July 19, 2017, 5:20 p.m., Alexander Kolbasov wrote:
> > sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
> > Line 217 (original)
> > <https://reviews.apache.org/r/60926/diff/4/?file=1778740#file1778740line217>
> >
> >     Is this no longer relevant or the logic is moved elsewhere?

I did a change on the HMSfollower.run() to either get a full snapshot or process notifications. I left this code now because the check is not happening anymore. Btw, I did the run() change to make the code cleaner.


- Sergio


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180946
-----------------------------------------------------------


On July 18, 2017, 8:23 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 18, 2017, 8:23 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/4/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Alexander Kolbasov <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180946
-----------------------------------------------------------




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 97 (patched)
<https://reviews.apache.org/r/60926/#comment256303>

    Can you achieve the same purpose by providing client in a constructor? This may be a bit cleaner.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 149 (original), 156 (patched)
<https://reviews.apache.org/r/60926/#comment256304>

    It would be very good to have a complete description of the algorithm used to decide whether a full snapshot is needed - it may be better to put it as a block comment at the top of the file or even in the class javadoc.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 151 (original), 158 (patched)
<https://reviews.apache.org/r/60926/#comment256305>

    Do we always have at least one object? Can Hive have a valid empty path list or it is not possible?
    
    The second part about the gaps - please provide more ddetail - gaps between what and what?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Line 152 (original), 159 (patched)
<https://reviews.apache.org/r/60926/#comment256306>

    What if we have an old snapshot so it isn't empty (and in fact we keep some number of old snapshots around) so this will only be the case the first time. How is the decision made afterwords?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 165 (patched)
<https://reviews.apache.org/r/60926/#comment256307>

    Why do we need to keep in in the class field?
    
    This is rather weird - we detect that we need a full snapshot, and instead of just getting it, record this piece of information and retrieve it during the next cycle. What is the point in doing it this way?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 169 (patched)
<https://reviews.apache.org/r/60926/#comment256308>

    Suppose that we have two Sentry servers. The first one got the current notification ID, but later looses leadership and another one processes a few requests by the time we get here, so we can run into this situation. So we should be a bit more conservative and may be check a few things first:
    
    1) Check that we are still the leader
    2) Re-read currentNotificationId and make sure it didn't advanced forward.
    
    Once we detect the condition we should just get the snapshot rather then defer it to the next iteration.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 185 (patched)
<https://reviews.apache.org/r/60926/#comment256309>

    Does HMS provide a guarantee that notifications have no gaps?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 210 (patched)
<https://reviews.apache.org/r/60926/#comment256311>

    Please document this function.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 217 (patched)
<https://reviews.apache.org/r/60926/#comment256312>

    This is a strange way of doing it. Why can't we check the first notification instead? Why are we checking the size instead? What is it that you are actually checking?
    
    What if HMS notifications contain duplicate IDs (and we know that this is possible) - this will mess up the logic.



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
Line 217 (original)
<https://reviews.apache.org/r/60926/#comment256313>

    Is this no longer relevant or the logic is moved elsewhere?



sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java
Lines 239 (patched)
<https://reviews.apache.org/r/60926/#comment256314>

    can eventId be null here?


- Alexander Kolbasov


On July 18, 2017, 8:23 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 18, 2017, 8:23 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/4/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by kalyan kumar kalvagadda <kk...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180881
-----------------------------------------------------------



Can we consider a case where we get the notifications from HMS but there are some gaps in between. Let's say HMSFollower requests notifications from ID:100 and gets a response with a collection of notificaitons starting from ID:101 and the last entry in collection with ID:200.
It could possible that ID:150 and ID:151 are missing. I'm not sure if it is a common scenario OR a scenario which should never happen. For what ever reason it happens, we should log this and move on.

- kalyan kumar kalvagadda


On July 18, 2017, 8:23 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 18, 2017, 8:23 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/4/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/
-----------------------------------------------------------

(Updated July 18, 2017, 8:23 p.m.)


Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.


Changes
-------

- Fixed unit tests
- Addressed Lina's comments.


Bugs: sentry-1760
    https://issues.apache.org/jira/browse/sentry-1760


Repository: sentry


Description
-------

The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:

* List of notifications received are different than expected. 
  This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
  and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
* Latest Sentry notification ID processed is bigger than current HMS notification ID.
  This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.

New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.


Diffs (updated)
-----

  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
  sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 


Diff: https://reviews.apache.org/r/60926/diff/4/

Changes: https://reviews.apache.org/r/60926/diff/3-4/


Testing
-------

Added test cases on TestHmsFollower


Thanks,

Sergio Pena


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/
-----------------------------------------------------------

(Updated July 18, 2017, 7:31 p.m.)


Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.


Changes
-------

Addressed Lina's comments


Bugs: sentry-1760
    https://issues.apache.org/jira/browse/sentry-1760


Repository: sentry


Description
-------

The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:

* List of notifications received are different than expected. 
  This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
  and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
* Latest Sentry notification ID processed is bigger than current HMS notification ID.
  This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.

New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.


Diffs (updated)
-----

  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
  sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 


Diff: https://reviews.apache.org/r/60926/diff/3/

Changes: https://reviews.apache.org/r/60926/diff/2-3/


Testing
-------

Added test cases on TestHmsFollower


Thanks,

Sergio Pena


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Na Li <li...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/#review180845
-----------------------------------------------------------




sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java
Lines 168 (patched)
<https://reviews.apache.org/r/60926/#comment256142>

    client.getNotifications calls the same thing.
    
    "CurrentNotificationEventId eventId = client.getCurrentNotificationEventId();"
    
    Can you remove that check there? We don't need two places calling the same thing.


- Na Li


On July 18, 2017, 3:20 p.m., Sergio Pena wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60926/
> -----------------------------------------------------------
> 
> (Updated July 18, 2017, 3:20 p.m.)
> 
> 
> Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.
> 
> 
> Bugs: sentry-1760
>     https://issues.apache.org/jira/browse/sentry-1760
> 
> 
> Repository: sentry
> 
> 
> Description
> -------
> 
> The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:
> 
> * List of notifications received are different than expected. 
>   This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
>   and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
> * Latest Sentry notification ID processed is bigger than current HMS notification ID.
>   This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.
> 
> New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.
> 
> 
> Diffs
> -----
> 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
>   sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
>   sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 
> 
> 
> Diff: https://reviews.apache.org/r/60926/diff/2/
> 
> 
> Testing
> -------
> 
> Added test cases on TestHmsFollower
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>


Re: Review Request 60926: SENTRY-1760: HMSFollower should detect when a full snapshot from HMS is required

Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60926/
-----------------------------------------------------------

(Updated July 18, 2017, 3:20 p.m.)


Review request for sentry, Alexander Kolbasov, kalyan kumar kalvagadda, and Na Li.


Bugs: sentry-1760
    https://issues.apache.org/jira/browse/sentry-1760


Repository: sentry


Description
-------

The patch will set the 'requestHmsSnapshot' to TRUE whenever the following cases are found:

* List of notifications received are different than expected. 
  This may happen when Sentry has been down or HDFS sync was disabled for a while (more than 24h), 
  and the HMS cleared old notifications (older than 24h) not processed by Sentry causing a gap when retrieving notifications.
* Latest Sentry notification ID processed is bigger than current HMS notification ID.
  This may happen when the HMS DB data was reset or restore from an old snapshot causing sync issues with Sentry.

New snapshots are persisted with a new generation ID (or image number), so there's is no need to clean-up older snapshots.


Diffs (updated)
-----

  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/HMSFollower.java 083ee4c247f96d5c87b44b9785663a2783e6644d 
  sentry-provider/sentry-provider-db/src/main/java/org/apache/sentry/service/thrift/SentryHMSClient.java 05518e81f52965dc1ff102dcdd446010381b9a7a 
  sentry-provider/sentry-provider-db/src/test/java/org/apache/sentry/service/thrift/TestHMSFollower.java d67c16258c67aae997de4c0451c8b642ab05d298 


Diff: https://reviews.apache.org/r/60926/diff/2/

Changes: https://reviews.apache.org/r/60926/diff/1-2/


Testing
-------

Added test cases on TestHmsFollower


Thanks,

Sergio Pena