Posted to issues@sentry.apache.org by "kalyan kumar kalvagadda (JIRA)" <ji...@apache.org> on 2018/02/05 14:45:00 UTC

[jira] [Comment Edited] (SENTRY-2115) Incorrect behavior of HMsFollower when HDFSSync feature is disabled.

    [ https://issues.apache.org/jira/browse/SENTRY-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322610#comment-16322610 ] 

kalyan kumar kalvagadda edited comment on SENTRY-2115 at 2/5/18 2:44 PM:
-------------------------------------------------------------------------

There are a couple of ways to solve this:
 *Approach-1:* 
 * Clear the MAuthzPathsMapping information when HDFS-SYNC is disabled.
 * Reset MSentryHmsNotification and start processing all the notifications from the NOTIFICATION_LOG table when an out-of-sync situation is detected.

*Approach-2:*
 * Continue to update MAuthzPathsMapping and MAuthzPathsSnapshotId even when HDFS-SYNC is disabled. That way, when the feature is enabled, the sentry-namenode plug-in gets the latest snapshot. The downside of this approach is that Sentry would spend CPU cycles and memory processing and storing the path changes in MSentryPathChange and MAuthzPathsMapping.
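
To make the trade-off between the two approaches concrete, here is a minimal in-memory sketch. It is only an illustration under stated assumptions: `PathStateSketch`, its `Approach` enum, and the method names are hypothetical and are not Sentry's actual classes or API; a plain map stands in for MAuthzPathsMapping.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in Sentry itself.
final class PathStateSketch {
    enum Approach { CLEAR_ON_DISABLE, KEEP_UPDATING }

    private final Approach approach;
    // Stand-in for MAuthzPathsMapping: HMS object name -> HDFS path.
    private final Map<String, String> pathsMapping = new HashMap<>();
    private boolean hdfsSyncEnabled = true;

    PathStateSketch(Approach approach) {
        this.approach = approach;
    }

    void setHdfsSyncEnabled(boolean enabled) {
        hdfsSyncEnabled = enabled;
        // Approach-1: drop the mapping as soon as the feature is disabled,
        // so no stale data survives until it is enabled again.
        if (!enabled && approach == Approach.CLEAR_ON_DISABLE) {
            pathsMapping.clear();
        }
    }

    void onPathEvent(String hmsObject, String path) {
        // Approach-2: keep recording path changes even while disabled,
        // at the cost of extra processing and storage.
        if (hdfsSyncEnabled || approach == Approach.KEEP_UPDATING) {
            pathsMapping.put(hmsObject, path);
        }
    }

    int mappingSize() {
        return pathsMapping.size();
    }
}
```

With Approach-1 the mapping is empty while the feature is off and has to be rebuilt later; with Approach-2 it stays current but keeps growing while nobody consumes it.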


was (Author: kkalyan):
There are a couple of ways to solve this:
1. By clearing MAuthzPathsMapping and MAuthzPathsSnapshotId information when HDFS-SYNC is disabled.
2. Continue to update MAuthzPathsMapping and MAuthzPathsSnapshotId even when HDFS-SYNC is disabled. That way, when the feature is enabled, the sentry-namenode plug-in gets the latest snapshot. The downside of this approach is that Sentry would spend CPU cycles and memory processing and storing the path changes in MSentryPathChange and MAuthzPathsMapping.

>  Incorrect behavior of HMsFollower when HDFSSync feature is disabled.
> ---------------------------------------------------------------------
>
>                 Key: SENTRY-2115
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2115
>             Project: Sentry
>          Issue Type: Bug
>            Reporter: kalyan kumar kalvagadda
>            Assignee: kalyan kumar kalvagadda
>            Priority: Major
>
> *Current Behavior:*
> *Scenario-1:* When HDFS sync is disabled and Sentry is started for the first time.
>  * Sentry would take a full snapshot of HMS and just persist the current HMS notification-id in SENTRY_HMS_NOTIFICATION_ID. {color:#FF0000}This is wrong{color}
> *Scenario-2:* When HDFS sync is disabled and the current event-id from HMS is less than the last event-id processed by Sentry.
>  * Sentry would take a full snapshot of HMS and just persist the current HMS notification-id in SENTRY_HMS_NOTIFICATION_ID. {color:#FF0000}This is wrong{color}
> *Scenario-3:* When HDFS sync is disabled and the first event-id in the subsequent pull is not exactly one greater than the last event-id processed by Sentry.
>  * Sentry would take a full snapshot of HMS and just persist the current HMS notification-id in SENTRY_HMS_NOTIFICATION_ID. {color:#FF0000}This is wrong{color}
> *Scenario-4:* HDFS sync was initially enabled, was later disabled for a while, and was then enabled again, with the Sentry service restarted for the change to take effect.
>  * While HDFS sync is disabled, HMSFollower would update the SENTRY_HMS_NOTIFICATION_ID table but not the MSentryPathChange table.
> When HDFS sync is enabled again, HMSFollower would continue fetching and processing new notifications and updating the MSentryPathChange and MAuthzPathsMapping info. This is not correct, as Sentry would not take a snapshot and would not have any path-mapping information about existing HMS objects. As a result, HDFS ACLs will not be added for the existing HMS objects. {color:#FF0000}This is wrong.{color}
> *Correct Behavior:*
>  * Full snapshots need not be taken in any of Scenario-1, Scenario-2 and Scenario-3.
>  * When Sentry detects an out-of-sync situation, it should reset the SENTRY_HMS_NOTIFICATION_ID table and start processing the events in HMS_NOTIFICATION_LOG from the beginning.
>  * To handle the situation described in *Scenario-4,* Sentry should reset the mapping information whenever HDFS sync is disabled. That way it can learn from scratch when the feature is enabled again. There is no value in holding stale data when we know it will cause issues once the feature is re-enabled.
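
The out-of-sync conditions described in Scenario-2 and Scenario-3 can be sketched as a small check. This is a hedged illustration, not Sentry's implementation: the class `SyncCheck`, the method `isOutOfSync`, and the convention of passing -1 for an empty batch are all assumptions made for this example.

```java
// Hypothetical helper; the names and the -1 convention are illustrative only.
final class SyncCheck {
    private SyncCheck() {}

    /**
     * @param lastProcessedId last event-id persisted by Sentry
     * @param hmsCurrentId    current notification-id reported by HMS
     * @param firstFetchedId  first event-id in the newly pulled batch,
     *                        or -1 if the batch is empty (assumed convention)
     */
    static boolean isOutOfSync(long lastProcessedId, long hmsCurrentId, long firstFetchedId) {
        // Scenario-2: HMS reports an id older than what Sentry already processed.
        if (hmsCurrentId < lastProcessedId) {
            return true;
        }
        // Scenario-3: the next batch does not start exactly one past the last processed id.
        if (firstFetchedId >= 0 && firstFetchedId != lastProcessedId + 1) {
            return true;
        }
        return false;
    }
}
```

Under the correct behavior proposed above, a `true` result with HDFS sync disabled would lead to resetting the stored notification-id and reprocessing events from the beginning, rather than taking a full snapshot.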



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)