Posted to issues@sentry.apache.org by "Na Li (JIRA)" <ji...@apache.org> on 2018/04/17 22:05:00 UTC

[jira] [Commented] (SENTRY-2203) Leader Lock is not released when Sentry service shuts down

    [ https://issues.apache.org/jira/browse/SENTRY-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441571#comment-16441571 ] 

Na Li commented on SENTRY-2203:
-------------------------------

The log message below shows the call stack at the point where releasing the leader lock failed. It also shows that the release failed because CuratorFrameworkImpl was not in the STARTED state.
{code}
2018-04-08 04:34:31,760 INFO sentry.org.apache.curator.framework.imps.CuratorFrameworkImpl: backgroundOperationsLoop exiting                       <-- CuratorFrameworkImpl is closed
2018-04-08 04:34:31,762 INFO org.apache.sentry.provider.db.service.persistent.LeaderStatusMonitor: LeaderStatusMonitor: interrupted
2018-04-08 04:34:31,762 INFO org.apache.sentry.service.thrift.SentryService: Attempting to stop sentry thrift service...
2018-04-08 04:34:31,762 INFO org.apache.sentry.provider.db.service.persistent.LeaderStatusMonitor: LeaderStatusMonitor: becoming standby
2018-04-08 04:34:31,762 INFO org.apache.sentry.service.thrift.SentryService: Attempting to stop sentry web service...
2018-04-08 04:34:31,762 ERROR sentry.org.apache.curator.framework.recipes.leader.LeaderSelector: The leader threw an exception
java.lang.IllegalStateException: instance must be started before calling this method
	at com.google.common.base.Preconditions.checkState(Preconditions.java:145)                                                                                      <-- CuratorFrameworkImpl is not in started state
	at sentry.org.apache.curator.framework.imps.CuratorFrameworkImpl.delete(CuratorFrameworkImpl.java:359)
	at sentry.org.apache.curator.framework.recipes.locks.LockInternals.deleteOurPath(LockInternals.java:339)
	at sentry.org.apache.curator.framework.recipes.locks.LockInternals.releaseLock(LockInternals.java:123)
	at sentry.org.apache.curator.framework.recipes.locks.InterProcessMutex.release(InterProcessMutex.java:154)
	at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:427)
	at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:444)
	at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:64)
	at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:245)
	at sentry.org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:239)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}
CuratorFrameworkImpl code that enforces the started-state check:
{code}
  public DeleteBuilder delete() {
    Preconditions.checkState(this.getState() == CuratorFrameworkState.STARTED, "instance must be started before calling this method");
    return new DeleteBuilderImpl(this);
  }
{code}
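
The effect of that precondition can be reproduced in isolation. The following is a minimal sketch, not Curator code: ToyClient, tryDelete and the path /toy/lock are hypothetical names invented for illustration, and only the state check is modeled (the LATENT, STARTED and STOPPED names follow CuratorFrameworkState).

```java
import java.util.concurrent.atomic.AtomicReference;

/** Toy model (not Curator code) of a client whose operations require the STARTED state. */
final class StateGuardSketch {
    enum State { LATENT, STARTED, STOPPED }

    /** Hypothetical stand-in for the client; only the state check is modeled. */
    static final class ToyClient {
        private final AtomicReference<State> state = new AtomicReference<>(State.LATENT);

        void start() { state.set(State.STARTED); }
        void close() { state.set(State.STOPPED); }

        /** Mirrors the precondition in delete(): refuse to work unless STARTED. */
        void delete(String path) {
            if (state.get() != State.STARTED) {
                throw new IllegalStateException("instance must be started before calling this method");
            }
            // A real client would send the delete request to ZooKeeper here.
        }
    }

    /** Returns true if delete() succeeded, false if the state check rejected it. */
    static boolean tryDelete(ToyClient client, String path) {
        try {
            client.delete(path);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ToyClient client = new ToyClient();
        client.start();
        System.out.println("delete while STARTED succeeded: " + tryDelete(client, "/toy/lock"));
        client.close();
        System.out.println("delete after close succeeded: " + tryDelete(client, "/toy/lock"));
    }
}
```

In this model, any delete attempted after close() is rejected, which corresponds to the IllegalStateException raised from LockInternals.deleteOurPath in the stack trace above.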

> Leader Lock is not released when Sentry service shuts down
> ----------------------------------------------------------
>
>                 Key: SENTRY-2203
>                 URL: https://issues.apache.org/jira/browse/SENTRY-2203
>             Project: Sentry
>          Issue Type: Bug
>          Components: Sentry
>    Affects Versions: 2.1.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Critical
>         Attachments: SENTRY-2203.001.patch
>
>
> In our testing of Sentry HA, we found that after restarting the Sentry service without restarting the ZooKeeper service, it is possible that none of the Sentry servers is elected as leader to sync with HMS.
> What happened was:
> 1) When a leader is elected, that Sentry server host holds the leader lock. The lock is identified by the mutexPath. All Sentry servers in a cluster use the same mutexPath.
> 2) When the Sentry service is shut down, the HAContext is shut down, so the CuratorFrameworkImpl it contains is shut down as well, but the leader lock is still held by the Sentry server host.
> 3) When the interrupt signal from the shutdown interrupts the leader election thread, releasing the leader lock fails because CuratorFrameworkImpl is no longer in the STARTED state.
> 4) When the Sentry server restarts, acquiring the leader lock fails because the lock was never released, so no active Sentry server becomes leader.
> 5) If the leader lock is released before CuratorFrameworkImpl is shut down, this issue does not happen. It also does not happen if ZooKeeper is restarted after the Sentry service restart.
> To fix this issue:
> The Sentry LeaderStatusMonitor can deactivate the leader to release the leader lock when it is closed, which guarantees that the leader lock is released before CuratorFrameworkImpl is shut down.
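
The ordering problem and the proposed fix described above can be sketched with a small self-contained model. This is not Sentry or Curator code and not the attached patch: ToyClient, ToyLeaderMonitor, lockLeakedAfterShutdown and the path /toy/leader-lock are hypothetical names. The model also assumes, as the report describes, that the lock node outlives the closed client.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model (not Sentry or Curator code) of shutdown ordering for a leader lock. */
final class ShutdownOrderSketch {

    /** Hypothetical client: operations are rejected once it has been closed. */
    static final class ToyClient {
        private final Set<String> remoteNodes; // stands in for state kept by ZooKeeper
        private boolean started = true;

        ToyClient(Set<String> remoteNodes) { this.remoteNodes = remoteNodes; }

        void create(String path) { checkStarted(); remoteNodes.add(path); }
        void delete(String path) { checkStarted(); remoteNodes.remove(path); }
        void close() { started = false; }

        private void checkStarted() {
            if (!started) {
                throw new IllegalStateException("instance must be started before calling this method");
            }
        }
    }

    /** Hypothetical monitor that takes the leader lock and releases it on close(). */
    static final class ToyLeaderMonitor implements AutoCloseable {
        private final ToyClient client;
        private final String mutexPath;

        ToyLeaderMonitor(ToyClient client, String mutexPath) {
            this.client = client;
            this.mutexPath = mutexPath;
            client.create(mutexPath); // become leader
        }

        @Override
        public void close() { client.delete(mutexPath); } // release the leader lock
    }

    /** Runs one shutdown and reports whether the lock node was left behind. */
    static boolean lockLeakedAfterShutdown(boolean releaseLockFirst) {
        Set<String> remoteNodes = new HashSet<>();
        ToyClient client = new ToyClient(remoteNodes);
        ToyLeaderMonitor monitor = new ToyLeaderMonitor(client, "/toy/leader-lock");
        if (releaseLockFirst) {
            monitor.close(); // proposed order: release the lock, then close the client
            client.close();
        } else {
            client.close();  // problematic order: the client is closed before the release
            try {
                monitor.close();
            } catch (IllegalStateException e) {
                // The release was rejected because the client is already closed.
            }
        }
        return remoteNodes.contains("/toy/leader-lock");
    }

    public static void main(String[] args) {
        System.out.println("client closed first, lock leaked: " + lockLeakedAfterShutdown(false));
        System.out.println("lock released first, lock leaked: " + lockLeakedAfterShutdown(true));
    }
}
```

In this model the lock node is left behind only when the client is closed before the monitor, which is why the proposed change has the monitor release the lock as part of its own close.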


