Posted to yarn-issues@hadoop.apache.org by "Wangda Tan (JIRA)" <ji...@apache.org> on 2018/06/01 21:01:00 UTC

[jira] [Updated] (YARN-7962) Race Condition When Stopping DelegationTokenRenewer causes RM crash during failover

     [ https://issues.apache.org/jira/browse/YARN-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-7962:
-----------------------------
    Summary: Race Condition When Stopping DelegationTokenRenewer causes RM crash during failover  (was: Race Condition When Stopping DelegationTokenRenewer)

> Race Condition When Stopping DelegationTokenRenewer causes RM crash during failover
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-7962
>                 URL: https://issues.apache.org/jira/browse/YARN-7962
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.0.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Critical
>         Attachments: YARN-7962.1.patch, YARN-7962.2.patch, YARN-7962.3.patch, YARN-7962.4.patch, YARN-7962.6.patch, YARN-7962.7.patch
>
>
> [https://github.com/apache/hadoop/blob/69fa81679f59378fd19a2c65db8019393d7c05a2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java]
> {code:java}
>   private ThreadPoolExecutor renewerService;
>   private void processDelegationTokenRenewerEvent(
>       DelegationTokenRenewerEvent evt) {
>     serviceStateLock.readLock().lock();
>     try {
>       if (isServiceStarted) {
>         renewerService.execute(new DelegationTokenRenewerRunnable(evt));
>       } else {
>         pendingEventQueue.add(evt);
>       }
>     } finally {
>       serviceStateLock.readLock().unlock();
>     }
>   }
>   @Override
>   protected void serviceStop() {
>     if (renewalTimer != null) {
>       renewalTimer.cancel();
>     }
>     appTokens.clear();
>     allTokens.clear();
>     this.renewerService.shutdown();
> {code}
> {code:java}
> 2018-02-21 11:18:16,253  FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable@39bddaf2 rejected from java.util.concurrent.ThreadPoolExecutor@5f71637b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 15487]
> 	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
> 	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.processDelegationTokenRenewerEvent(DelegationTokenRenewer.java:196)
> 	at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.applicationFinished(DelegationTokenRenewer.java:734)
> 	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.finishApplication(RMAppManager.java:199)
> 	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:424)
> 	at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:65)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:177)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> The likely cause is that {{serviceStop}} never sets the {{isServiceStarted}} flag to _false_. As a result, {{processDelegationTokenRenewerEvent}} can still call {{renewerService.execute}} after the thread pool has been shut down, which throws the {{RejectedExecutionException}} shown above.
> Please update {{serviceStop}} so that it acquires the {{serviceStateLock}} and sets {{isServiceStarted}} to _false_ before shutting down the {{renewerService}} thread pool, to avoid this condition.
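
The requested change can be sketched roughly as follows. This is a minimal, self-contained illustration, not the Hadoop class and not any of the attached patches: {{RenewerSketch}}, {{processEvent}}, {{pendingCount}} and {{awaitIdle}} are hypothetical names, events are modeled as plain {{Runnable}} instances, and taking the write lock (rather than some other locking scheme) is an assumption based on the read lock used in {{processDelegationTokenRenewerEvent}}.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for DelegationTokenRenewer; only the locking pattern matters.
class RenewerSketch {
  private final ReentrantReadWriteLock serviceStateLock = new ReentrantReadWriteLock();
  private final LinkedBlockingQueue<Runnable> pendingEventQueue = new LinkedBlockingQueue<>();
  private final ExecutorService renewerService = Executors.newFixedThreadPool(2);
  private boolean isServiceStarted = false; // guarded by serviceStateLock

  void serviceStart() {
    serviceStateLock.writeLock().lock();
    try {
      isServiceStarted = true;
    } finally {
      serviceStateLock.writeLock().unlock();
    }
  }

  // Same shape as processDelegationTokenRenewerEvent in the issue description.
  void processEvent(Runnable evt) {
    serviceStateLock.readLock().lock();
    try {
      if (isServiceStarted) {
        renewerService.execute(evt);
      } else {
        pendingEventQueue.add(evt);
      }
    } finally {
      serviceStateLock.readLock().unlock();
    }
  }

  // Proposed ordering: flip the flag under the write lock, then shut the pool down.
  // The write lock waits for in-flight readers, so no caller can observe
  // isServiceStarted == true and then reach an already terminated pool.
  void serviceStop() {
    serviceStateLock.writeLock().lock();
    try {
      isServiceStarted = false;
      renewerService.shutdown();
    } finally {
      serviceStateLock.writeLock().unlock();
    }
  }

  int pendingCount() {
    return pendingEventQueue.size();
  }

  // Waits for already submitted tasks to finish; returns false on timeout or interrupt.
  boolean awaitIdle() {
    try {
      return renewerService.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

With this ordering, an event that arrives after {{serviceStop}} lands in the pending queue instead of being rejected by the executor. Whether such late events should be queued or simply dropped is a separate design decision that this sketch does not settle.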



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
