Posted to yarn-issues@hadoop.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2018/05/11 03:17:00 UTC

[jira] [Commented] (YARN-7003) DRAINING state of queues is not recovered after RM restart

    [ https://issues.apache.org/jira/browse/YARN-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471434#comment-16471434 ] 

Hudson commented on YARN-7003:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14169 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14169/])
YARN-7003. DRAINING state of queues is not recovered after RM restart. (wwei: rev 9db9cd95bd0348070a286e69e7965c03c9bd39d6)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueState.java


> DRAINING state of queues is not recovered after RM restart
> ----------------------------------------------------------
>
>                 Key: YARN-7003
>                 URL: https://issues.apache.org/jira/browse/YARN-7003
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.9.0, 3.0.0-alpha4
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>             Fix For: 3.2.0, 3.1.1, 3.0.3
>
>         Attachments: YARN-7003.001.patch, YARN-7003.002.patch, YARN-7003.003.patch, YARN-7003.004.patch
>
>
> DRAINING is a temporary state that is held only in RM memory. When a queue's state is set to STOPPED while the queue still has pending or active apps, the queue state is changed to DRAINING instead of STOPPED after the queues are refreshed. We have found that after an RM restart the state of such a queue is always STOPPED, so the queue can be removed at any time, leaving some apps in a non-existent queue.
> To fix this problem, we could recover the DRAINING state while recovering pending/active apps. I will upload a patch with a test case for review later.
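
The proposed fix can be illustrated with a small, self-contained Java sketch. This is not the code from the committed patch: QueueDrainSketch, SketchQueue, recoverApp, finishApp and canBeRemoved are hypothetical names invented here, and the model assumes that a DRAINING queue becomes STOPPED once its last app finishes. It only shows the idea of moving a queue that is configured as STOPPED back to DRAINING when an app is recovered into it after a restart.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the idea in the description.
// It does not reproduce AbstractCSQueue or CapacityScheduler.
class QueueDrainSketch {

  enum QueueState { RUNNING, DRAINING, STOPPED }

  static class SketchQueue {
    private final String name;
    private QueueState state;
    private final List<String> apps = new ArrayList<>();

    // After a restart only the configured state is known; DRAINING is not persisted.
    SketchQueue(String name, QueueState configuredState) {
      this.name = name;
      this.state = configuredState;
    }

    // Called for each pending/active app replayed during recovery.
    // A STOPPED queue that still owns apps is put back into DRAINING.
    void recoverApp(String appId) {
      apps.add(appId);
      if (state == QueueState.STOPPED) {
        state = QueueState.DRAINING;
      }
    }

    // Assumption of this sketch: once the last app of a DRAINING queue
    // finishes, the queue becomes STOPPED.
    void finishApp(String appId) {
      apps.remove(appId);
      if (state == QueueState.DRAINING && apps.isEmpty()) {
        state = QueueState.STOPPED;
      }
    }

    // Only a STOPPED queue without apps is treated as removable here.
    boolean canBeRemoved() {
      return state == QueueState.STOPPED && apps.isEmpty();
    }

    QueueState getState() {
      return state;
    }

    String getName() {
      return name;
    }
  }

  public static void main(String[] args) {
    // The queue was configured as STOPPED before the restart.
    SketchQueue queue = new SketchQueue("root.a", QueueState.STOPPED);

    // Recovery replays an app that was still active in the queue.
    queue.recoverApp("application_0001");
    System.out.println(queue.getName() + " after recovery: " + queue.getState()
        + ", removable=" + queue.canBeRemoved());

    // Once the app finishes, the queue is stopped and may be removed.
    queue.finishApp("application_0001");
    System.out.println(queue.getName() + " after app finished: " + queue.getState()
        + ", removable=" + queue.canBeRemoved());
  }
}
```

Without the state change in recoverApp, the queue in this sketch would stay STOPPED while still holding an app, which corresponds to the situation reported in the issue.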



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)