Posted to yarn-issues@hadoop.apache.org by "Eric Badger (JIRA)" <ji...@apache.org> on 2017/06/05 14:48:04 UTC

[jira] [Commented] (YARN-5333) Some recovered apps are put into default queue when RM HA

    [ https://issues.apache.org/jira/browse/YARN-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037054#comment-16037054 ] 

Eric Badger commented on YARN-5333:
-----------------------------------

[~sunilg], backporting this to 2.8 broke a unit test. It would also be nice if you could comment on the JIRA when you backport, so that it's obvious the backport happened at a different time than the original commit.
{noformat}
testTransitionedToActiveRefreshFail(org.apache.hadoop.yarn.server.resourcemanager.TestRMHA)  Time elapsed: 2.396 sec  <<< FAILURE!
java.lang.AssertionError: null
	at org.junit.Assert.fail(Assert.java:86)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertTrue(Assert.java:52)
	at org.apache.hadoop.yarn.server.resourcemanager.TestRMHA.testTransitionedToActiveRefreshFail(TestRMHA.java:623)
{noformat}

> Some recovered apps are put into default queue when RM HA
> ---------------------------------------------------------
>
>                 Key: YARN-5333
>                 URL: https://issues.apache.org/jira/browse/YARN-5333
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>              Labels: release-blocker
>             Fix For: 2.9.0, 2.7.4, 3.0.0-alpha1, 2.8.2
>
>         Attachments: YARN-5333.01.patch, YARN-5333.02.patch, YARN-5333.03.patch, YARN-5333.04.patch, YARN-5333.05.patch, YARN-5333.06.patch, YARN-5333.07.patch, YARN-5333.08.patch, YARN-5333.09.patch, YARN-5333.10.patch
>
>
> With RM HA enabled and FairScheduler in use, set {{yarn.scheduler.fair.allow-undeclared-pools}} to false and {{yarn.scheduler.fair.user-as-default-queue}} to false.
> Steps to reproduce:
> 1. Start two RMs.
> 2. Once both RMs are running, edit {{etc/hadoop/fair-scheduler.xml}} on both RMs to add some queues.
> 3. Submit some apps to the newly added queues.
> 4. Stop the active RM; the standby RM then transitions to active and recovers the apps.
> However, the new active RM puts the recovered apps into the default queue, because it might not have loaded the new {{fair-scheduler.xml}} yet. We need to call {{initScheduler}} before starting active services, or move {{refreshAll()}} ahead of {{rm.transitionToActive()}}. *This also seems to matter for other schedulers*.
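
The ordering problem in the description above can be illustrated with a small, self-contained toy model. This is only a sketch and not ResourceManager code: {{ToyScheduler}}, {{refreshQueues}}, {{recover}} and {{failover}} are hypothetical names invented for the illustration, and the fallback to "default" is an assumed simplification of how FairScheduler behaves when undeclared queues are not allowed.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class RecoveryOrderSketch {

    /** Toy stand-in for a scheduler that only knows declared queues (hypothetical). */
    static class ToyScheduler {
        private final Set<String> queues = new HashSet<>();

        ToyScheduler() {
            queues.add("default"); // state loaded before fair-scheduler.xml was edited
        }

        /** Hypothetical analogue of reloading the queue configuration from disk. */
        void refreshQueues(Set<String> declaredOnDisk) {
            queues.addAll(declaredOnDisk);
        }

        /** Assumption: an app whose queue is unknown falls back to "default". */
        String place(String requestedQueue) {
            return queues.contains(requestedQueue) ? requestedQueue : "default";
        }
    }

    /** Re-places every saved app (appId -> requested queue) on the given scheduler. */
    static Map<String, String> recover(ToyScheduler scheduler, Map<String, String> savedApps) {
        Map<String, String> placed = new LinkedHashMap<>();
        for (Map.Entry<String, String> app : savedApps.entrySet()) {
            placed.put(app.getKey(), scheduler.place(app.getValue()));
        }
        return placed;
    }

    /** Simulates a failover; refreshFirst selects the order of refresh and recovery. */
    static Map<String, String> failover(boolean refreshFirst) {
        Set<String> onDisk = new HashSet<>();
        onDisk.add("default");
        onDisk.add("queueA"); // queue added to the file after both RMs started

        Map<String, String> savedApps = new LinkedHashMap<>();
        savedApps.put("app_1", "queueA"); // app submitted to the newly added queue

        ToyScheduler standby = new ToyScheduler();
        if (refreshFirst) {
            standby.refreshQueues(onDisk); // refresh before recovering apps
        }
        Map<String, String> result = recover(standby, savedApps);
        if (!refreshFirst) {
            standby.refreshQueues(onDisk); // too late: app_1 is already placed
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("recover then refresh: " + failover(false));
        System.out.println("refresh then recover: " + failover(true));
    }
}
```

In this toy model, recovering before the refresh places app_1 in "default", while refreshing first keeps it in "queueA", which mirrors the reordering proposed in the description.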


