Posted to yarn-issues@hadoop.apache.org by "Ying Zhang (JIRA)" <ji...@apache.org> on 2017/02/08 04:22:41 UTC

[jira] [Commented] (YARN-6031) Application recovery has failed when node label feature is turned off during RM recovery

    [ https://issues.apache.org/jira/browse/YARN-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857371#comment-15857371 ] 

Ying Zhang commented on YARN-6031:
----------------------------------

Hi [~sunilg], sorry for the late reply (I was out for the Spring Festival holiday). Here is the patch for branch-2.8; please have a look.
I found a problem with the test case while preparing the patch for branch-2.8. TestRMRestart runs every test case once for CapacityScheduler and once for FairScheduler, and this test case can only succeed for CapacityScheduler because it runs an application with a node label specified. We don't see this problem on trunk because, as a result of YARN-4805, TestRMRestart now runs only for CapacityScheduler. I've modified the test case slightly so that it runs only when the scheduler is CapacityScheduler.
{code}
  public void testRMRestartAfterNodeLabelDisabled() throws Exception {
    // Skip this test case if it is not CapacityScheduler since NodeLabel is
    // not fully supported yet for FairScheduler and others.
    if (!getSchedulerType().equals(SchedulerType.CAPACITY)) {
      return;
    }
...
{code}
We should probably make this change on trunk too. Let me know whether you want to make the change through this JIRA, or whether I should open another JIRA to address it.

> Application recovery has failed when node label feature is turned off during RM recovery
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-6031
>                 URL: https://issues.apache.org/jira/browse/YARN-6031
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 2.8.0
>            Reporter: Ying Zhang
>            Assignee: Ying Zhang
>            Priority: Minor
>         Attachments: YARN-6031.001.patch, YARN-6031.002.patch, YARN-6031.003.patch, YARN-6031.004.patch, YARN-6031.005.patch, YARN-6031.006.patch, YARN-6031.007.patch
>
>
> Here are the repro steps:
> Enable node labels, restart the RM, configure CS properly, and run some jobs;
> Disable node labels, restart the RM, and the following exception is thrown:
> {noformat}
> Caused by: org.apache.hadoop.yarn.exceptions.InvalidLabelResourceRequestException: Invalid resource request, node label not enabled but request contains label expression
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:225)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:248)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:394)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:339)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:319)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:436)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1165)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:574)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>         ... 10 more
> {noformat}
> During RM restart, application recovery failed because the application had a node label expression specified while node labels had been disabled.
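
The failure mode in the description above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: it does not use any Hadoop classes, and the names NodeLabelRecoverySketch, StoredApp, InvalidLabelException, recoverStrict and recoverTolerant are all hypothetical stand-ins. The "tolerant" variant shows one possible way to keep recovery going and is not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NodeLabelRecoverySketch {

    /** Hypothetical stand-in for InvalidLabelResourceRequestException. */
    public static class InvalidLabelException extends Exception {
        public InvalidLabelException(String message) {
            super(message);
        }
    }

    /** Hypothetical, minimal model of an application saved in the state store. */
    public static class StoredApp {
        final String id;
        final String labelExpression; // null or empty means no label was requested

        public StoredApp(String id, String labelExpression) {
            this.id = id;
            this.labelExpression = labelExpression;
        }
    }

    /** Rejects a request that carries a label expression while node labels are disabled. */
    static void validate(StoredApp app, boolean nodeLabelsEnabled)
            throws InvalidLabelException {
        boolean hasLabel = app.labelExpression != null && !app.labelExpression.isEmpty();
        if (!nodeLabelsEnabled && hasLabel) {
            throw new InvalidLabelException("Invalid resource request, node label "
                + "not enabled but request contains label expression");
        }
    }

    /** Models the reported behavior: the first invalid app aborts the whole recovery. */
    public static List<String> recoverStrict(List<StoredApp> apps, boolean nodeLabelsEnabled)
            throws InvalidLabelException {
        List<String> recovered = new ArrayList<>();
        for (StoredApp app : apps) {
            validate(app, nodeLabelsEnabled);
            recovered.add(app.id);
        }
        return recovered;
    }

    /** Assumed alternative: record the offending app and keep recovering the rest. */
    public static List<String> recoverTolerant(List<StoredApp> apps,
            boolean nodeLabelsEnabled, List<String> rejected) {
        List<String> recovered = new ArrayList<>();
        for (StoredApp app : apps) {
            try {
                validate(app, nodeLabelsEnabled);
                recovered.add(app.id);
            } catch (InvalidLabelException e) {
                rejected.add(app.id);
            }
        }
        return recovered;
    }

    public static void main(String[] args) {
        List<StoredApp> apps = Arrays.asList(
            new StoredApp("app_1", ""), new StoredApp("app_2", "x"));
        List<String> rejected = new ArrayList<>();
        // Node labels were enabled when the apps were submitted, but are disabled now.
        System.out.println("recovered=" + recoverTolerant(apps, false, rejected)
            + " rejected=" + rejected);
    }
}
```

In this model, recoverStrict with node labels disabled throws as soon as it reaches an application that was submitted with a label expression, which mirrors the exception propagating out of RM recovery in the stack trace. What should happen to the rejected applications afterwards is a design decision that this sketch leaves open.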


