Posted to yarn-issues@hadoop.apache.org by "Miklos Szegedi (JIRA)" <ji...@apache.org> on 2017/02/24 01:24:44 UTC

[jira] [Comment Edited] (YARN-6172) TestFSAppStarvation fails on trunk

    [ https://issues.apache.org/jira/browse/YARN-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881712#comment-15881712 ] 

Miklos Szegedi edited comment on YARN-6172 at 2/24/17 1:23 AM:
---------------------------------------------------------------

Thank you, [~varun_saxena], for reporting this.
I was able to reproduce the scenario above. There are two issues here.
First, every time the update thread runs, it resets the queue demand and then adds each application's demand to it one by one, without locking. Whenever this value is sampled, the test compares it with the expected value; if the update has not finished yet, the sampled demand can be 0 or anything less than the actual demand.
A second, unrelated issue is that the test calls {{Thread.yield()}} instead of properly waiting for the expected application count to propagate. I will send out a patch soon.

{code}
  @Override
  public void updateDemand() {
    // Compute demand by iterating through apps in the queue
    // Limit demand to maxResources
    demand = Resources.createResource(0);
    readLock.lock();
    try {
      for (FSAppAttempt sched : runnableApps) {
        updateDemandForApp(sched);
      }
      for (FSAppAttempt sched : nonRunnableApps) {
        updateDemandForApp(sched);
      }
    } finally {
      readLock.unlock();
    }
    // Cap demand to maxShare to limit allocation to maxShare
    demand = Resources.componentwiseMin(demand, maxShare);
    if (LOG.isDebugEnabled()) {
      LOG.debug("The updated demand for " + getName() + " is " + demand
          + "; the max is " + maxShare);
      LOG.debug("The updated fairshare for " + getName() + " is "
          + getFairShare());
    }
  }
  
  private void updateDemandForApp(FSAppAttempt sched) {
    sched.updateDemand();
    Resource toAdd = sched.getDemand();
    if (LOG.isDebugEnabled()) {
      LOG.debug("Counting resource from " + sched.getName() + " " + toAdd
          + "; Total resource demand for " + getName() + " now "
          + demand);
    }
    demand = Resources.add(demand, toAdd);
  }
{code}
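
To illustrate the first problem, one possible way to avoid exposing a partial sum (a rough sketch only, not necessarily what the patch will do) is to accumulate the demand into a local {{Resource}} and publish it in a single assignment once the loop is done; the debug logging is omitted here for brevity:

{code}
  @Override
  public void updateDemand() {
    // Accumulate into a local Resource so that concurrent readers of
    // 'demand' never observe a partially summed value.
    Resource tmpDemand = Resources.createResource(0);
    readLock.lock();
    try {
      for (FSAppAttempt sched : runnableApps) {
        sched.updateDemand();
        Resources.addTo(tmpDemand, sched.getDemand());
      }
      for (FSAppAttempt sched : nonRunnableApps) {
        sched.updateDemand();
        Resources.addTo(tmpDemand, sched.getDemand());
      }
    } finally {
      readLock.unlock();
    }
    // Cap demand to maxShare and publish it with a single write.
    demand = Resources.componentwiseMin(tmpDemand, maxShare);
  }
{code}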
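
For the second problem, instead of {{Thread.yield()}} the test could poll for the condition with a timeout, e.g. via {{GenericTestUtils.waitFor()}}; in this sketch {{expectedAppCountReached()}} is just a placeholder for whatever condition the test actually needs to observe:

{code}
import org.apache.hadoop.test.GenericTestUtils;
...
    // Poll until the expected application count has propagated instead of
    // hoping that a Thread.yield() gives the update thread enough time.
    // expectedAppCountReached() is a placeholder for the real check;
    // waitFor() throws TimeoutException if the condition is not met in time.
    GenericTestUtils.waitFor(() -> expectedAppCountReached(),
        10,      // re-check every 10 ms
        10000);  // time out after 10 seconds
{code}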


> TestFSAppStarvation fails on trunk
> ----------------------------------
>
>                 Key: YARN-6172
>                 URL: https://issues.apache.org/jira/browse/YARN-6172
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Varun Saxena
>         Attachments: YARN-6172.000.patch
>
>
> Refer to test report https://builds.apache.org/job/PreCommit-YARN-Build/14882/testReport/
> {noformat}
> java.lang.AssertionError: null
> 	at org.junit.Assert.fail(Assert.java:86)
> 	at org.junit.Assert.assertTrue(Assert.java:41)
> 	at org.junit.Assert.assertTrue(Assert.java:52)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.verifyLeafQueueStarvation(TestFSAppStarvation.java:133)
> 	at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.testPreemptionEnabled(TestFSAppStarvation.java:106)
> {noformat}


