Posted to yarn-issues@hadoop.apache.org by "Eric Badger (JIRA)" <ji...@apache.org> on 2016/06/01 19:45:59 UTC

[jira] [Commented] (YARN-1468) TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.

    [ https://issues.apache.org/jira/browse/YARN-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310976#comment-15310976 ] 

Eric Badger commented on YARN-1468:
-----------------------------------

[~mitdesai], I saw this test fail in the same way that you described above. I took a look at the test, and either I don't understand the meaning of one of the lines or it's a bug. The following piece of code (minus the assertEquals) was added by [YARN-1493|https://issues.apache.org/jira/browse/YARN-1493] and doesn't make sense to me. Why are we waiting for the size to reach 2 when we assert that it is 4 immediately afterwards? In my local tests, this loop always runs until it times out (timeoutSecs reaches 40), since rmApp.getAppAttempts().size() is equal to 4 the whole time. This leads me to believe that the assert failure occurs when the loop is entered while the size is still equal to 2. In that case the loop exits immediately, and the size may only have reached 3 (or still be 2) by the time the assertEquals against 4 is executed. 

{noformat}
    // wait for the attempt to be created.
    int timeoutSecs = 0;
    while (rmApp.getAppAttempts().size() != 2 && timeoutSecs++ < 40) {
      Thread.sleep(200);
    }
    Assert.assertEquals(4, rmApp.getAppAttempts().size());
{noformat}

I think changing ".size() != 2" to ".size() != 4" will fix this race in the test. Thoughts? 
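
To illustrate the proposed change, here is a minimal standalone sketch of the same polling pattern. It does not use the real TestRMRestart or RMApp classes: the class name AttemptCountWait, the helper waitForCount, and the list of attempt names are all hypothetical stand-ins, and a background thread plays the role of the RM creating attempts. It only shows that waiting on 2 can return while attempts are still being created, whereas waiting on 4 lines up with the assertion that follows.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.IntSupplier;

public class AttemptCountWait {

    // Hypothetical helper mirroring the loop in the test: poll until the
    // observed count equals the expected one or the poll budget is used up.
    // Returns whether the expected count was observed at the end.
    static boolean waitForCount(IntSupplier current, int expected,
                                int maxPolls, long sleepMillis)
            throws InterruptedException {
        int polls = 0;
        while (current.getAsInt() != expected && polls++ < maxPolls) {
            Thread.sleep(sleepMillis);
        }
        return current.getAsInt() == expected;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for rmApp.getAppAttempts(): starts with 2 entries.
        List<String> attempts = new CopyOnWriteArrayList<>();
        attempts.add("attempt_1");
        attempts.add("attempt_2");

        // Stand-in for the RM creating two more attempts a little later.
        Thread creator = new Thread(() -> {
            try {
                Thread.sleep(50);
                attempts.add("attempt_3");
                Thread.sleep(50);
                attempts.add("attempt_4");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        creator.start();

        // Waiting on 2 returns as soon as the size happens to be 2, so an
        // assertion against 4 right after this call could still fail.
        boolean sawTwo = waitForCount(attempts::size, 2, 40, 20);
        System.out.println("waited for 2: ok=" + sawTwo
                + ", size now " + attempts.size());

        // Waiting on 4 matches what the following assertion checks.
        boolean sawFour = waitForCount(attempts::size, 4, 40, 20);
        creator.join();
        System.out.println("waited for 4: ok=" + sawFour
                + ", size now " + attempts.size());
    }
}
```

In the real test, the equivalent change would be only the constant in the loop condition; the sleep interval and retry count above are arbitrary values chosen for the sketch.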

cc [~djp]

> TestRMRestart.testRMRestartWaitForPreviousAMToFinish get failed.
> ----------------------------------------------------------------
>
>                 Key: YARN-1468
>                 URL: https://issues.apache.org/jira/browse/YARN-1468
>             Project: Hadoop YARN
>          Issue Type: Test
>          Components: resourcemanager
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>
> The log is as follows:
> {code}
> Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 149.968 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
> testRMRestartWaitForPreviousAMToFinish(org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart)  Time elapsed: 44.197 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: AppAttempt state is not correct (timedout) expected:<ALLOCATED> but was:<SCHEDULED>
>         at junit.framework.Assert.fail(Assert.java:50)
>         at junit.framework.Assert.failNotEquals(Assert.java:287)
>         at junit.framework.Assert.assertEquals(Assert.java:67)
>         at org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
>         at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:292)
>         at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.launchAM(TestRMRestart.java:826)
>         at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartWaitForPreviousAMToFinish(TestRMRestart.java:464)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
