Posted to yarn-issues@hadoop.apache.org by "Sunil G (JIRA)" <ji...@apache.org> on 2017/10/13 06:54:00 UTC

[jira] [Comment Edited] (YARN-7314) Multiple Resource manager test cases failing

    [ https://issues.apache.org/jira/browse/YARN-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203127#comment-16203127 ] 

Sunil G edited comment on YARN-7314 at 10/13/17 6:53 AM:
---------------------------------------------------------

Adding more analysis from my end here.

I can see these test cases failing on trunk without any patch applied: https://builds.apache.org/job/PreCommit-YARN-Build/17888/testReport/

When I ran them on my local machine on trunk, I saw the same errors. There appears to be a scheduler state problem on the Fair Scheduler (FS) side, and some negative values (e.g. vCores:-3) are showing up as well. [~templedf] [~yufeigu], could you please pitch in to see why this is happening, since the test case failures are in FS? cc/ [~rohithsharma] and [~leftnoteasy]

{code}
Running org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 47.665 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService
testResourceTypes(org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService)  Time elapsed: 47.522 sec  <<< FAILURE!
java.lang.AssertionError: Attempt state is not correct (timeout). expected:<ALLOCATED> but was:<SCHEDULED>
	at org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService.testResourceTypes(TestApplicationMasterService.java:520)

Running org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
Tests run: 38, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 68.306 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
testSchedulerRecovery[FAIR](org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart)  Time elapsed: 0.434 sec  <<< FAILURE!
java.lang.AssertionError: expected:<<memory:6144, vCores:6>> but was:<<memory:6144, vCores:-3>>
	at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.checkFSQueue(TestWorkPreservingRMRestart.java:484)
	at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testSchedulerRecovery(TestWorkPreservingRMRestart.java:243)

testDynamicQueueRecovery[FAIR](org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart)  Time elapsed: 0.471 sec  <<< FAILURE!
java.lang.AssertionError: expected:<<memory:6144, vCores:6>> but was:<<memory:6144, vCores:-3>>
	at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.checkFSQueue(TestWorkPreservingRMRestart.java:484)
	at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testDynamicQueueRecovery(TestWorkPreservingRMRestart.java:387)
{code}
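
To compare a local run against the Jenkins report, the failing test names can be pulled out of console output like the above. This is only a minimal sketch: it assumes the per-test failure line format seen in this log, and the names FAILED_TEST and failed_tests are hypothetical helpers, not part of Hadoop or Surefire.

```python
import re

# Matches per-test failure lines in the format seen in the log above, e.g.
#   testFoo[FAIR](org.example.TestBar)  Time elapsed: 0.434 sec  <<< FAILURE!
# Assumption: other Surefire versions may print these lines differently.
FAILED_TEST = re.compile(
    r"^(?P<method>\w+(?:\[[^\]]*\])?)\((?P<cls>[\w.$]+)\)\s+"
    r"Time elapsed: [\d.]+ sec\s+<<< (?P<kind>FAILURE|ERROR)!"
)


def failed_tests(log_text):
    """Return 'Class.method' names for each failed or errored test in the log."""
    names = []
    for line in log_text.splitlines():
        match = FAILED_TEST.match(line.strip())
        if match:
            names.append(f"{match['cls']}.{match['method']}")
    return names
```

The returned names use the same Class.method[PARAM] form as the list in the issue description, so the two can be compared directly.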


> Multiple Resource manager test cases failing
> --------------------------------------------
>
>                 Key: YARN-7314
>                 URL: https://issues.apache.org/jira/browse/YARN-7314
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: lovekesh bansal
>            Priority: Critical
>
> The following 14 unit tests are failing on trunk:
>  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService.testResourceTypes		
>  org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testSchedulerRecovery[FAIR]		
>  org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testDynamicQueueRecovery[FAIR]		
>  org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche		
>  org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppAttempt.testHeadroomWithBlackListedNodes		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueMaxAMShare		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testMultipleCompletedEvent		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testRefreshQueuesWhenRMHA		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueMaxAMShareWithContainerReservation		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testComputeMaxAMResource		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testQueueMaxAMShareDefault		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testReservationMetrics		
>  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testRequestAMResourceInZeroFairShareQueue		


