Posted to yarn-issues@hadoop.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2018/07/25 09:55:00 UTC

[jira] [Commented] (YARN-8546) Resource leak caused by a reserved container being released more than once under async scheduling

    [ https://issues.apache.org/jira/browse/YARN-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555461#comment-16555461 ] 

Hudson commented on YARN-8546:
------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14635 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14635/])
YARN-8546. Resource leak caused by a reserved container being released (wwei: rev 5be9f4a5d05c9cb99348719fe35626b1de3055db)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerAsyncScheduling.java


> Resource leak caused by a reserved container being released more than once under async scheduling
> -------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8546
>                 URL: https://issues.apache.org/jira/browse/YARN-8546
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>    Affects Versions: 3.1.0
>            Reporter: Weiwei Yang
>            Assignee: Tao Yang
>            Priority: Major
>              Labels: global-scheduling
>         Attachments: YARN-8546.001.patch
>
>
> I was able to reproduce this issue by starting a job that keeps requesting containers until it uses up the cluster's available resources. My cluster has 70200 vcores and each task requests 100 vcores, so I expected a total of 702 containers to be allocated, but in the end there were only 701. The last container could not be allocated because the queue's used resource had been updated to more than 100%.
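
The figures in the description can be checked with a short calculation. The following is only a sketch of that arithmetic; the class and method names are hypothetical and are not part of YARN or of the attached patch.

```java
public class VcoreAccounting {

    // Number of whole containers that fit into the cluster
    // (hypothetical helper, not a YARN API).
    static long expectedContainers(long clusterVcores, long vcoresPerContainer) {
        if (vcoresPerContainer <= 0) {
            throw new IllegalArgumentException("vcoresPerContainer must be positive");
        }
        return clusterVcores / vcoresPerContainer;
    }

    // Vcores left unallocated after a given number of containers were allocated.
    static long unallocatedVcores(long clusterVcores, long vcoresPerContainer,
                                  long allocatedContainers) {
        return clusterVcores - allocatedContainers * vcoresPerContainer;
    }

    public static void main(String[] args) {
        long clusterVcores = 70200;      // cluster size from the report
        long vcoresPerContainer = 100;   // request size from the report
        long observedContainers = 701;   // containers actually allocated

        // prints 702
        System.out.println(expectedContainers(clusterVcores, vcoresPerContainer));
        // prints 100
        System.out.println(
            unallocatedVcores(clusterVcores, vcoresPerContainer, observedContainers));
    }
}
```

With these numbers, exactly one container's worth of vcores (100) remains unallocated, which matches the single missing container in the report.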



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org