Posted to yarn-issues@hadoop.apache.org by "Sandy Ryza (JIRA)" <ji...@apache.org> on 2015/04/01 20:50:55 UTC

[jira] [Commented] (YARN-3415) Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue

    [ https://issues.apache.org/jira/browse/YARN-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391218#comment-14391218 ] 

Sandy Ryza commented on YARN-3415:
----------------------------------

+1

> Non-AM containers can be counted towards amResourceUsage of a fairscheduler queue
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-3415
>                 URL: https://issues.apache.org/jira/browse/YARN-3415
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: Rohit Agarwal
>            Assignee: zhihai xu
>            Priority: Critical
>         Attachments: YARN-3415.000.patch, YARN-3415.001.patch
>
>
> We encountered this problem while running a Spark cluster. The amResourceUsage for a queue became artificially high, and the cluster then deadlocked because the maxAMShare constraint kicked in and no new AMs were admitted to the cluster.
> I have described the problem in detail here: https://github.com/apache/spark/pull/5233#issuecomment-87160289
> In summary: the condition for adding a container's memory towards amResourceUsage is fragile, because it depends on the number of live containers belonging to the app. We saw that the Spark AM went down without explicitly releasing its requested containers, and one of those containers' memory was then counted towards amResourceUsage (a simplified sketch of this check follows the quoted description).
> cc - [~sandyr]
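
The accounting failure described in the quoted report can be sketched as follows. This is a minimal, hypothetical illustration of a fragile "first live container must be the AM" heuristic, not the actual FairScheduler code; the class and member names (SimplifiedAppAttempt, liveContainers, queueAmResourceUsageMb) are assumptions made for the example.

    import java.util.HashSet;
    import java.util.Set;

    // Hypothetical, simplified model of per-attempt container accounting.
    class SimplifiedAppAttempt {
      private final Set<String> liveContainers = new HashSet<>();
      private boolean amRunning = false;
      private long queueAmResourceUsageMb = 0;  // queue-level AM share accounting

      // Called for every container allocated to this application attempt.
      void onContainerAllocated(String containerId, long memoryMb) {
        liveContainers.add(containerId);

        // Fragile heuristic: assume the first (and only) live container is the AM.
        // If the real AM exits while containers it requested are still being
        // allocated, an ordinary executor container can satisfy this check and
        // its memory is charged against the queue's AM share.
        if (liveContainers.size() == 1 && !amRunning) {
          queueAmResourceUsageMb += memoryMb;
          amRunning = true;
        }
      }

      // Called when a container finishes or is released.
      void onContainerCompleted(String containerId) {
        liveContainers.remove(containerId);
      }

      long getQueueAmResourceUsageMb() {
        return queueAmResourceUsageMb;
      }
    }

Under such a heuristic, if the AM goes down before one of its previously requested containers is allocated, that non-AM container becomes the lone live container and its memory stays counted against the queue's maxAMShare, which matches the deadlock the reporter observed.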



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)