Posted to yarn-issues@hadoop.apache.org by "Rohith Sharma K S (JIRA)" <ji...@apache.org> on 2015/09/25 06:09:04 UTC

[jira] [Commented] (YARN-2487) Need to support timeout of AM When no containers are assigned to it for a defined period

    [ https://issues.apache.org/jira/browse/YARN-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907533#comment-14907533 ] 

Rohith Sharma K S commented on YARN-2487:
-----------------------------------------

Hi [~Naganarasimha Garla], it is worth keeping the application if it is running. The problem is that YARN currently does not identify why an application is not progressing, and there can be several reasons for that. So I feel that if there were a mechanism to get the reason an application is not progressing, this could be handled. I believe YARN-4091 is one such issue: it is trying to collect more debug information and plans to expose a REST interface for getting per-application progress information.

> Need to support timeout of AM When no containers are assigned to it for a defined period
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-2487
>                 URL: https://issues.apache.org/jira/browse/YARN-2487
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>
>  There are some scenarios where an AM does not get containers and waits indefinitely. We faced one such scenario, which caused the applications to hang:
> Consider a cluster with 2 NMs, each with 8 GB of resource,
> and 2 applications (MR2) launched in the default queue, where each AM takes 2 GB.
> Each AM is placed on a different NM. Now each AM requests a container with 7 GB of memory.
> As only 6 GB is available on each NM, both applications hang forever.
> To avoid such scenarios, I would like to propose a
> generic timeout feature for all AMs in YARN, such that if no containers are assigned to an application for a defined period, then YARN can time out the application attempt.
> The default can be set to 0, in which case the RM will not time out the app attempt, and the user can set their own timeout when submitting the application.
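
The scenario and the proposed timeout in the quoted description can be sketched in a few lines of Python. This is only an illustrative model, not YARN code: the names `Node`, `can_place`, and `should_timeout` are hypothetical, and measuring the period from the last container allocation is an assumption, since the description does not say how the period is measured.

```python
from dataclasses import dataclass

GB = 1024  # memory is tracked in MB in this sketch


@dataclass
class Node:
    """Hypothetical stand-in for one NodeManager's memory accounting."""
    capacity_mb: int
    used_mb: int = 0

    @property
    def free_mb(self) -> int:
        return self.capacity_mb - self.used_mb


def can_place(nodes, request_mb):
    """True if at least one node has enough free memory for the request."""
    return any(n.free_mb >= request_mb for n in nodes)


def should_timeout(last_allocation_s, now_s, timeout_s):
    """Decide whether an app attempt has waited too long for a container.

    timeout_s == 0 means "never time out", matching the proposed default.
    """
    if timeout_s == 0:
        return False
    return now_s - last_allocation_s >= timeout_s


# Two NMs of 8 GB each, with one 2 GB AM already running on each.
nodes = [Node(8 * GB, used_mb=2 * GB), Node(8 * GB, used_mb=2 * GB)]

# Each AM asks for a 7 GB container, but only 6 GB is free per node.
print(can_place(nodes, 7 * GB))

# With the default of 0 the attempt waits forever; with a
# user-supplied timeout (here 300 s) it would be timed out.
print(should_timeout(last_allocation_s=0, now_s=600, timeout_s=0))
print(should_timeout(last_allocation_s=0, now_s=600, timeout_s=300))
```

In this model neither 7 GB request can ever be satisfied while both AMs keep running, so only a non-zero timeout ends the wait.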



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)