Posted to yarn-issues@hadoop.apache.org by "Sachin Aggarwal (JIRA)" <ji...@apache.org> on 2017/02/24 10:49:44 UTC

[jira] [Commented] (YARN-2487) Need to support timeout of AM When no containers are assigned to it for a defined period

    [ https://issues.apache.org/jira/browse/YARN-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882439#comment-15882439 ] 

Sachin Aggarwal commented on YARN-2487:
---------------------------------------

[~naganarasimha_gr@apache.org] [~rohithsharma]  [~nijel],  [~hejian991]  [~wangda],

I have a use case similar to this one. Is there a chance you could consider it, or let me know if there is a workaround for this problem?

I am running Jupyter Kernel Gateway (JKG) in my cluster. When JKG receives a request, it starts a kernel in yarn-client mode.
In yarn-client mode, the application master and executors run in YARN and the driver runs outside it.
In this case the notebook comes up and the kernel is running, but I do not get a container for the application master when other users are consuming all the resources.
My aim is this: in such a scenario I want to wait a while for resources, then tell the user that the notebook could not get resources and to try again later, and kill the request.
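
To make the desired behaviour concrete, here is a minimal client-side sketch of "wait for a while, then give up and kill the request". It is only an illustration of the workaround, not something YARN provides: `wait_for_am`, `get_state`, and `kill_app` are hypothetical names, and the two callables are assumed to be supplied by the caller (for example by wrapping `yarn application -status <appId>` and `yarn application -kill <appId>`). A timeout of 0 means "never time out", mirroring the default proposed in the issue.

```python
import time
from typing import Callable, Optional

# YARN application states in which the AM has not started running yet.
WAITING_STATES = {"NEW", "NEW_SAVING", "SUBMITTED", "ACCEPTED"}


def wait_for_am(
    get_state: Callable[[], str],      # hypothetical: returns the app's YARN state
    kill_app: Callable[[], None],      # hypothetical: kills the application
    timeout_s: float,
    poll_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the app leaves the waiting states or the timeout expires.

    Returns the first non-waiting state observed, or "TIMED_OUT" if the
    application was still waiting at the deadline and was killed.
    A timeout_s of 0 (or less) disables the timeout.
    """
    deadline: Optional[float] = clock() + timeout_s if timeout_s > 0 else None
    while True:
        state = get_state()
        if state not in WAITING_STATES:
            return state
        if deadline is not None and clock() >= deadline:
            kill_app()
            return "TIMED_OUT"
        sleep(poll_interval_s)
```

On "TIMED_OUT" the gateway would report to the user that the notebook could not get resources. The clock and sleep functions are injectable only so the loop can be exercised without real waiting.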

Let me know if you have more questions.


> Need to support timeout of AM When no containers are assigned to it for a defined period
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-2487
>                 URL: https://issues.apache.org/jira/browse/YARN-2487
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>
>  There are some scenarios where an AM does not get containers and waits indefinitely. We faced one such scenario, which causes applications to hang:
> Consider a cluster with 2 NMs, each with 8 GB of resources,
> and 2 applications (MR2) launched in the default queue, where each AM takes 2 GB.
> Each AM is placed on a different NM. Now each AM requests a container with 7 GB of memory.
> As only 6 GB is available on each NM, both applications hang forever.
> To avoid such scenarios I would like to propose
> a generic timeout feature for all AMs in YARN, such that if no containers are assigned to an application for a defined period, then YARN can time out the application attempt.
> The default can be set to 0, in which case the RM will not time out the app attempt, and the user can set their own timeout when submitting the application.
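
The arithmetic behind the quoted scenario can be checked with a short sketch. The numbers come from the issue description; the helper `can_place` is a hypothetical name, and the model is deliberately simplified to memory only. The point is that a container must fit on a single NM, so 12 GB free in total does not help a 7 GB request.

```python
NM_CAPACITY_GB = 8   # memory per NodeManager, from the issue description
AM_SIZE_GB = 2       # memory taken by each application master
REQUEST_GB = 7       # container size each AM asks for

# One AM runs on each of the two NMs, leaving 6 GB free per node.
free_per_nm = [NM_CAPACITY_GB - AM_SIZE_GB for _ in range(2)]


def can_place(request_gb, free_gb_per_node):
    """A container fits only if a single node has enough free memory."""
    return any(free >= request_gb for free in free_gb_per_node)


print(free_per_nm)                         # [6, 6]
print(sum(free_per_nm))                    # 12
print(can_place(REQUEST_GB, free_per_nm))  # False
```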



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
