Posted to yarn-issues@hadoop.apache.org by "Íñigo Goiri (JIRA)" <ji...@apache.org> on 2019/02/12 02:49:00 UTC

[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

    [ https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16765614#comment-16765614 ] 

Íñigo Goiri commented on YARN-999:
----------------------------------

After YARN-996, I've been experimenting with the resources of NMs and I think I see the behavior described here.
When I reduce a node's resources to a small value, the available resources go negative but the containers keep running.

{code}
In YARN-311, we get OverCommitTimeout into ResourceOption so allocated containers (over new capacity) will be released after timeout.
{code}
I went through the code to see where the overcommit timeout is used, but I didn't find anything useful.
Does anybody know if this is actually implemented?
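
A minimal sketch of the behavior one might expect from an overcommit timeout, assuming a simplified model. All names here (OvercommitSketch, Container, available, toRelease) are hypothetical and are not actual YARN classes or APIs, and the choice to release the most recently started containers first is only an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a node after its capacity is reduced; not YARN code. */
public class OvercommitSketch {

    /** A running container and its memory allocation in MB (hypothetical type). */
    static final class Container {
        final String id;
        final long memoryMb;

        Container(String id, long memoryMb) {
            this.id = id;
            this.memoryMb = memoryMb;
        }
    }

    /** Available = capacity - allocated; negative once the node is overcommitted. */
    static long available(long capacityMb, List<Container> running) {
        long allocated = 0;
        for (Container c : running) {
            allocated += c.memoryMb;
        }
        return capacityMb - allocated;
    }

    /**
     * Containers to release once the overcommit timeout has elapsed.
     * Returns an empty list if the node is not overcommitted or the timeout
     * has not expired yet. Assumption: newest containers are released first.
     */
    static List<Container> toRelease(long capacityMb, List<Container> running,
                                     long elapsedMs, long overcommitTimeoutMs) {
        List<Container> victims = new ArrayList<>();
        long deficit = -available(capacityMb, running);
        if (deficit <= 0 || elapsedMs < overcommitTimeoutMs) {
            return victims;
        }
        // Walk from the most recently started container backwards
        // until the deficit is covered.
        for (int i = running.size() - 1; i >= 0 && deficit > 0; i--) {
            Container c = running.get(i);
            victims.add(c);
            deficit -= c.memoryMb;
        }
        return victims;
    }

    public static void main(String[] args) {
        List<Container> running = new ArrayList<>();
        running.add(new Container("c1", 2048));
        running.add(new Container("c2", 2048));
        running.add(new Container("c3", 1024));

        long newCapacityMb = 3072; // capacity after the decrease
        System.out.println("available MB: " + available(newCapacityMb, running)); // -2048

        // Within the timeout nothing is released; after it, enough containers
        // are selected to bring the available resources back to >= 0.
        System.out.println(toRelease(newCapacityMb, running, 5_000, 10_000).size());
        System.out.println(toRelease(newCapacityMb, running, 10_000, 10_000).size());
    }
}
```

In this model the containers keep running while available is negative, which matches what I observe; the open question is whether anything in the RM actually acts once the timeout expires.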

As this has been open for 6 years, I'd like to take it over if nobody is working on it.

> In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task. 
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-999
>                 URL: https://issues.apache.org/jira/browse/YARN-999
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: graceful, nodemanager, scheduler
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Major
>
> In the current design and implementation, when we decrease a node's resources below the consumption of its currently running tasks, those tasks can still run to completion. No new tasks are assigned to the node (because AvailableResource < 0) until some tasks finish and AvailableResource > 0 again. This is fine in most cases, but for long-running tasks the new resource setting could take too long to take effect, so preemption could be used here.


