Posted to yarn-dev@hadoop.apache.org by "Andrew Chung (Jira)" <ji...@apache.org> on 2021/11/24 16:12:00 UTC

[jira] [Created] (YARN-11015) Decouple queue capacity with ability to run OPPORTUNISTIC container

Andrew Chung created YARN-11015:
-----------------------------------

             Summary: Decouple queue capacity with ability to run OPPORTUNISTIC container
                 Key: YARN-11015
                 URL: https://issues.apache.org/jira/browse/YARN-11015
             Project: Hadoop YARN
          Issue Type: Sub-task
          Components: container-queuing, resourcemanager
            Reporter: Andrew Chung


Motivation:
With YARN-11005, we will be able to schedule OContainers on nodes based on resource availability. Given that, nodes with a queue capacity of 0 should still be allowed to run OContainers, since these containers should start immediately if resources are available, even if they are placed on a "queue" first.
However, the current implementation interprets an NM queue length of 0 inconsistently: the RM assumes infinite queue capacity, while the NM disables the running of any OContainers and kills OContainers that arrive.
This issue addresses the above problems for the {{QUEUE_LENGTH_THEN_RESOURCES}} allocator.
It does not aim to change the existing behavior of the {{QUEUE_LENGTH}} allocator.

Proposed design:
Add a new {{NodeManager}} config, {{opportunistic-containers-queue-policy}}, which specifies the queueing policy at the NM.
We will start with two values, {{BY_RESOURCES}} and {{BY_QUEUE_LEN}}. If {{BY_RESOURCES}} is specified, the NM will queue an {{OPPORTUNISTIC}} container as long as it has enough resources to run all pending + running containers; otherwise, it will reject the container.
On the other hand, if {{BY_QUEUE_LEN}} is specified, the NM will only accept as many containers as its configured queue capacity allows.
Thus, if {{BY_QUEUE_LEN}} is specified and the NM's queue capacity is configured to be 0, the NM will reject all incoming {{OPPORTUNISTIC}} containers (today's behavior).
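
The NM-side decision described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not existing YARN code, resources are reduced to a single number (for example MB of memory), and the incoming container is assumed to count toward the "pending + running" total.

```java
// Hypothetical sketch; none of these names are actual YARN classes.
enum QueuePolicy { BY_RESOURCES, BY_QUEUE_LEN }

final class NmAdmissionSketch {
    private NmAdmissionSketch() {}

    /**
     * Decides whether the NM accepts an incoming OPPORTUNISTIC container.
     *
     * @param committed resources of all pending + running containers (assumed single dimension)
     * @param requested resources of the incoming container
     * @param nodeTotal total resources of the node
     */
    static boolean accepts(QueuePolicy policy,
                           int queuedContainers, int queueCapacity,
                           long committed, long requested, long nodeTotal) {
        switch (policy) {
            case BY_RESOURCES:
                // Queue length is ignored; only resource fit matters.
                return committed + requested <= nodeTotal;
            case BY_QUEUE_LEN:
                // With queueCapacity == 0 this is always false (today's behavior).
                return queuedContainers < queueCapacity;
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }
}
```

Under these assumptions, a node with a queue capacity of 0 still accepts a container under {{BY_RESOURCES}} when the resources fit, and rejects every container under {{BY_QUEUE_LEN}}.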

Note that this configuration *does not affect how the RM behaves*.
At the RM, if the queue capacity reported by the node is 0 *and* the allocation policy is set to {{QUEUE_LENGTH_THEN_RESOURCES}}, the RM assumes that the node can still run {{OPPORTUNISTIC}} containers if it has available resources; otherwise, it skips the node.
Conversely, if the queue capacity reported by the node is 0 *and* the allocation policy is set to {{QUEUE_LENGTH}}, the RM still assumes that the node can run infinitely many {{OPPORTUNISTIC}} containers, and it is up to the NM to reject these containers (today's behavior).
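
The RM-side handling of a node that reports a queue capacity of 0 might look something like the sketch below. The names are hypothetical and not taken from the YARN code base, resources are again simplified to a single number, and "has available resources" is assumed to mean "enough for the requested container". The sketch covers only the zero-capacity case discussed here.

```java
// Hypothetical sketch; none of these names are actual YARN classes.
enum AllocationPolicy { QUEUE_LENGTH, QUEUE_LENGTH_THEN_RESOURCES }

final class RmZeroCapacitySketch {
    private RmZeroCapacitySketch() {}

    /**
     * For a node reporting queue capacity 0, decides whether the RM still
     * considers it a candidate for an OPPORTUNISTIC container.
     */
    static boolean isCandidate(AllocationPolicy policy, long available, long requested) {
        switch (policy) {
            case QUEUE_LENGTH_THEN_RESOURCES:
                // Consider the node only if the request fits; otherwise skip it.
                return available >= requested;
            case QUEUE_LENGTH:
                // Capacity 0 is treated as infinite; the NM rejects if needed.
                return true;
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }
}
```

The asymmetry is deliberate in this reading of the proposal: {{QUEUE_LENGTH}} keeps its existing semantics, and only {{QUEUE_LENGTH_THEN_RESOURCES}} takes available resources into account.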



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
