Posted to yarn-issues@hadoop.apache.org by "Peter Bacsko (Jira)" <ji...@apache.org> on 2021/05/31 18:30:00 UTC

[jira] [Updated] (YARN-10796) Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%

     [ https://issues.apache.org/jira/browse/YARN-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10796:
--------------------------------
    Description: 
If we have a dynamic queue (AutoCreatedLeafQueue) with capacity = 0%, then it cannot scale out properly even if its max-capacity and the parent's max-capacity would allow it.

Example:
{noformat}
Cluster Capacity:  16 GB / 16 vcores (2 nodes, each with 8 GB / 8 vcores)
Container allocation size: 1 GB / 1 vcore

root.dynamic 
    Effective Capacity:      <memory: 8192, vCores: 8> ( 50.0%)
    Effective Max Capacity:  <memory:16384, vCores:16> (100.0%) 

    Template:
        Capacity:               40%
        Max Capacity:           100%
        User Limit Factor:      4
{noformat}
leaf-queue-template.capacity = 40%
leaf-queue-template.maximum-capacity = 100%
leaf-queue-template.maximum-am-resource-percent = 50%
leaf-queue-template.minimum-user-limit-percent = 100%
leaf-queue-template.user-limit-factor = 4

"root.dynamic" has a maximum capacity of 100% and a capacity of 50%.
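
As a quick check of the effective values listed for root.dynamic, the percentages can be turned back into absolute resources. This is only a sketch of the arithmetic, not scheduler code; the helper name effective_resource is made up for illustration, and the cluster totals are the ones from the example.

```python
# Cluster totals from the example: 2 nodes, each with 8 GB / 8 vcores.
CLUSTER_MEMORY_MB = 16 * 1024
CLUSTER_VCORES = 16


def effective_resource(percent, memory_mb=CLUSTER_MEMORY_MB, vcores=CLUSTER_VCORES):
    """Return (memory_mb, vcores) for a percentage share of the given totals.

    Hypothetical helper for illustration only; it is not part of YARN.
    """
    return (int(memory_mb * percent / 100), int(vcores * percent / 100))


# root.dynamic: capacity 50%, maximum capacity 100% of the cluster.
dynamic_capacity = effective_resource(50)
dynamic_max_capacity = effective_resource(100)

print("Effective Capacity:    ", dynamic_capacity)
print("Effective Max Capacity:", dynamic_max_capacity)
```

The two tuples correspond to the <memory, vCores> pairs shown for root.dynamic in the example.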

Let's assume there are running containers in these dynamic queues (MR sleep jobs):
root.dynamic.user1 = 1 AM + 3 containers (capacity = 40%)
root.dynamic.user2 = 1 AM + 3 containers (capacity = 40%)
root.dynamic.user3 = 1 AM + 15 containers (capacity = 0%)

This scenario results in an underutilized cluster: approximately 18% of the capacity remains unused. On the other hand, it is still possible to submit a new application to root.dynamic.user1 or root.dynamic.user2, and 100% utilization can be reached that way.
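
To illustrate why the third queue ends up with capacity = 0%, here is a minimal sketch of a "guaranteed or zero" assignment. It assumes the parent hands out the template capacity to leaf queues in creation order and gives 0% to any queue that no longer fits. That is my reading of how auto-created leaf queues behave, not the actual scheduler code, and the function name assign_leaf_capacities is hypothetical.

```python
def assign_leaf_capacities(queue_names, template_capacity):
    """Assign each leaf queue a capacity in percent of the parent.

    Assumption for illustration: a queue gets the full template capacity
    while it still fits within the parent's 100%; otherwise it gets 0%.
    """
    remaining = 100.0
    assigned = {}
    for name in queue_names:
        if remaining >= template_capacity:
            assigned[name] = template_capacity
            remaining -= template_capacity
        else:
            # Not enough guaranteed capacity left for another full share.
            assigned[name] = 0.0
    return assigned


# Template capacity of 40%, queues listed in the order they were created.
capacities = assign_leaf_capacities(["user1", "user2", "user3"], 40.0)
for queue, capacity in capacities.items():
    print(f"root.dynamic.{queue}: capacity = {capacity:.0f}%")
```

With a 40% template only two queues fit under the parent, so the third queue starts at 0% and has to rely on its max-capacity to grow, which is the case described in this issue.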



> Capacity Scheduler: dynamic queue cannot scale out properly if its capacity is 0%
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-10796
>                 URL: https://issues.apache.org/jira/browse/YARN-10796
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: capacity scheduler, capacityscheduler
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major



