You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/01/19 04:46:00 UTC
[jira] [Commented] (YARN-11411) [Umbrella] Build Concurrent Yarn Scheduler

    [ https://issues.apache.org/jira/browse/YARN-11411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678485#comment-17678485 ] 

ASF GitHub Bot commented on YARN-11411:
---------------------------------------

krishan1390 opened a new pull request, #5313:
URL: https://github.com/apache/hadoop/pull/5313

   As part of https://issues.apache.org/jira/browse/YARN-11412 to build a concurrent users manager, I am planning to isolate user management into a package and limit its public APIs. This will help maintain the thread safety guarantees of the classes better and not allow certain internal behaviours (like caches) from escaping which will help to modify implementations as per need.
   
   This PR is encapsulating certain APIs and making the required changes in dependent classes and test cases. 




> [Umbrella] Build Concurrent Yarn Scheduler
> ------------------------------------------
>
>                 Key: YARN-11411
>                 URL: https://issues.apache.org/jira/browse/YARN-11411
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Krishan Goyal
>            Assignee: Krishan Goyal
>            Priority: Major
>
> We operate multiple yarn clusters with each cluster capped to ~ 10k nodes which is its scalability limit. We expect multiple benefits with fewer clusters & larger cluster sizes (better elasticity, operational simplicity, larger queues). 
> Thus, we want to scale a single yarn cluster to as much as possible in terms of number of nodes heartbeating to the cluster (& proportionally increase container allocation rate) without degradation in overall quantiles (p50 / p75 / p95) of container allocation delay 
> The scalability limit of a yarn cluster is primarily driven by RM’s processing of node heartbeats & container allocation. The CPU usage of our RM is < 10% & RM is primarily bottlenecked on global queue & user read/write locks for container allocation
> By removing these locks (through a very naive & incorrect implementation), we were able to scale RM to 25k nodes (& proportional increase in container allocs/sec) with avg RM CPU utilization of 20% (so there is still room for improvement to use more CPU / scale up further).
> This primarily requires
>  # Async scheduling to decouple scheduling from node heartbeats (existing feature)
>  # Removing global write locks in scheduler path (primarily to maintain queues and users)
>  # Multi threaded event queue dispatcher to process events parallelly
> Additionally we need to probably scale RPC handling, DT management, preemption flows, Timeline server, RM HA failover. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org