Posted to yarn-issues@hadoop.apache.org by "Tao Yang (JIRA)" <ji...@apache.org> on 2019/07/01 05:28:00 UTC

[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

    [ https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875952#comment-16875952 ] 

Tao Yang commented on YARN-9623:
--------------------------------

Thanks [~cheersyang].
I just noticed that the Jenkins report has related failures in TestLeafQueue. If there is no YARN configuration in the mock RMContext, the cleanup interval cannot be initialized to its default of 5 seconds, so the cleanup thread runs repeatedly without any interval, which may cause problems for the mock objects.
Adding a default value for ActivitiesManager#activitiesCleanupIntervalMs solves this problem in the UT. Should I create a new issue or update the patch in this issue?
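
For illustration, the idea of a field-level default can be sketched as below. This is only a minimal sketch under stated assumptions, not the actual ActivitiesManager code: the class CleanupIntervalSketch, the key "example.cleanup-interval-ms", and the use of a plain Map in place of a YARN configuration object are all hypothetical. Only the 5-second default comes from the comment above.

```java
import java.util.Map;

// Hypothetical stand-in for a service that owns a periodic cleanup thread.
final class CleanupIntervalSketch {
  // Hypothetical key; the real YARN property name is not given in this message.
  static final String KEY = "example.cleanup-interval-ms";
  // 5 seconds, the default mentioned in the comment above.
  static final long DEFAULT_CLEANUP_INTERVAL_MS = 5000L;

  // Initialized at declaration, so the value is sane even if init() never
  // sees a configuration (for example, with a mock context in a unit test).
  private long cleanupIntervalMs = DEFAULT_CLEANUP_INTERVAL_MS;

  // A plain Map stands in for the configuration object (assumption).
  void init(Map<String, String> conf) {
    if (conf == null) {
      return; // no configuration available: keep the default
    }
    String value = conf.get(KEY);
    if (value != null) {
      cleanupIntervalMs = Long.parseLong(value);
    }
  }

  long getCleanupIntervalMs() {
    return cleanupIntervalMs;
  }
}
```

With the default set at declaration, a cleanup loop that sleeps for getCleanupIntervalMs() between passes would still wait 5 seconds when no configuration is present, instead of spinning.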

> Auto adjust max queue length of app activities to make sure activities on all nodes can be covered
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-9623
>                 URL: https://issues.apache.org/jira/browse/YARN-9623
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: YARN-9623.001.patch, YARN-9623.002.patch
>
>
> Currently we can use the configuration entry "yarn.resourcemanager.activities-manager.app-activities.max-queue-length" to control the max queue length of app activities, but in some scenarios this configuration may need to be updated as the cluster grows. Moreover, it is better if users can ignore this configuration, so it should be adjusted automatically and internally.
>  The scheduling modes differ as follows:
>  * multi-node placement disabled
>  ** Heartbeat-driven scheduling: the max queue length of app activities should be no less than the number of nodes. Since node heartbeats do not always arrive in order, we should leave some room for misordering; for example, we can guarantee that the max queue length is not less than 1.2 * numNodes.
>  ** Async scheduling: every async scheduling thread goes through all nodes in order, so in this mode we should guarantee that the max queue length is at least numThreads * numNodes.
>  * multi-node placement enabled: activities on all nodes can be involved in a single app allocation, therefore there is no need to adjust for this mode.
> To sum up, we can adjust the max queue length of app activities like this:
> {code}
> int configuredMaxQueueLength;
> int maxQueueLength;
> serviceInit(){
>   ...
>   configuredMaxQueueLength = ...; //read configured max queue length
>   maxQueueLength = configuredMaxQueueLength; //take configured value as default
> }
> CleanupThread#run(){
>   ...
>   if (multiNodeDisabled) {
>     if (asyncSchedulingEnabled) {
>        maxQueueLength = max(configuredMaxQueueLength, numSchedulingThreads * numNodes);
>     } else {
>        maxQueueLength = max(configuredMaxQueueLength, (int) (1.2 * numNodes)); //cast needed: maxQueueLength is an int
>     }
>   } else if (maxQueueLength != configuredMaxQueueLength) {
>     maxQueueLength = configuredMaxQueueLength;
>   }
> }
> {code}
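
The adjustment rule in the quoted description can also be written as a small standalone method, which makes the three cases easy to check by hand. This is a hedged sketch, not the code from the attached patches: the class MaxQueueLengthSketch and the method adjustMaxQueueLength are hypothetical names, and rounding 1.2 * numNodes up with integer arithmetic is an assumption (the description does not say how to round).

```java
// Hypothetical helper mirroring the pseudocode in the issue description.
final class MaxQueueLengthSketch {

  static int adjustMaxQueueLength(int configuredMaxQueueLength,
      boolean multiNodeEnabled, boolean asyncSchedulingEnabled,
      int numSchedulingThreads, int numNodes) {
    if (multiNodeEnabled) {
      // All nodes can be covered in a single app allocation: no adjustment.
      return configuredMaxQueueLength;
    }
    if (asyncSchedulingEnabled) {
      // Each async scheduling thread walks all nodes in order.
      return Math.max(configuredMaxQueueLength,
          numSchedulingThreads * numNodes);
    }
    // Heartbeat-driven: leave 20% headroom for out-of-order heartbeats.
    // ceil(1.2 * n) computed as (6n + 4) / 5 to avoid floating point
    // (the rounding direction is an assumption of this sketch).
    int withHeadroom = (6 * numNodes + 4) / 5;
    return Math.max(configuredMaxQueueLength, withHeadroom);
  }

  public static void main(String[] args) {
    // Illustrative inputs only; 1000 is not claimed to be the real default.
    System.out.println(adjustMaxQueueLength(1000, false, false, 1, 2000));
    System.out.println(adjustMaxQueueLength(1000, false, true, 3, 2000));
    System.out.println(adjustMaxQueueLength(1000, true, true, 3, 2000));
  }
}
```

Returning the value instead of mutating a field keeps the rule free of side effects; a cleanup thread could call it on each pass and assign the result to maxQueueLength. Very large values of numSchedulingThreads * numNodes could overflow an int, which this sketch does not guard against.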


