Posted to yarn-issues@hadoop.apache.org by "Konstantinos Karanasos (JIRA)" <ji...@apache.org> on 2016/01/08 17:56:40 UTC

[jira] [Commented] (YARN-4412) Create ClusterMonitor to compute ordered list of preferred NMs for OPPORTUNISTIC containers

    [ https://issues.apache.org/jira/browse/YARN-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089499#comment-15089499 ] 

Konstantinos Karanasos commented on YARN-4412:
----------------------------------------------

Thank you for the patch, [~asuresh]!

Some first comments:
# I suggest not creating a ContainerQueueInfo for this JIRA. Let's assume the node sends only the estimated queue wait time. If more is needed, let's do it as part of YARN-2883.
# All code related to "stragglers" needs to be removed. It appears in several classes, including YarnConfiguration, DistributedSchedulingService, and TopKNodeSelector. (Some background: this is code I had added to test what happens when queue estimates are off.)
# In YarnConfiguration, I would rename "DIST_SCHEDULING_TOP_K_COMPUTE_INT_MS_DEFAULT" to something like "CLUSTER_MONITORING_INTERVAL", since it will also be used for corrective mechanisms (YARN-2888) and probably other stuff.
# Is the EventHandler needed? If I am not wrong, you factored out some code from the ResourceManager?
# I think we should add the option to order the nodes by their number of queued containers (we actually already have code for that). This will be useful when the estimated queue wait time is not available or not reliable. We can add a parameter in YarnConfiguration, and then have two node comparators selected by this parameter (one that uses the estimated wait time and another that uses the number of queued containers).
# Let's change the package of the DistributedSchedulingService in the JIRA where it was introduced rather than doing it in this patch (YARN-2885 if I'm not wrong).
# In the TopKNodeSelector, we should remove updateSuccessfulContainers() as part of this JIRA.
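
To make the fifth comment (the two node comparators) more concrete, here is a minimal sketch of what I have in mind. All names in it (NodeLoad, OrderingPolicy, comparatorFor, topK) are hypothetical and only for illustration; they are not taken from the patch or from existing YARN classes, and the policy value would in practice be read from a new YarnConfiguration parameter rather than passed in directly.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class NodeOrderingSketch {

  /** Hypothetical per-node load report; field names are illustrative only. */
  static final class NodeLoad {
    final String nodeId;
    final long estimatedQueueWaitMs;
    final int queuedContainers;

    NodeLoad(String nodeId, long estimatedQueueWaitMs, int queuedContainers) {
      this.nodeId = nodeId;
      this.estimatedQueueWaitMs = estimatedQueueWaitMs;
      this.queuedContainers = queuedContainers;
    }
  }

  /** Hypothetical value of the configuration parameter that selects the ordering. */
  enum OrderingPolicy { QUEUE_WAIT_TIME, QUEUE_LENGTH }

  /** Returns the comparator matching the configured policy (least loaded first). */
  static Comparator<NodeLoad> comparatorFor(OrderingPolicy policy) {
    switch (policy) {
      case QUEUE_LENGTH:
        return Comparator.comparingInt((NodeLoad n) -> n.queuedContainers);
      case QUEUE_WAIT_TIME:
      default:
        return Comparator.comparingLong((NodeLoad n) -> n.estimatedQueueWaitMs);
    }
  }

  /** Sorts a copy of the reports and returns the ids of the k least loaded nodes. */
  static List<String> topK(List<NodeLoad> nodes, OrderingPolicy policy, int k) {
    List<NodeLoad> sorted = new ArrayList<>(nodes);
    sorted.sort(comparatorFor(policy));
    List<String> ids = new ArrayList<>();
    for (NodeLoad n : sorted.subList(0, Math.min(k, sorted.size()))) {
      ids.add(n.nodeId);
    }
    return ids;
  }
}
```

With this shape, switching between the wait-time estimate and the queue length only changes which comparator is handed to the sort, so the rest of the selector stays untouched.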

> Create ClusterMonitor to compute ordered list of preferred NMs for OPPORTUNISTIC containers
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-4412
>                 URL: https://issues.apache.org/jira/browse/YARN-4412
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-4412-yarn-2877.v1.patch
>
>
> Introduce a Cluster Monitor that aggregates load information from individual Node Managers and computes an ordered list of preferred Node Managers to be used as target nodes for OPPORTUNISTIC container allocations.
> This list can be pushed out to the Node Manager (specifically the AMRMProxy running on the node) via the Allocate Response. It will be used to make local scheduling decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)