Posted to yarn-issues@hadoop.apache.org by "Inigo Goiri (JIRA)" <ji...@apache.org> on 2016/07/12 00:14:11 UTC

[jira] [Commented] (YARN-5356) ResourceUtilization should also include resource availability

    [ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371947#comment-15371947 ] 

Inigo Goiri commented on YARN-5356:
-----------------------------------

In general, we have 3 values:
* Actual resources of the full machine. This currently comes from {{NodeManagerHardwareUtils}}, if I remember correctly. For example, 12 cores.
* Resources available to the Node Manager. This is currently defined in yarn-site.xml with the key {{yarn.nodemanager.resource.cpu-vcores}} or set through {{updateNodeResource()}}. For example, 6 cores.
* Actual utilization of the machine. This is extracted in the {{NodeResourceMonitor}} with the {{ResourceCalculatorPlugin}}. It can be 400%, for example, which would mean that 4 of the 12 cores are in use.
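
To make the three values concrete, here is a minimal Python sketch of the arithmetic with the example numbers above. The function name is hypothetical, and the convention that 100% equals one fully busy core is an assumption taken from the 400% example; none of this is a YARN API.

```python
def utilization_fraction(cpu_percent, cores):
    """Fraction of `cores` in use, assuming 100% equals one fully busy core."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return (cpu_percent / 100.0) / cores


machine_cores = 12    # actual size of the machine (example value)
nm_vcores = 6         # yarn.nodemanager.resource.cpu-vcores (example value)
cpu_percent = 400.0   # utilization reported for the node (example value)

# The same reading looks very different depending on the denominator.
vs_machine = utilization_fraction(cpu_percent, machine_cores)
vs_nm = utilization_fraction(cpu_percent, nm_vcores)
print(f"vs machine: {vs_machine:.0%}, vs NM allocation: {vs_nm:.0%}")
```

The same 400% reading is about one third of the machine but two thirds of the NM allocation, which is why the denominator matters.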

[~nroberts], I understand that your problem is that with the current approach you know that the NM has 6 cores available and that 4 of them are used. However, the machine as a whole is not that heavily utilized (~33%, 4 of 12 cores). Correct? In that case, we would only need to report the actual size of the machine at registration time, as it never changes. I am not sure that {{ResourceUtilization}} would be the right place for it, as that is reported in every heartbeat.
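
As a rough illustration of that split, the sketch below sends the static machine size once at registration and keeps the heartbeat limited to the values that change. All class, field, and method names here are hypothetical and do not correspond to the actual YARN protocol records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    """Hypothetical one-time message: static facts about the node."""
    node_id: str
    machine_cores: int   # actual size of the machine, never changes
    nm_vcores: int       # capacity given to the NM


@dataclass(frozen=True)
class Heartbeat:
    """Hypothetical periodic message: only the values that change."""
    node_id: str
    cpu_percent: float   # assumes 100.0 means one fully busy core


class Tracker:
    """Hypothetical RM-side view that joins registration and heartbeat data."""

    def __init__(self):
        self._nodes = {}

    def register(self, reg):
        # Store the static data once; heartbeats never need to repeat it.
        self._nodes[reg.node_id] = reg

    def machine_utilization(self, hb):
        # Combine the latest heartbeat with the size stored at registration.
        reg = self._nodes[hb.node_id]
        return (hb.cpu_percent / 100.0) / reg.machine_cores


tracker = Tracker()
tracker.register(Registration("node-1", machine_cores=12, nm_vcores=6))
util = tracker.machine_utilization(Heartbeat("node-1", cpu_percent=400.0))
```

With this shape the heartbeat stays small, and the receiver can still tell whether 400% means a full or a lightly loaded machine.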


> ResourceUtilization should also include resource availability
> -------------------------------------------------------------
>
>                 Key: YARN-5356
>                 URL: https://issues.apache.org/jira/browse/YARN-5356
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Nathan Roberts
>
> Currently ResourceUtilization contains absolute quantities of resource used (e.g. 4096MB memory used). It would be good if it also included how much of that resource is actually available on the node so that the RM can use this data to schedule more effectively (overcommit, etc)
> Currently the only available information is the Resource the node registered with (or later updated using updateNodeResource). However, these aren't really sufficient to get a good view of how utilized a resource is. For example, if a node reports 400% CPU utilization, does that mean it's completely full, or barely utilized? Today there is no reliable way to figure this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you have thoughts/opinions on this?


