Posted to yarn-issues@hadoop.apache.org by "Sunil Govindan (JIRA)" <ji...@apache.org> on 2018/06/22 16:55:00 UTC

[jira] [Commented] (YARN-8446) Support of managing multi-dimensional resources

    [ https://issues.apache.org/jira/browse/YARN-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520585#comment-16520585 ] 

Sunil Govindan commented on YARN-8446:
--------------------------------------

This looks very interesting. Thanks [~cheersyang] for filing this. This will definitely help to support a wide variety of resources in YARN.
{code:java}
public enum ResourceTypes {
  COUNTABLE
}
{code}
A few questions:
 # So we have only the COUNTABLE resource type for now. As per this proposal, the additions are *SET* and *MULTIDIMENSIONAL*. I have some comments on this. Suppose we mark a resource as SET; for example, consider *IPAddress* as the resource name in this context.
 ## This works well if all values in the set are unique. But in a larger context, are we planning to allow * or similar special characters for each resource type? To be clearer: each resource type might need some specification to be fed in, * might be one such token with a specific meaning, and there could be more such specs for each resource.
 ## Also, such non-countable resources will be consumed by each container, and after use they have to be added back to the resource set. I am concerned about the performance cost here, as we might need to treat the resource as a critical section.
 ## As of today, a resource is tracked per node, per partition, or at the cluster level. We aggregate it for various uses, such as metrics. I am wondering about the semantic changes to APIs such as *Resources.add*, *Resources.multiply*, or *Resources.divide*. It would be better to keep such non-countable resources out of these APIs, but that said, we might still need to aggregate them to a higher level for the reason above. Could you please share some insights on this?
 # Earlier I commented in another Jira on multiple resources about the concept of shared resources. Does *MULTIDIMENSIONAL* also cover shared resources?

 ## Same question as 1.3: in *MULTIDIMENSIONAL*, how do we do the operations?
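
To make questions 1.2 and 1.3 concrete, here is a minimal sketch of what a SET resource might look like. Everything in it is an assumption for illustration: the class name SetResource, its methods, and the union-based meaning of "add" are hypothetical and are not part of YARN or of this proposal.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration only; SetResource is not an existing YARN class.
final class SetResource {
  private final Set<String> free = new LinkedHashSet<>();
  private final Set<String> used = new LinkedHashSet<>();

  SetResource(Collection<String> values) {
    free.addAll(values);
  }

  /** Hands out one free value, or empty if the set is exhausted. */
  synchronized Optional<String> allocate() {
    Iterator<String> it = free.iterator();
    if (!it.hasNext()) {
      return Optional.empty();
    }
    String value = it.next();
    it.remove();
    used.add(value);
    return Optional.of(value);
  }

  /** Returns a value to the free pool; false if it was not handed out here. */
  synchronized boolean release(String value) {
    if (!used.remove(value)) {
      return false;
    }
    free.add(value);
    return true;
  }

  synchronized int available() {
    return free.size();
  }

  /** One possible (assumed) meaning of "add" for SET: union of the free values. */
  static SetResource add(SetResource a, SetResource b) {
    Set<String> union = new LinkedHashSet<>();
    synchronized (a) {
      union.addAll(a.free);
    }
    synchronized (b) {
      union.addAll(b.free);
    }
    return new SetResource(union);
  }
}
```

Every allocate and release call takes the per-resource lock, which is the critical-section cost raised in 1.2. The union in add also shows the aggregation problem from 1.3: if two nodes both offer port "9981", the union keeps a single entry, so a cluster-level SET no longer says how many nodes can serve that port.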

> Support of managing multi-dimensional resources
> -----------------------------------------------
>
>                 Key: YARN-8446
>                 URL: https://issues.apache.org/jira/browse/YARN-8446
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Weiwei Yang
>            Priority: Major
>
> To better support long-running jobs and services, we need to extend YARN to support other resources, such as disk, IP, and port. The current resource types are not flexible enough to make this work because only the COUNTABLE type, which holds a single value, is supported.
>  
> We propose extending the resource types by adding two more general types, SET and MULTIDIMENSIONAL (naming TBD), with schemas like:
> *SET*:  a set of values
> {noformat}
> ["10.100.0.1", "10.100.0.2"]
> ["9981", "9982", "9983"]
> {noformat}
> *MULTIDIMENSIONAL*: a set of values, where each value can be a resource instance with multiple values.
> {noformat}
> [ disk1 : { attributes: { "type" : "SATA", "index" : 1 },  "size" : "500gb", "iops" : "1000" },
>   disk2 : { attributes: { "type" : "SSD", "index" : 2 },  "size" : "100gb", "iops" : "1000" } ]
> {noformat}
> This way, we could support better resource management and isolation. The idea is to make this as general as possible so we can easily support other complex resources.
>  
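
As a rough sketch of how operations on the MULTIDIMENSIONAL sample quoted above could behave, the following models each disk as one instance and requires a request to fit inside a single instance. The class names DiskInstance and DiskResource, the first-fit policy, and the use of plain gigabyte and IOPS numbers are all assumptions made for illustration; none of this is defined by the proposal or by YARN.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical model of one MULTIDIMENSIONAL instance; not a YARN API.
final class DiskInstance {
  final String name;
  final String type;   // attribute, e.g. "SATA" or "SSD"
  final long sizeGb;   // remaining capacity
  final long iops;     // remaining IOPS budget

  DiskInstance(String name, String type, long sizeGb, long iops) {
    this.name = name;
    this.type = type;
    this.sizeGb = sizeGb;
    this.iops = iops;
  }

  boolean fits(String wantedType, long wantedSizeGb, long wantedIops) {
    return type.equals(wantedType) && sizeGb >= wantedSizeGb && iops >= wantedIops;
  }

  DiskInstance minus(long usedSizeGb, long usedIops) {
    return new DiskInstance(name, type, sizeGb - usedSizeGb, iops - usedIops);
  }
}

// Hypothetical container for all instances of one MULTIDIMENSIONAL resource.
final class DiskResource {
  private final List<DiskInstance> instances = new ArrayList<>();

  DiskResource(List<DiskInstance> initial) {
    instances.addAll(initial);
  }

  /** First-fit: carves the request out of the first single instance that can hold it. */
  synchronized Optional<String> allocate(String type, long sizeGb, long iops) {
    for (int i = 0; i < instances.size(); i++) {
      DiskInstance d = instances.get(i);
      if (d.fits(type, sizeGb, iops)) {
        instances.set(i, d.minus(sizeGb, iops));
        return Optional.of(d.name);
      }
    }
    return Optional.empty();
  }

  /** Scalar aggregate across instances, in the style of a COUNTABLE sum. */
  synchronized long totalFreeSizeGb() {
    long total = 0;
    for (DiskInstance d : instances) {
      total += d.sizeGb;
    }
    return total;
  }
}
```

With the two sample disks (500 GB SATA, 100 GB SSD), one 60 GB SSD request leaves 540 GB free in total, yet a second 60 GB SSD request fails because only 40 GB of SSD remain. A summed value can therefore over-report what is actually allocatable, which is why aggregate operations on such types need separate semantics.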


