Posted to yarn-issues@hadoop.apache.org by "Daniel Templeton (JIRA)" <ji...@apache.org> on 2017/05/23 21:57:04 UTC

[jira] [Commented] (YARN-3409) Add constraint node labels

    [ https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16021942#comment-16021942 ] 

Daniel Templeton commented on YARN-3409:
----------------------------------------

Sorry for coming late to the conversation.  Last week I had a quick chat with [~Naganarasimha] offline about the plans, and I wanted to share an alternate perspective.

If you go look at the way HPC job schedulers (like Grid Engine et al) handle this requirement, it's an extension of resources.  The work that [~vvasudev] has done on resource types opens up a natural path to add "static" resource types with the characteristics described here.  The advantage is that the plumbing for resources is already very mature, and extending it to support static resources would not introduce much in the way of new logic.  The implementation of constraints then naturally becomes a superset of resource matching for the consumable resources.  The disadvantage that [~Naganarasimha] pointed out is that users would have to understand that resources can be static or consumable, which is a higher bar than just asserting that all resources are consumable. Given that all the major HPC job schedulers have been using static resources for this purpose successfully for decades, I don't see that being a major issue.

To add a little more detail, here's what Grid Engine does (that's relevant to us).  (See http://gridscheduler.sourceforge.net/htmlman/htmlman5/complex.html)
* All resources have a type, e.g. string, double, boolean, etc.
* All resources have an associated relational operator.  For example, the memory resource has >= as a relational operator, meaning that a request for 4GB of memory is treated as >= 4GB of memory.  In general, resources can only be meaningfully compared in one direction.
* All resources are either consumable or static.  Only numeric resources can be consumable.
* Memory and CPU (and a couple others) are provided implicitly by the system.
* It's possible to configure the agents to run scripts periodically to programmatically determine values for any resources. Consumable resources decrement from that value.
* The scheduler uses the relational operator for all resources to determine whether resource requests fit a destination queue/host.

Putting static resources and consumables in the same boat saves a fair bit of logic duplication in implementing things like programmatically determined values.
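
As a rough illustration of the model described above (a sketch only, not Grid Engine's or YARN's actual code), each resource type can carry a relational operator and a consumable flag, and one matching routine then covers both static and consumable resources. All names here (ResourceType, RESOURCE_TYPES, fits, allocate, and the resource names) are hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ResourceType:
    """Hypothetical resource definition: a name, a relational operator,
    and a flag saying whether allocations consume the resource."""
    name: str
    relop: Callable[[Any, Any], bool]  # applied as relop(offered, requested)
    consumable: bool = False


# Assumed example definitions; a real system would load these from config.
RESOURCE_TYPES: Dict[str, ResourceType] = {
    "memory_mb": ResourceType("memory_mb", operator.ge, consumable=True),
    "vcores": ResourceType("vcores", operator.ge, consumable=True),
    "glibc_version": ResourceType("glibc_version", operator.ge),  # static
    "cpu_arch": ResourceType("cpu_arch", operator.eq),            # static
}


def fits(request: Dict[str, Any], node: Dict[str, Any]) -> bool:
    """Return True if the node satisfies every requested resource,
    using each resource's own relational operator."""
    for name, wanted in request.items():
        rtype = RESOURCE_TYPES[name]
        if name not in node or not rtype.relop(node[name], wanted):
            return False
    return True


def allocate(request: Dict[str, Any], node: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the node with only consumable resources decremented."""
    if not fits(request, node):
        raise ValueError("request does not fit node")
    remaining = dict(node)
    for name, wanted in request.items():
        if RESOURCE_TYPES[name].consumable:
            remaining[name] -= wanted
    return remaining


node = {"memory_mb": 8192, "vcores": 4,
        "glibc_version": (2, 20), "cpu_arch": "x86_64"}
request = {"memory_mb": 4096, "glibc_version": (2, 17), "cpu_arch": "x86_64"}
```

Here the same fits() check handles the memory request and the glibc constraint; the only place the static/consumable distinction shows up is in allocate(), which leaves static values untouched.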

> Add constraint node labels
> --------------------------
>
>                 Key: YARN-3409
>                 URL: https://issues.apache.org/jira/browse/YARN-3409
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, capacityscheduler, client
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (IAW, partition a cluster) is a way to determine how the resources of a special set of nodes can be shared by a group of entities (like teams, departments, etc.). Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACL/priority can apply on a partition (Only market team / market team has priority to use the partition).
> - Percentage of capacities can apply on a partition (Market team has 40% minimum capacity and Dev team has 60% minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a node’s hardware/software just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for resources that have (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64).


