Posted to yarn-issues@hadoop.apache.org by "Zhankun Tang (JIRA)" <ji...@apache.org> on 2017/10/17 05:44:00 UTC

[jira] [Comment Edited] (YARN-6620) Add support in NodeManager to isolate GPU devices by using CGroups

    [ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207026#comment-16207026 ] 

Zhankun Tang edited comment on YARN-6620 at 10/17/17 5:43 AM:
--------------------------------------------------------------

[~wangda], thanks for the great effort. I'll implement the FPGA resource plugin (supporting only OpenCL FPGA applications for now) based on that.

One question: in the discussion above you mentioned that GPU is a mandatory resource like CPU and memory, but what is the difference between a mandatory resource and a first-class resource? Is there a list of first-class resources? Currently, if I understand correctly:
1. A first-class resource should be parsed from resource-types.xml and node-resources.xml (or auto-discovered) instead of yarn-site.xml?
2. A first-class resource handler should register itself under the same resource name that is defined in the XML files?
3. A first-class resource should be shown in a separate user-defined column in the web pages?
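
To make question 2 concrete, here is a minimal sketch of what "registering a handler under the configured resource name" could look like. Everything in it is hypothetical: ResourceHandlerRegistry, ResourceHandler, and the "fpga" name are illustrative placeholders, not classes or names from YARN or from the YARN-6620 patches.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; none of these names come from YARN or the YARN-6620 patches.
public class ResourceHandlerRegistry {

    /** Hypothetical handler interface; a real plugin would perform device isolation here. */
    public interface ResourceHandler {
        String describe();
    }

    private final Map<String, ResourceHandler> handlers = new HashMap<>();

    /** Registers a handler under the resource name assumed to be used in the XML files. */
    public void register(String resourceName, ResourceHandler handler) {
        if (handlers.putIfAbsent(resourceName, handler) != null) {
            throw new IllegalStateException("Handler already registered for " + resourceName);
        }
    }

    /** Returns the handler for a resource name, or null if none is registered. */
    public ResourceHandler lookup(String resourceName) {
        return handlers.get(resourceName);
    }

    public static void main(String[] args) {
        ResourceHandlerRegistry registry = new ResourceHandlerRegistry();
        // "fpga" is a placeholder; it is assumed to match the name in the XML configuration.
        registry.register("fpga", () -> "OpenCL FPGA handler");
        System.out.println(registry.lookup("fpga").describe());
        System.out.println(registry.lookup("gpu") == null);
    }
}
```

With this shape, a resource that appears in the XML files but has no registered handler simply resolves to null, so a mismatch between the two names is easy to detect.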


was (Author: tangzhankun):
[~wangda], thanks for the great effort. I'll implement the FPGA resource plugin (supporting only OpenCL FPGA applications for now) based on that.

One question: in the discussion above you mentioned that GPU is a mandatory resource like CPU and memory, but what is the difference between a mandatory resource and a first-class resource? Is there a list of first-class resources? Currently, if I understand correctly:
1. A first-class resource should be parsed from resource-types.xml and node-resources.xml instead of yarn-site.xml?
2. A first-class resource handler should register itself under the same resource name that is defined in the XML files?
3. A first-class resource should be shown in a separate user-defined column in the web pages?

> Add support in NodeManager to isolate GPU devices by using CGroups
> ------------------------------------------------------------------
>
>                 Key: YARN-6620
>                 URL: https://issues.apache.org/jira/browse/YARN-6620
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>             Fix For: 3.1.0
>
>         Attachments: YARN-6620.001.patch, YARN-6620.002.patch, YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, YARN-6620.009.patch, YARN-6620.010.patch, YARN-6620.011.patch, YARN-6620.012.patch, YARN-6620.013.patch, YARN-6620.014.patch, YARN-6620.015.patch, YARN-6620.016.patch, YARN-6620.017.patch
>
>
> This JIRA plans to add support for:
> 1) GPU configuration for NodeManagers
> 2) Isolation in CGroups (Java side)
> 3) NM restart and recovery of allocated GPU devices


