
Posted to issues@mesos.apache.org by "chester kuo (JIRA)" <ji...@apache.org> on 2015/02/06 04:41:34 UTC

[jira] [Updated] (MESOS-2262) Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave

     [ https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chester kuo updated MESOS-2262:
-------------------------------
    Summary: Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave  (was: Adding GPGPU resource into Mesos framework, so we can know if any GPGPU resource are available for master)

> Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-2262
>                 URL: https://issues.apache.org/jira/browse/MESOS-2262
>             Project: Mesos
>          Issue Type: Task
>          Components: framework, slave
>         Environment: Environments with OpenCL support, such as OS X, Linux, Windows
>            Reporter: chester kuo
>            Assignee: chester kuo
>            Priority: Minor
>
> Extend Mesos to support heterogeneous resources such as GPGPUs and FPGAs as computing resources in the data center. OpenCL will be the first target added to Mesos, since it is supported by all major GPU vendors; support for others, such as CUDA, is reserved for the future.
> With this feature, the slave will support resource discovery, including but not limited to:
> (1) Heterogeneous computing protocol type: "OpenCL", "CUDA", "HSA"
> (2) Computing global memory (MB)
> (3) Computing runtime version, such as "1.2", "2.0"
> (4) Computing compute units (double)
> (5) Computing device type: GPGPU, CPU, accelerator device
> (6) Number of computing devices (double)
> Heterogeneous resource isolation will be supported in the framework rather than on the slave device side. The main reason is that ecosystems such as OpenCL operate on top of private device drivers owned by vendors, and only the runtime library (OpenCL) is a user-space application, so it is hard for us to provide CPU/memory-style resource isolation as Linux cgroups do. As a result, we may use the runtime library to do device isolation and memory allocation.
> (PS: if anyone knows how to do this for GPGPU drivers, please drop me a note.)
> Meanwhile, some runtime libraries (such as OpenCL) can run on top of the CPU, so we need to use the isolator API to signal this once the CPU has been allocated.
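
For illustration only, the discovery items (1)-(6) above could be modelled roughly as follows. This is a minimal sketch, not part of the proposal: the `ComputeDevice` type, the `to_resource_string` helper, and the resource names it produces (for example `opencl_gpgpu_devices`) are all hypothetical. The output only imitates the `name:value;name:value` scalar form that Mesos accepts for slave resources, and the actual device probing through an OpenCL runtime is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ComputeDevice:
    """One discovered device. Field names are hypothetical and mirror items (1)-(5)."""
    protocol: str          # (1) e.g. "OpenCL", "CUDA", "HSA"
    global_mem_mb: int     # (2) global memory in MB
    runtime_version: str   # (3) e.g. "1.2", "2.0"; text, so not summed below
    compute_units: float   # (4) compute units
    device_type: str       # (5) "GPGPU", "CPU" or "Accelerator"


def to_resource_string(devices):
    """Sum devices per (protocol, device type) and render hypothetical
    scalar resources as "name:value;name:value"."""
    totals = {}
    for d in devices:
        prefix = f"{d.protocol}_{d.device_type}".lower()
        t = totals.setdefault(
            prefix, {"devices": 0.0, "compute_units": 0.0, "mem": 0}
        )
        t["devices"] += 1.0                 # (6) number of devices
        t["compute_units"] += d.compute_units
        t["mem"] += d.global_mem_mb
    parts = []
    for prefix in sorted(totals):
        for name, value in totals[prefix].items():
            parts.append(f"{prefix}_{name}:{value}")
    return ";".join(parts)


if __name__ == "__main__":
    # Made-up example values, not measurements from real hardware.
    found = [
        ComputeDevice("OpenCL", 2048, "1.2", 8.0, "GPGPU"),
        ComputeDevice("OpenCL", 2048, "1.2", 8.0, "GPGPU"),
    ]
    print(to_resource_string(found))
```

The runtime version is kept on the record but not aggregated, since it is text rather than a scalar; a slave attribute would probably suit it better than a resource.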



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)