Posted to issues@mesos.apache.org by "Chun-Hung Hsiao (JIRA)" <ji...@apache.org> on 2019/02/19 22:57:00 UTC

[jira] [Comment Edited] (MESOS-9549) nvidia/cuda 10 does not work on GPU isolator

    [ https://issues.apache.org/jira/browse/MESOS-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772405#comment-16772405 ] 

Chun-Hung Hsiao edited comment on MESOS-9549 at 2/19/19 10:56 PM:
------------------------------------------------------------------

Instead of using the {{maintainer}} label, it may be more generic to use either of the following approaches:
# Check for the [{{NVIDIA_REQUIRE_CUDA}}|https://github.com/NVIDIA/nvidia-container-runtime/blob/master/README.md#nvidia_require_cuda] environment variable as another workaround.
# Check the [{{NVIDIA_DRIVER_CAPABILITIES}}|https://github.com/NVIDIA/nvidia-container-runtime/blob/master/README.md#nvidia_require_cuda] environment variable, since this is what actually controls the libnvidia-container CLI, which the nvidia runtime uses to prepare the binaries/libraries: https://github.com/NVIDIA/libnvidia-container/blob/master/src/nvc_info.c
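
To make the two options concrete, here is a rough sketch of such a check, written in Python for brevity. It is not Mesos code: it assumes the image's environment is available as Docker-style "KEY=VALUE" strings, and the helper names ({{parse_image_env}}, {{wants_nvidia_volume}}, {{requested_capabilities}}) are hypothetical.

```python
# Hypothetical sketch, not the Mesos implementation. Assumes the image
# config exposes its environment as a list of "KEY=VALUE" strings.

def parse_image_env(env_entries):
    """Turn "KEY=VALUE" strings into a dict, splitting on the first '='."""
    env = {}
    for entry in env_entries:
        key, _, value = entry.partition("=")
        env[key] = value
    return env


def wants_nvidia_volume(env_entries):
    """Return True if the image declares either NVIDIA variable."""
    env = parse_image_env(env_entries)
    return "NVIDIA_REQUIRE_CUDA" in env or "NVIDIA_DRIVER_CAPABILITIES" in env


def requested_capabilities(env_entries):
    """Return the comma-separated capabilities the image asks for, as a set."""
    env = parse_image_env(env_entries)
    raw = env.get("NVIDIA_DRIVER_CAPABILITIES", "")
    return {cap.strip() for cap in raw.split(",") if cap.strip()}


# Illustrative values only; the real variables of a given image may differ.
example_env = [
    "PATH=/usr/local/nvidia/bin:/usr/bin",
    "NVIDIA_DRIVER_CAPABILITIES=compute,utility",
    "NVIDIA_REQUIRE_CUDA=cuda>=10.0",
]
print(wants_nvidia_volume(example_env))
print(sorted(requested_capabilities(example_env)))
```

Splitting on the first '=' only matters here because a {{NVIDIA_REQUIRE_CUDA}} value can itself contain '=' (as in the illustrative {{cuda>=10.0}} above).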


was (Author: chhsia0):
Instead of using the {{maintainer}} label, maybe it is more generic to use either of the approaches:
# Check for the [{{NVIDIA_REQUIRE_CUDA}}|https://github.com/NVIDIA/nvidia-container-runtime/blob/master/README.md#nvidia_require_cuda] environment variable as another workaround.
# Check the [{{NVIDIA_DRIVER_CAPABILITIES}}|https://github.com/NVIDIA/nvidia-container-runtime/blob/master/README.md#nvidia_require_cuda] environment variable, is this is the actual controller for the libnvidia-container CLI, which is used by the nvidia runtime to prepare the binaries/libraries: https://github.com/NVIDIA/libnvidia-container/blob/master/src/nvc_info.c

> nvidia/cuda 10 does not work on GPU isolator
> --------------------------------------------
>
>                 Key: MESOS-9549
>                 URL: https://issues.apache.org/jira/browse/MESOS-9549
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Jie Yu
>            Priority: Major
>
> I verified that nvidia/cuda 9 (i.e., 9.2-devel-ubuntu18.04) works with GPU isolator.
> The unit test NvidiaGpuTest.ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage captures this, and is currently failing on GPU hosts since it uses the latest nvidia/cuda image.
> It fails with
> {format}
> sh: 1: nvidia-smi: not found
> {format}


