Posted to issues@mesos.apache.org by "Chun-Hung Hsiao (JIRA)" <ji...@apache.org> on 2019/02/19 22:57:00 UTC
[jira] [Comment Edited] (MESOS-9549) nvidia/cuda 10 does not work on GPU isolator
[ https://issues.apache.org/jira/browse/MESOS-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772405#comment-16772405 ]
Chun-Hung Hsiao edited comment on MESOS-9549 at 2/19/19 10:56 PM:
------------------------------------------------------------------
Instead of relying on the {{maintainer}} label, it may be more generic to use either of the following approaches:
# Check for the [{{NVIDIA_REQUIRE_CUDA}}|https://github.com/NVIDIA/nvidia-container-runtime/blob/master/README.md#nvidia_require_cuda] environment variable as another workaround.
# Check the [{{NVIDIA_DRIVER_CAPABILITIES}}|https://github.com/NVIDIA/nvidia-container-runtime/blob/master/README.md#nvidia_require_cuda] environment variable, as this is what actually controls the libnvidia-container CLI, which the nvidia runtime uses to prepare the binaries/libraries: https://github.com/NVIDIA/libnvidia-container/blob/master/src/nvc_info.c
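To illustrate the two approaches above, here is a minimal sketch in Python (not Mesos code) of how such a check might look. It assumes the image's environment is available as a list of "KEY=value" strings, as in a Docker image config; the helper names {{parse_env}} and {{wants_nvidia_libraries}} are hypothetical, and treating an empty value as "not set" is an assumption of this sketch rather than documented runtime behavior.

```python
# Hypothetical sketch, not the Mesos implementation: decide whether an image
# asks for NVIDIA libraries by looking at its environment variables instead
# of its `maintainer` label.

# The two variables proposed in the comment above.
NVIDIA_ENV_KEYS = ("NVIDIA_REQUIRE_CUDA", "NVIDIA_DRIVER_CAPABILITIES")


def parse_env(entries):
    """Turn a list of "KEY=value" strings into a dict.

    Only the first "=" separates key and value, so values such as
    "cuda>=10.0" are preserved intact.
    """
    env = {}
    for entry in entries:
        key, _, value = entry.partition("=")
        env[key] = value
    return env


def wants_nvidia_libraries(image_env):
    """Return True if either variable is present with a non-empty value.

    Assumption of this sketch: an empty value counts as "not set".
    """
    env = parse_env(image_env)
    return any(env.get(key) for key in NVIDIA_ENV_KEYS)


if __name__ == "__main__":
    # Illustrative values only; real images may set different ones.
    sample = ["PATH=/usr/local/bin:/usr/bin", "NVIDIA_REQUIRE_CUDA=cuda>=10.0"]
    print(wants_nvidia_libraries(sample))
```

Checking either variable would avoid depending on image metadata that NVIDIA may change between releases, though which variable is the better signal is still the open question here.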
was (Author: chhsia0):
Instead of using the {{maintainer}} label, maybe it is more generic to use either of the approaches:
# Check for the [{{NVIDIA_REQUIRE_CUDA}}|https://github.com/NVIDIA/nvidia-container-runtime/blob/master/README.md#nvidia_require_cuda] environment variable as another workaround.
# Check the [{{NVIDIA_DRIVER_CAPABILITIES}}|https://github.com/NVIDIA/nvidia-container-runtime/blob/master/README.md#nvidia_require_cuda] environment variable, is this is the actual controller for the libnvidia-container CLI, which is used by the nvidia runtime to prepare the binaries/libraries: https://github.com/NVIDIA/libnvidia-container/blob/master/src/nvc_info.c
> nvidia/cuda 10 does not work on GPU isolator
> --------------------------------------------
>
> Key: MESOS-9549
> URL: https://issues.apache.org/jira/browse/MESOS-9549
> Project: Mesos
> Issue Type: Bug
> Reporter: Jie Yu
> Priority: Major
>
> I verified that nvidia/cuda 9 (i.e., 9.2-devel-ubuntu18.04) works with GPU isolator.
> The unit test NvidiaGpuTest.ROOT_INTERNET_CURL_CGROUPS_NVIDIA_GPU_NvidiaDockerImage captures this, and is currently failing on GPU hosts since it uses the latest nvidia/cuda image.
> It fails with
> {format}
> sh: 1: nvidia-smi: not found
> {format}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)