Posted to user@mesos.apache.org by Royce Cheng-Yue <ro...@drive.ai> on 2016/06/30 21:46:50 UTC

Retrieving Allocated GPU Resource ID for Task

Hi everyone,

So far, I'm able to run a Mesos cluster with GPU resource allocation and
can issue commands using mesos-execute; however, the commands I am planning
to run require the allocated GPU resource IDs. Specifically, before our
command executes, we need to set an environment variable which specifies
the GPU ID to run the command on. Is there a way to retrieve the GPU ID
after allocation and use the ID in our command before task execution?

Thanks,
Royce

-- 
 

The information in this email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this email 
by anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be 
taken in reliance on it, is prohibited and may be unlawful.

Re: Retrieving Allocated GPU Resource ID for Task

Posted by Royce Cheng-Yue <ro...@drive.ai>.
Hi Kevin,

Thanks! Yeah, I was looking for the minor numbers and ended up retrieving
them using nvidia-smi. Appreciate the help!

Royce

On Thu, Jun 30, 2016 at 6:02 PM, Kevin Klues <kl...@gmail.com> wrote:

> What is the GPU ID you are referring to?
>
> The UUID of the GPU? The short ID listed by `nvidia-smi` when listing
> GPUs? The minor number associated with the underlying /dev device (which
> may be different than the number appearing at the end of /dev/nvidia*). Or
> do you just care about the number on /dev/nvidia* so that you can detect
> which device you actually have access to on the file system?
>
> Either way, there is no built-in support in Mesos for getting any of these
> values. However, you could easily run some script as a "pre-command" to get
> at any of these numbers.
>
> For the UUID and short ID from `nvidia-smi`:
> $ nvidia-smi -L
>
> For the minor numbers:
> $ nvidia-smi -q | grep Minor
> Minor Number : 0
> Minor Number : 1
> Minor Number : 2
> Minor Number : 3
>
> For the actual /dev devices you have access to:
> Loop through and call `touch` on each /dev/nvidia* device and see which
> ones don't give you an error.
>
> On Thu, Jun 30, 2016 at 2:47 PM Royce Cheng-Yue <ro...@drive.ai> wrote:
>
>> Hi everyone,
>>
>> So far, I'm able to run a Mesos cluster with GPU resource allocation and
>> can issue commands using mesos-execute; however, the commands I am planning
>> to run require the allocated GPU resource IDs. Specifically, before our
>> command executes, we need to set an environment variable which specifies
>> the GPU ID to run the command on. Is there a way to retrieve the GPU ID
>> after allocation and use the ID in our command before task execution?
>>
>> Thanks,
>> Royce
>

Re: Retrieving Allocated GPU Resource ID for Task

Posted by Kevin Klues <kl...@gmail.com>.
What is the GPU ID you are referring to?

The UUID of the GPU? The short ID listed by `nvidia-smi` when listing GPUs?
The minor number associated with the underlying /dev device (which may be
different than the number appearing at the end of /dev/nvidia*). Or do you
just care about the number on /dev/nvidia* so that you can detect which
device you actually have access to on the file system?

Either way, there is no built-in support in Mesos for getting any of these
values. However, you could easily run some script as a "pre-command" to get
at any of these numbers.

For the UUID and short ID from `nvidia-smi`:
$ nvidia-smi -L
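
If a script needs those values rather than a human, the listing can be parsed. Here is a minimal sketch, assuming each line has the usual `GPU <index>: <name> (UUID: <uuid>)` shape; the function name `gpu_uuids` is made up for illustration and is not part of Mesos or nvidia-smi:

```shell
# Hypothetical helper: read `nvidia-smi -L` output on stdin and print
# one "<index> <uuid>" pair per GPU line. Lines that do not match the
# assumed shape are silently skipped.
gpu_uuids() {
  sed -n 's/^GPU \([0-9][0-9]*\):.*(UUID: \([^)]*\)).*$/\1 \2/p'
}

# Usage (assumes nvidia-smi is on PATH where the task runs):
#   nvidia-smi -L | gpu_uuids
```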

For the minor numbers:
$ nvidia-smi -q | grep Minor
Minor Number : 0
Minor Number : 1
Minor Number : 2
Minor Number : 3
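
Since the question was about setting an environment variable before the command runs, the minor numbers can be collected into a single value in the pre-command. This is only a sketch: `minor_numbers`, `GPU_IDS`, and `your_command` are placeholder names, and it assumes the `Minor Number : N` lines look like the output above:

```shell
# Hypothetical helper: read `nvidia-smi -q` output on stdin and print
# the minor numbers as one comma-separated line, e.g. "0,1,2,3".
minor_numbers() {
  awk -F: '/Minor Number/ { gsub(/[ \t]/, "", $2); print $2 }' | paste -sd, -
}

# Usage as a pre-command (assumes nvidia-smi is on PATH where the task
# runs; GPU_IDS stands in for whatever variable your command expects):
#   GPU_IDS="$(nvidia-smi -q | minor_numbers)"
#   export GPU_IDS
#   your_command
```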

For the actual /dev devices you have access to:
Loop through and call `touch` on each /dev/nvidia* device and see which
ones don't give you an error.
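
A rough sketch of that loop follows. The function name `accessible_devices` is made up, and whether `touch` actually fails on a device the task is not allowed to use depends on how isolation is enforced and on the `touch` implementation, so treat this as a starting point rather than a reliable check:

```shell
# Hypothetical helper: print each given path on which `touch` succeeds.
# Paths that do not exist are skipped first, so an unmatched glob or a
# stale path never causes a new file to be created.
accessible_devices() {
  for dev in "$@"; do
    [ -e "$dev" ] || continue
    if touch "$dev" 2>/dev/null; then
      printf '%s\n' "$dev"
    fi
  done
}

# Usage (the glob is an assumption about how the device nodes are named):
#   accessible_devices /dev/nvidia[0-9]*
```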

On Thu, Jun 30, 2016 at 2:47 PM Royce Cheng-Yue <ro...@drive.ai> wrote:

> Hi everyone,
>
> So far, I'm able to run a Mesos cluster with GPU resource allocation and
> can issue commands using mesos-execute; however, the commands I am planning
> to run require the allocated GPU resource IDs. Specifically, before our
> command executes, we need to set an environment variable which specifies
> the GPU ID to run the command on. Is there a way to retrieve the GPU ID
> after allocation and use the ID in our command before task execution?
>
> Thanks,
> Royce