Posted to yarn-issues@hadoop.apache.org by "Szilard Nemeth (JIRA)" <ji...@apache.org> on 2019/02/05 12:33:00 UTC

[jira] [Commented] (YARN-9139) Simplify initializer code of GpuDiscoverer

    [ https://issues.apache.org/jira/browse/YARN-9139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760748#comment-16760748 ] 

Szilard Nemeth commented on YARN-9139:
--------------------------------------

I changed how the existence of the GPU discovery binary is validated:
With the old code, GpuDiscoverer.initialize() did not throw an exception if the binary was not found; the exception was thrown only later, when GpuDiscoverer.getGpusUsableByYarn was called. With the new code, the exception is thrown from initialize() itself.

Most of the tests in TestGpuResourceHandler relied on the exception being thrown only later, from GpuDiscoverer.getGpusUsableByYarn. As a result, patch002 introduced failures in almost all of these testcases: they only call initialize, and the exception is now thrown at that earlier stage (in a fail-fast way).
If the Configuration object has no explicit setting for the path, binaryPath takes the value "/usr/local/nvidia/bin/nvidia-smi". Jenkins and most development environments most likely do not have nvidia-smi set up under that default path, so I had to modify all the tests to provide the path explicitly through the Configuration object, which makes the tests independent of the runtime environment.
Patch003 fixes these issues.
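
To illustrate the fail-fast behaviour described above, here is a minimal standalone sketch. It is not the actual GpuDiscoverer code from the patch: the class name, the PATH_KEY configuration key, the use of a plain Map in place of Hadoop's Configuration object, and the exception type are all assumptions made so the example runs on its own. Only the default path is taken from the comment above.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for GpuDiscoverer; illustrates fail-fast validation only.
public class FailFastDiscovererSketch {
  // Default path as mentioned in the comment above.
  static final String DEFAULT_BINARY_PATH = "/usr/local/nvidia/bin/nvidia-smi";
  // Hypothetical configuration key, not the real YARN property name.
  static final String PATH_KEY = "sketch.gpu.discovery-binary-path";

  private String binaryPath;

  // Validates the binary during initialization instead of deferring the
  // check to the first call that actually needs the binary.
  void initialize(Map<String, String> conf) {
    String path = conf.getOrDefault(PATH_KEY, DEFAULT_BINARY_PATH);
    File binary = new File(path);
    if (!binary.isFile()) {
      // Fail fast: the caller learns about the missing binary immediately.
      throw new IllegalStateException(
          "GPU discovery binary not found at: " + path);
    }
    binaryPath = binary.getAbsolutePath();
  }

  String getBinaryPath() {
    return binaryPath;
  }

  public static void main(String[] args) throws IOException {
    // A test can stay independent of the host by pointing the configuration
    // at a file it creates itself rather than relying on the default path.
    File fakeBinary = File.createTempFile("nvidia-smi", "");
    fakeBinary.deleteOnExit();

    Map<String, String> conf = new HashMap<>();
    conf.put(PATH_KEY, fakeBinary.getAbsolutePath());

    FailFastDiscovererSketch discoverer = new FailFastDiscovererSketch();
    discoverer.initialize(conf);
    System.out.println("Initialized with " + discoverer.getBinaryPath());
  }
}
```

With this shape, a test that only calls initialize must supply a path that exists in its own environment; otherwise it fails at initialization rather than at a later call.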

> Simplify initializer code of GpuDiscoverer
> ------------------------------------------
>
>                 Key: YARN-9139
>                 URL: https://issues.apache.org/jira/browse/YARN-9139
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-9139.001.patch, YARN-9139.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org