Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/07/17 02:41:10 UTC

[GitHub] [incubator-mxnet] DickJC123 commented on a change in pull request #15551: Bypass cuda/cudnn checks if no driver.

URL: https://github.com/apache/incubator-mxnet/pull/15551#discussion_r304199281
 
 

 ##########
 File path: src/common/cuda_utils.cc
 ##########
 @@ -44,8 +44,15 @@ namespace cuda {
 // Dynamic init here will emit a warning if runtime and compile-time cuda lib versions mismatch.
 // Also if the user has recompiled their source to a version no longer tested by upstream CI.
 bool cuda_version_check_performed = []() {
-  // Don't bother with checks if there are no GPUs visible (e.g. with CUDA_VISIBLE_DEVICES="")
-  if (dmlc::GetEnv("MXNET_CUDA_VERSION_CHECKING", true) && Context::GetGPUCount() > 0) {
+  // MXNet might be built on a machine with a cuda toolkit, but no GPUs or GPU driver.
+  // To allow that machine to execute say: python -c 'import mxnet; print(mxnet.__version__)',
+  // we won't perform a check if there is no driver.  Any actual attempt to use the cuda API's
+  // will yield the desired message: CUDA driver version is insufficient for CUDA runtime version.
+  int cuda_driver_version = 0;
+  CUDA_CALL(cudaDriverGetVersion(&cuda_driver_version));
+  // Also, don't bother with checks if there are no GPUs visible (e.g. with CUDA_VISIBLE_DEVICES="")
+  if (dmlc::GetEnv("MXNET_CUDA_VERSION_CHECKING", true) && cuda_driver_version > 0
+                                                        && Context::GetGPUCount() > 0) {
 
 Review comment:
   Per your suggestion, I have reworked the PR so that GetGPUCount returns 0 if cuda_driver_version == 0.
   
   Also, I now think the best way to avoid impacting non-GPU platforms is to perform the cuda/cudnn checks at the point where the user creates a GPU context (as opposed to the current approach, which runs them during dynamic initialization of libmxnet.so).
   
   Since context creation is defined in ./include/mxnet/base.h, and since I need a non-header file to ensure that only one lib version warning is emitted, I've moved my prior work from ./src/common/cuda_utils.cc to a new file, ./src/base.cc.  This follows the code placement of (for example) resource.h/resource.cc.
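
To make the proposed flow concrete, here is a minimal, CUDA-free sketch of the idea, not the actual code in the PR. All names in it (FakeCudaEnv, GetGPUCountSketch, CheckLibVersionsOnce, CreateGpuContextSketch, g_checks_run) are hypothetical stand-ins. In the real change the driver version would come from cudaDriverGetVersion and the checks would live in ./src/base.cc.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical stand-in for the machine state. In MXNet these values would
// come from cudaDriverGetVersion() and cudaGetDeviceCount().
struct FakeCudaEnv {
  int driver_version;   // 0 means no GPU driver is installed
  int visible_devices;  // device count the runtime would report with a driver
};

// Report zero GPUs whenever there is no driver, so callers never reach the runtime.
int GetGPUCountSketch(const FakeCudaEnv& env) {
  assert(env.driver_version >= 0 && env.visible_devices >= 0);
  return env.driver_version == 0 ? 0 : env.visible_devices;
}

// Counts how often the check body actually ran (illustration only).
int g_checks_run = 0;

// Run the placeholder version checks at most once per process. Defining this
// in a single non-header translation unit is what guarantees a single warning.
void CheckLibVersionsOnce() {
  static const bool done = []() {
    ++g_checks_run;
    std::cerr << "cuda/cudnn version checks would run here\n";
    return true;
  }();
  (void)done;  // silence unused-variable warnings
}

// Hypothetical GPU-context factory: the checks fire here, not at library load.
bool CreateGpuContextSketch(const FakeCudaEnv& env, int dev_id) {
  if (dev_id < 0 || dev_id >= GetGPUCountSketch(env)) {
    return false;  // no usable GPU, so no checks and no warning
  }
  CheckLibVersionsOnce();
  return true;
}
```

With this arrangement a driverless machine never triggers the checks, and a GPU machine triggers them exactly once, on the first successful GPU context creation. The function-local static reuses the lambda-initialized bool pattern from the diff above, but moves it from load time to first use.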

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services