You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/14 02:23:27 UTC

[GitHub] [incubator-mxnet] ChaiBapchya opened a new pull request #17297: Fix NCCL Cmake autodetect issue

ChaiBapchya opened a new pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297
 
 
   ## Description ##
   Fixes https://github.com/apache/incubator-mxnet/issues/17239
   
   ## Checklist ##
   ### Essentials ###
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] Code is well-documented: 
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   modified:   cmake/Modules/FindNCCL.cmake
   
   ## Test ##
   Tested with following command
   ```
   cmake -GNinja -DUSE_CUDA=ON -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DCMAKE_BUILD_TYPE=Release -DUSE_CUDNN=ON -DUSE_NCCL=ON ..
   ```
   
   Without this branch, it failed with NCCL not found
   ```
   Could NOT find NCCL (missing: NCCL_INCLUDE_DIRS NCCL_LIBRARIES)
   CMake Warning at CMakeLists.txt:636 (message):
     Could not find NCCL libraries
   ```
   
   With this PR, it passes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367382186
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   Quoting from https://cmake.org/cmake/help/latest/command/find_path.html
   
   > This command is used to find a directory containing the named file. A cache entry named by `<VAR>` is created to store the result of this command. If the file in a directory is found the result is stored in the variable and the search will not be repeated unless the variable is cleared. If nothing is found, the result will be `<VAR>-NOTFOUND`, and the search will be attempted again the next time find_path is invoked with the same variable.
   
   Thus if there is some `nccl.h` at `/usr/local/cuda` but the user specifies `NCCL_ROOT_DIR=/home/ubuntu/myversionofnccl`, the `/usr/local/cuda` is wrongly used? Or do I misunderstand the doc? 
   
   Btw, as per https://cmake.org/cmake/help/latest/policy/CMP0074.html `NCCL_ROOT_DIR` should probably be called `NCCL_ROOT` (or `NCCL_ROOT` should be added in addition to `NCCL_ROOT_DIR`).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369890499
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   @leezu Setting empty string to a cache variable does serve a purpose (declaring a variable so as to check if it is set or not in future)
   So if NCCL_ROOT_DIR is not already set then it assigns a default value to it (right now since we have deleted this line, if variable is not set via command line, it won't exist -> hence can't use it in this file)
   
   What do you think?
   
   https://stackoverflow.com/questions/59867283/whats-the-point-of-using-empty-string-to-set-cache-variable-in-cmake/59868538#59868538

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367579230
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   > I saw this NCCL_ROOT and found that NCCL_ROOT_DIR is deprecated hence no longer to be used in PyTorch repo
   > https://github.com/pytorch/pytorch/blob/f94aab45fd7b1e7e32d8ea4324da6d3632a0d412/cmake/Modules/FindNCCL.cmake#L21
   > 
   > However, I couldn't find official cmake doc saying so. Nonetheless, I think I should add that too.
   
   Thanks. The official doc should be https://cmake.org/cmake/help/latest/policy/CMP0074.html 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r368081829
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   1. Autocheck will be done provided `NO_DEFAULT_PATH` or `NO_PACKAGE_ROOT_PATH` are not set. So I'm assuming that's the case.
   Refer : https://stackoverflow.com/questions/59783425/does-cmake-find-path-find-library-search-for-packagename-root-by-default/59788313#59788313
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r366134308
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,26 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# if CUDAToolkit_FOUND is not found, try default location
+#if (NOT CUDAToolkit_FOUND)
 
 Review comment:
   Space?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on issue #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#issuecomment-575742882
 
 
   Wow! It passed! Weird! @leezu thanks for retrigger

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369890499
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   @leezu Setting empty string to a cache variable does serve a purpose (declaring a variable so as to check if it is set or not in future)
   So if NCCL_ROOT_DIR is not already set then it assigns a default value to it (right now since we have deleted this line, if variable is not set via command line, it won't exist -> hence can't use it in this file)
   
   Conclusion - we should have that line of empty string as default value for a cache variable.
   
   What do you think?
   
   https://stackoverflow.com/questions/59867283/whats-the-point-of-using-empty-string-to-set-cache-variable-in-cmake/59868538#59868538

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367149908
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   "first check" may break the assumption that `NCCL_ROOT_DIR` can be used to specify nccl root directory?
   
   Would adding `/usr/local/cuda` as last item in `find_path`, `find_library` below be sufficient? Or does it throw an error if `/usr/local/cuda` doesn't exist?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r368281540
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   1. Yes, you should remove the manual checks for `NCCL_ROOT`.
   
   2. I don't think it's superfluous (ie it does something), but it seems not necessary because it seems non-standard (ie other Find modules don't check the environment variables, so it's not expected that you check them here)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r368082784
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -55,6 +67,23 @@ find_library(NCCL_LIBRARIES
   $ENV{NCCL_DIR}/lib
   )
 
+# if not found in any of the above paths, finally, check in the /usr/local/cuda for UNIX systems
+if (UNIX)
 
 Review comment:
   A variable will be set to not found. The calling cmake module needs to check that variable and error out if nccl is required. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369303550
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   So I am guessing I should remove it here too?
   
   https://github.com/ChaiBapchya/incubator-mxnet/blob/a9abba5a934755813481e6583435298a0c317957/cmake/Modules/FindNVTX.cmake#L18
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367813676
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   1. NCCL_ROOT auto check - I guess it is done automatically. Though it's not mentioned explicitly anywhere. Hence asked here - https://stackoverflow.com/questions/59783425/does-cmake-find-path-find-library-search-for-packagename-root-by-default
   
   2. > Also it seems none of the other Find modules check the X_LIB_DIR and X_LIB_DIR env variables.
   
   Not sure what you implying? You think that is superfluous too? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r366134380
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,26 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# if CUDAToolkit_FOUND is not found, try default location
+#if (NOT CUDAToolkit_FOUND)
+  if (UNIX)
+    set (search_paths "/usr/local/cuda")
+
+    find_path(NCCL_INCLUDE_DIRS
+    NAMES nccl.h
+    PATHS ${search_paths}
+    PATH_SUFFIXES include
+    )
+
+    find_library(NCCL_LIBRARIES
+    NAMES nccl
+    PATHS ${search_paths}
+    PATH_SUFFIXES lib
+    )
+
+  endif()
+  #endif()
 
 Review comment:
   Remove 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367575646
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   > Btw, as per https://cmake.org/cmake/help/latest/policy/CMP0074.html `NCCL_ROOT_DIR` should probably be called `NCCL_ROOT` (or `NCCL_ROOT` should be added in addition to `NCCL_ROOT_DIR`).
   
   
   Ya. Even I saw this NCCL_ROOT and found that NCCL_ROOT_DIR is deprecated hence no longer to be used in PyTorch repo
   https://github.com/pytorch/pytorch/blob/f94aab45fd7b1e7e32d8ea4324da6d3632a0d412/cmake/Modules/FindNCCL.cmake#L21
   
   However, I couldn't find official cmake doc saying so. Nonetheless, I think I should add that too.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369303550
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   So I am guessing I should remove it here too?
   https://github.com/apache/incubator-mxnet/blob/a9abba5a934755813481e6583435298a0c317957/cmake/Modules/FindNVTX.cmake#L18
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on issue #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#issuecomment-574798885
 
 
   @mxnet-label-bot [pr-awaiting-review]

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367682940
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
+if (UNIX)
+  set (search_paths "/usr/local/cuda")
+
+  find_path(NCCL_INCLUDE_DIRS
+  NAMES nccl.h
 
 Review comment:
   yes good catch! indentation needed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367568453
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
+if (UNIX)
+  set (search_paths "/usr/local/cuda")
+
+  find_path(NCCL_INCLUDE_DIRS
+  NAMES nccl.h
 
 Review comment:
   Not familiar with cmake, but do you need add indentation here?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r366166613
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,26 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# if CUDAToolkit_FOUND is not found, try default location
+#if (NOT CUDAToolkit_FOUND)
+  if (UNIX)
+    set (search_paths "/usr/local/cuda")
+
+    find_path(NCCL_INCLUDE_DIRS
+    NAMES nccl.h
+    PATHS ${search_paths}
 
 Review comment:
   Should this be HINTS instead of PATHS?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on issue #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#issuecomment-575524044
 
 
   Also @leezu any idea why it's failing for Windows CPU Cmake. Unable to discern from the Error log.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369899466
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   Yes, I agree with the Stackoverflow answer. In your case, it would "assigns description" to the variable. But as these are deprecated variables I don't think we need to add description.
   The standard / supported variable is `NCCL_ROOT` as per https://cmake.org/cmake/help/latest/policy/CMP0074.html

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu merged pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu merged pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r366134334
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,26 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# if CUDAToolkit_FOUND is not found, try default location
+#if (NOT CUDAToolkit_FOUND)
+  if (UNIX)
+    set (search_paths "/usr/local/cuda")
+
+    find_path(NCCL_INCLUDE_DIRS
+    NAMES nccl.h
 
 Review comment:
   Indent

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367568504
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
+if (UNIX)
+  set (search_paths "/usr/local/cuda")
+
+  find_path(NCCL_INCLUDE_DIRS
+  NAMES nccl.h
+  PATHS ${search_paths}
+  PATH_SUFFIXES include
+  )
+
+  find_library(NCCL_LIBRARIES
+  NAMES nccl
 
 Review comment:
   Not familiar with cmake, but do you need add indentation here?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369298016
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   > You can also set(ENV{variable_name} value) and get $ENV{variable_name} environment variables, though it is generally a very good idea to avoid them.
   
   https://cliutils.gitlab.io/modern-cmake/chapters/basics/variables.html
   
   Should the env variable checks here be removed? You could just delete these 3 lines.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369304098
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   or is it specifically to do with ENV vars. In that case we can set "" to these 3 variables - NCCL_ROOT, NCCL_INCLUDE_DIR and NCCL_LIB_DIR
   Since that seems to be the norm

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r366166613
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,26 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# if CUDAToolkit_FOUND is not found, try default location
+#if (NOT CUDAToolkit_FOUND)
+  if (UNIX)
+    set (search_paths "/usr/local/cuda")
+
+    find_path(NCCL_INCLUDE_DIRS
+    NAMES nccl.h
+    PATHS ${search_paths}
 
 Review comment:
   Should this be HINTS?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r366185864
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,26 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# if CUDAToolkit_FOUND is not found, try default location
+#if (NOT CUDAToolkit_FOUND)
+  if (UNIX)
+    set (search_paths "/usr/local/cuda")
+
+    find_path(NCCL_INCLUDE_DIRS
+    NAMES nccl.h
+    PATHS ${search_paths}
 
 Review comment:
   https://cmake.org/cmake/help/v3.4/command/find_path.html
   Looks like either Hints or paths is permitted. Any specific reason to use hints?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369329996
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   @ChaiBapchya Yes, for this PR my concern is only that you are introducing environment variable depencies, which is not standard practice and not needed.
   So for this PR, let's just remove the 3 lines.
   For a later PR, NVTX can be refactored.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367575646
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   Ya. Even I saw this NCCL_ROOT and found that NCCL_ROOT_DIR is deprecated hence no longer to be used in PyTorch repo
   https://github.com/pytorch/pytorch/blob/f94aab45fd7b1e7e32d8ea4324da6d3632a0d412/cmake/Modules/FindNCCL.cmake#L21
   
   However, I couldn't find official cmake doc saying so. Nonetheless, I think I should add that too.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r368080668
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -55,6 +67,23 @@ find_library(NCCL_LIBRARIES
   $ENV{NCCL_DIR}/lib
   )
 
+# if not found in any of the above paths, finally, check in the /usr/local/cuda for UNIX systems
+if (UNIX)
 
 Review comment:
   What happens if not UNIX? is there any error/warning message issued?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r368079950
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -55,6 +67,23 @@ find_library(NCCL_LIBRARIES
   $ENV{NCCL_DIR}/lib
   )
 
+# if not found in any of the above paths, finally, check in the /usr/local/cuda for UNIX systems
+if (UNIX)
+  set (search_paths "/usr/local/cuda")
+
+  find_path(NCCL_INCLUDE_DIRS
+   NAMES nccl.h
 
 Review comment:
   Sorry for being picky, but isn't the indent supposed to be 2 space for all?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367715337
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   @ChaiBapchya Thank you!
   
   After reading CMP0074 again,
   > [...] now searches prefixes specified by the `<PackageName>_ROOT` CMake variable and the `<PackageName>_ROOT` environment variable. Package roots are maintained as a stack so nested calls to all find_* commands inside find modules and config packages also search the roots as prefixes.
   
   seems to imply that it's not necessary to list `NCCL_ROOT` as part of `find_path` and `find_library`, but rather it's done automatically. Do you understand it in the same way?
   Also https://cmake.org/cmake/help/latest/command/find_package.html#search-procedure (paragraph "The set of installation prefixes is constructed using the following steps [...]") suggests this.
   
   If so,  `set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")` would be superfluous. Also it seems none of the other Find modules check the `X_LIB_DIR` and `X_LIB_DIR` env variables.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367201643
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   I'm not sure why it breaks the assumption. Can you help me rephrase the comment? I can remove "first" if that's what confuses the users.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367701531
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   Moved /usr/local/cuda check to be after other checks
   Also added NCCL_ROOT alongside NCCL_ROOT_DIR
   Also added deprecation warning for ROOT_DIR 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on issue #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#issuecomment-574796638
 
 
   Verified this works by running following
   Fresh cmake build
   ```
   rm -rf build
   mkdir -p build && cd build
   cmake -GNinja \
       -DUSE_CUDA=ON \
       -DUSE_MKL_IF_AVAILABLE=OFF \
       -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache \
       -DCMAKE_C_COMPILER_LAUNCHER=ccache \
       -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
       -DCMAKE_BUILD_TYPE=Release \
       -DUSE_NCCL=ON \
   ..
   ninja
   ```
   Python binding
   ```
   cd ..
   pip install -e python/.
   python3 -c "from mxnet.runtime import feature_list; print(feature_list())"
   [✔ CUDA, ✔ CUDNN, ✔ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✔ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✔ LAPACK, ✖ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER, ✖ DIST_KVSTORE, ✖ CXX14, ✖ INT64_TENSOR_SIZE, ✔ SIGNAL_HANDLER, ✖ DEBUG, ✖ TVM_OP]
   ```
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369304872
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   @ChaiBapchya let's focus on one thing in a PR. I would change the NVTX_ROOT_DIR in separate PR if needed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r366391503
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,26 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# if CUDAToolkit_FOUND is not found, try default location
+#if (NOT CUDAToolkit_FOUND)
+  if (UNIX)
+    set (search_paths "/usr/local/cuda")
+
+    find_path(NCCL_INCLUDE_DIRS
+    NAMES nccl.h
+    PATHS ${search_paths}
 
 Review comment:
   When writing the comment I thought PATHS may skip default locations whereas HINTS would be defaults + specified locations. Thanks for pointing out they are equivalent.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367577696
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   > If there is some nccl.h at /usr/local/cuda but the user specifies NCCL_ROOT_DIR=/home/ubuntu/myversionofnccl, the /usr/local/cuda is wrongly used? Or do I misunderstand the doc?
   
   I agree, since "search" will not be repeated until var is set, it makes sense to assume whatever location is found first is final. So in that case first check has to be env var and second for default.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r369303550
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -31,7 +31,13 @@
 # install NCCL in the same location as the CUDA toolkit.
 # See https://github.com/caffe2/caffe2/issues/1601
 
-set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
+set(NCCL_ROOT $ENV{NCCL_ROOT} CACHE PATH "Folder contains NVIDIA NCCL")
 
 Review comment:
   So I am guessing I should remove it here too?
   https://github.com/ChaiBapchya/incubator-mxnet/blob/a9abba5a934755813481e6583435298a0c317957/cmake/Modules/FindNVTX.cmake#L18
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu commented on issue #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#issuecomment-575570006
 
 
   Not sure. I restarted the job to make sure it's not an intermittent issue. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367578134
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   Surprisingly, I couldn't find a check for `/usr/local/cuda` in case of pytorch's FindNCCL
   I have created a discuss question on their forum anyway - https://discuss.pytorch.org/t/findnccl-cmake-search-for-usr-local-cuda/66891

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on issue #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on issue #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#issuecomment-575970030
 
 
   @ChaiBapchya Please fix the indent problem as I pointed out. Otherwise, this PR is good to go for me.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on issue #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on issue #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#issuecomment-574798885
 
 
   @mxnet-label-bot add [pr-awaiting-review]

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r367813676
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,23 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# first check in the /usr/local/cuda before other paths
 
 Review comment:
   1. NCCL_ROOT auto check - I guess it is done automatically. Though it's not mentioned explicitly anywhere. Hence asked here - https://stackoverflow.com/questions/59783425/does-cmake-find-path-find-library-search-for-packagename-root-by-default
   
   2. 
   > Also it seems none of the other Find modules check the X_LIB_DIR and X_LIB_DIR env variables.
   Not sure what you implying? You think that is superfluous too? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue

Posted by GitBox <gi...@apache.org>.
apeforest commented on a change in pull request #17297: Fix NCCL Cmake autodetect issue
URL: https://github.com/apache/incubator-mxnet/pull/17297#discussion_r366134308
 
 

 ##########
 File path: cmake/Modules/FindNCCL.cmake
 ##########
 @@ -33,6 +33,26 @@
 
 set(NCCL_ROOT_DIR "" CACHE PATH "Folder contains NVIDIA NCCL")
 
+# if CUDAToolkit_FOUND is not found, try default location
+#if (NOT CUDAToolkit_FOUND)
 
 Review comment:
   Remove?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services