Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/04/04 16:45:44 UTC

[GitHub] [incubator-mxnet] r2d3 commented on issue #17826: [quetions] Check failed: e == cudaSuccess: CUDA: initialization error

r2d3 commented on issue #17826: [quetions] Check failed: e == cudaSuccess: CUDA: initialization error
URL: https://github.com/apache/incubator-mxnet/issues/17826#issuecomment-609056307
 
 
   Hi @rondogency and @Rainweic,
   
   Even with an `import mxnet` in each process, I get a similar issue.
   
   For example, https://github.com/aws-samples/parallelize-ml-inference
   creates a `global model` in `init_worker` and does not do `import mxnet`
   in each process. I modified the code to do the import in `init_worker`,
   but I still get the CUDA error:
   
   `  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1776, in simple_bind
       ctypes.byref(exe_handle)))
     File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/base.py", line 255, in check_call
       raise MXNetError(py_str(_LIB.MXGetLastError()))
   mxnet.base.MXNetError: [16:40:00] src/engine/./../common/cuda_utils.h:379: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: CUDA: initialization error`
   
   https://github.com/aws-samples/parallelize-ml-inference/issues/3
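   
   For reference, here is a minimal sketch of the per-worker initialization pattern discussed above, combined with the `spawn` start method (which avoids forking a parent process that may already hold a CUDA context). The checkpoint name, input shape, and pool size are illustrative, not the actual sample code:
   
   ```python
   import multiprocessing as mp
   
   model = None  # populated per worker by init_worker
   
   def init_worker():
       # Importing mxnet and touching the GPU only inside the worker means the
       # CUDA context belongs to the child process, not an inherited parent context.
       global model
       import mxnet as mx
       # Illustrative checkpoint prefix and epoch.
       sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-18', 0)
       model = mx.mod.Module(symbol=sym, context=mx.gpu(0), label_names=None)
       model.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
       model.set_params(arg_params, aux_params, allow_missing=True)
   
   if __name__ == '__main__':
       # 'spawn' starts fresh interpreters instead of forking a process that
       # may already have initialized CUDA, which is the usual trigger for
       # "CUDA: initialization error" in child processes.
       ctx = mp.get_context('spawn')
       with ctx.Pool(processes=2, initializer=init_worker) as pool:
           pass  # submit inference tasks to the pool here
   ```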
   
   Regards
   
   David
