Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/31 21:54:28 UTC

[GitHub] [incubator-mxnet] stu1130 opened a new issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

stu1130 opened a new issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834


   ## Steps to reproduce
   * Build mxnet-cu101 from source, based on the mxnet 1.7.0 branch
   * gluoncv 0.7
   ```python
   from gluoncv import model_zoo, data, utils
   from matplotlib import pyplot as plt
   import mxnet as mx
   from mxnet import gluon

   # Load the pretrained model on the GPU and hybridize it.
   net = model_zoo.get_model('yolo3_darknet53_coco', pretrained=True, ctx=mx.gpu())
   net.hybridize()
   # Run one forward pass, then export the hybridized model.
   x = mx.nd.random.uniform(shape=(1, 3, 1000, 1000), ctx=mx.gpu())
   _, scores, _ = net(x)
   print(scores.shape)
   net.export("yolo")  # writes yolo-symbol.json and yolo-0000.params

   # Reload the exported symbol and parameters, then run inference again.
   deserialized_net = gluon.nn.SymbolBlock.imports("yolo-symbol.json", ['data'], "yolo-0000.params", ctx=mx.gpu())
   image = mx.nd.random.normal(shape=(1, 3, 1000, 1000), ctx=mx.gpu())
   print(deserialized_net(image))
   ```
   ```
   CUDA: Check failed: e == cudaSuccess: an illegal memory access was encountered
   [21:53:46] src/resource.cc:279: Ignore CUDA Error [21:53:46] src/storage/./pooled_storage_manager.h:97: CUDA: an illegal memory access was encountered
   
   
   [21:53:46] src/resource.cc:331: Ignore CUDA Error [21:53:46] src/common/random_generator.cu:70: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: CUDA: an illegal memory access was encountered
   ```
   It works fine when I use the mxnet-cu101 1.6 pip wheel.
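
Because CUDA kernels are launched asynchronously, an error like the one above may be reported far from the operator that caused it. A minimal sketch, assuming the reproduction script is saved as `demo_yolo.py` (that file name is an assumption), of re-running it with synchronous execution so the failure surfaces closer to its source. `CUDA_LAUNCH_BLOCKING` is a CUDA runtime variable and `MXNET_ENGINE_TYPE=NaiveEngine` is an MXNet one; the helper names `debug_env` and `rerun_synchronously` are illustrative only.

```python
import os
import subprocess
import sys


def debug_env(base=None):
    """Return a copy of the environment with synchronous-execution switches set."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"         # make CUDA kernel launches blocking
    env["MXNET_ENGINE_TYPE"] = "NaiveEngine"  # run MXNet operators one at a time
    return env


def rerun_synchronously(script="demo_yolo.py"):
    """Run the script in a fresh interpreter with the debug environment.

    The script name is a placeholder; returns the child's exit code.
    """
    return subprocess.run([sys.executable, script], env=debug_env()).returncode
```

Calling `rerun_synchronously()` on the affected machine should be slower than a normal run, but the first reported error is more likely to point at the failing operator.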
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] stu1130 removed a comment on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
stu1130 removed a comment on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668860415


   @szha sure but I ran into this error on master branch
   ```
   /home/ubuntu/mxnet_master/tools/dependencies/openblas.sh: line 35: patchelf: command not found
   ```
   This is the command I used
   ```
   tools/staticbuild/build.sh cu101
   ```





[GitHub] [incubator-mxnet] leezu edited a comment on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
leezu edited a comment on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668865520


   @stu1130 how about installing patchelf? Just run `apt install patchelf`





[GitHub] [incubator-mxnet] troyliu0105 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
troyliu0105 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-753314465


   @szha
   I built v1.8.0 and v1.7.0 to reproduce this issue, and I can confirm that it still happens.
   But it works fine when I use the prebuilt 1.6.0.post0 package from PyPI.
   
   I have the same issue in my own code. With `hybridize` turned on, it produces this error; with it turned off, everything works fine.
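
The on/off comparison described above can be sketched as a small harness. This is only an illustration, not the code that was actually run: `compare_modes` and `yolo_forward` are hypothetical names, and `yolo_forward` simply reuses the calls from the original report. The non-hybridized run goes first because an illegal memory access usually leaves the process's CUDA context unusable, so nothing executed after the failure can be trusted.

```python
def compare_modes(run_once):
    """Call run_once(hybridize) for False, then True, and record each outcome."""
    results = {}
    for hybridize in (False, True):
        try:
            run_once(hybridize)
            results[hybridize] = "ok"
        except Exception as exc:
            # Assumption: the CUDA failure surfaces as a Python exception.
            results[hybridize] = "failed: " + type(exc).__name__
    return results


def yolo_forward(hybridize):
    """One forward pass of the model from the report (needs mxnet, gluoncv, a GPU)."""
    import mxnet as mx
    from gluoncv import model_zoo

    net = model_zoo.get_model('yolo3_darknet53_coco', pretrained=True, ctx=mx.gpu())
    if hybridize:
        net.hybridize()
    x = mx.nd.random.uniform(shape=(1, 3, 1000, 1000), ctx=mx.gpu())
    net(x)
    mx.nd.waitall()  # block until queued GPU work finishes so errors surface here
```

On a GPU machine, `compare_modes(yolo_forward)` returns a dictionary keyed by the `hybridize` flag. Running each mode in its own process would be a stricter variant of the same idea.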





---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For additional commands, e-mail: issues-help@mxnet.apache.org


[GitHub] [incubator-mxnet] troyliu0105 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
troyliu0105 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-753721743


   @szha I built the most recent master branch. Because gluoncv is not compatible with mxnet 2.0 (it fails the version assert), I used the previously exported weights to test, like below:
   
   ```python
   from gluoncv import model_zoo, data, utils
   from matplotlib import pyplot as plt
   import mxnet as mx
   from mxnet import gluon
   deserialized_net = gluon.nn.SymbolBlock.imports("yolo-symbol.json", ['data'], "yolo-0000.params", ctx=mx.gpu())
   image = mx.nd.random.normal(shape=(1, 3, 1000, 1000), ctx=mx.gpu())
   print(deserialized_net(image))
   ```
   
   and this issue still occurs.
   <img width="1388" alt="image" src="https://user-images.githubusercontent.com/5518286/103495301-4757ab00-4e75-11eb-9a1a-b8cf2f6f9b0f.png">
   




[GitHub] [incubator-mxnet] Zha0q1 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-810597490


   @waytrue17 and I also had this issue with mxnet 1.7




[GitHub] [incubator-mxnet] shesung commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
shesung commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-983275103


   This issue still exists in mxnet-cu102==1.8.0.post0




[GitHub] [incubator-mxnet] stu1130 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
stu1130 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668860415


   @szha sure but I ran into this error on master branch
   ```
   /home/ubuntu/mxnet_master/tools/dependencies/openblas.sh: line 35: patchelf: command not found
   ```
   This is the command I used
   ```
   tools/staticbuild/build.sh cu101
   ```





[GitHub] [incubator-mxnet] leezu commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-753681847


   OK. I thought you were talking about the `ImportError: cannot import name 'nd'` issue described above.




[GitHub] [incubator-mxnet] shesung edited a comment on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
shesung edited a comment on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-983275103


   This issue still exists in mxnet-cu102==1.8.0.post0 and mxnet-cu102==1.7.0.post1.
   
   It is OK with mxnet-cu102==1.6.0.post0.
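
The version reports scattered through this thread can be collected into a small lookup. A sketch only: `REPORTED` and `reported_status` are hypothetical names, the table holds nothing beyond what commenters here reported for the 1.6, 1.7 and 1.8 series, and any release not mentioned comes back as "unknown". Master builds are left out because they are not tied to a release number.

```python
import re

# Outcomes reported in this thread, keyed by (major, minor) release series.
REPORTED = {
    (1, 6): "works",
    (1, 7): "fails",
    (1, 8): "fails",
}


def reported_status(version):
    """Map a version string such as '1.7.0.post1' to the outcome reported here."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return "unknown"
    key = (int(match.group(1)), int(match.group(2)))
    return REPORTED.get(key, "unknown")
```

For example, `reported_status(mx.__version__)` could be used as a guard before hybridizing the model, with the caveat that the table reflects user reports rather than a confirmed root cause.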




[GitHub] [incubator-mxnet] troyliu0105 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
troyliu0105 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-753327925


   @leezu actually, I created a separate Anaconda env and a separate copy of the source directory for each build, so I think they are isolated.




[GitHub] [incubator-mxnet] troyliu0105 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
troyliu0105 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-753577119


   @leezu 😂, that's not the issue I encountered. I think you confused me with the person who opened this issue.




[GitHub] [incubator-mxnet] nicklhy commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
nicklhy commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-774835060


   Any updates? I got the same problem with mxnet 1.7.




[GitHub] [incubator-mxnet] leezu commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-753323646


   @troyliu0105 it just means that you managed to corrupt your python profile. Try uninstalling all mxnet packages (you probably have multiple installed?) and installing again.
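
Whether several MXNet distributions are installed side by side can be checked from Python. A minimal sketch, assuming Python 3.8 or later for `importlib.metadata`; the helper names are illustrative, and matching on the `mxnet` name prefix (to catch variants such as `mxnet-cu101`) is an assumption about how the packages are named.

```python
from importlib import metadata


def mxnet_like(names):
    """Filter distribution names down to those that start with 'mxnet'."""
    return sorted({n.lower() for n in names if n and n.lower().startswith("mxnet")})


def installed_mxnet_distributions():
    """List installed distributions whose name starts with 'mxnet'."""
    return mxnet_like(dist.metadata["Name"] for dist in metadata.distributions())
```

More than one entry in the result of `installed_mxnet_distributions()` would suggest overlapping installs in the active environment.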




[GitHub] [incubator-mxnet] stu1130 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
stu1130 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668867727


   @leezu thanks! It works





[GitHub] [incubator-mxnet] szha commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
szha commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668826305


   @stu1130 could you help verify if this still happens with the master branch?





[GitHub] [incubator-mxnet] szha commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
szha commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668941590


   That is surprising as I can still import nd from a build on the master branch.





[GitHub] [incubator-mxnet] leezu commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668865520


   @stu1130 how about installing patchelf?





[GitHub] [incubator-mxnet] leezu commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-753333169


   Sorry, but since you are experiencing the import issue, it means that your environment is broken. You can try running `pip install -e .` from the `incubator-mxnet/python` directory.




[GitHub] [incubator-mxnet] stu1130 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
stu1130 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668934067


   It seems gluoncv 0.7 doesn't work with mxnet 2.0. Is gluoncv 0.7 the latest version I can get?
   ```
   Traceback (most recent call last):
     File "demo_yolo.py", line 10, in <module>
       from gluoncv import model_zoo, data, utils
     File "/home/ubuntu/.local/lib/python3.5/site-packages/gluoncv/__init__.py", line 8, in <module>
       from . import data
     File "/home/ubuntu/.local/lib/python3.5/site-packages/gluoncv/data/__init__.py", line 4, in <module>
       from . import transforms
     File "/home/ubuntu/.local/lib/python3.5/site-packages/gluoncv/data/transforms/__init__.py", line 5, in <module>
       from . import image
     File "/home/ubuntu/.local/lib/python3.5/site-packages/gluoncv/data/transforms/image.py", line 6, in <module>
       from mxnet import nd
   ImportError: cannot import name 'nd'
   ```





[GitHub] [incubator-mxnet] leezu edited a comment on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
leezu edited a comment on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668865520


   @stu1130 how about installing patchelf? Just run `apt install patchelf` as per the documentation in https://github.com/apache/incubator-mxnet/tree/master/tools/staticbuild
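
A missing build tool like this can be caught before starting the static build. A minimal sketch using `shutil.which`; `missing_tools` is a hypothetical helper, and the tool list contains only `patchelf` because that is the only prerequisite named in this thread.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["patchelf"])
    if missing:
        print("Install before building:", ", ".join(missing))
    else:
        print("All required tools found.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default is enough.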





[GitHub] [incubator-mxnet] waytrue17 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
waytrue17 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-810598885


   I can confirm that the issue only occurs with `yolo3_darknet53_coco`, not with `yolo3_darknet53_voc`.




[GitHub] [incubator-mxnet] stu1130 commented on issue #18834: CUDA: an illegal memory access was encountered on hybridized yolo model

Posted by GitBox <gi...@apache.org>.
stu1130 commented on issue #18834:
URL: https://github.com/apache/incubator-mxnet/issues/18834#issuecomment-668863096


   @szha sure but when I build mxnet from source I ran into the error.
   ```
   +++ patchelf --set-rpath '$ORIGIN' --force-rpath libopenblas.so
   /home/ubuntu/mxnet_master/tools/dependencies/openblas.sh: line 35: patchelf: command not found
   ```
   This is the command I used.
   ```
   tools/staticbuild/build.sh cu101
   ```

