Posted to discuss-archive@mxnet.apache.org by Karl via MXNet Forum <mx...@discoursemail.com.INVALID> on 2020/08/04 03:30:10 UTC

[MXNet Forum] [Gluon] An illegal memory access


I have been using MXNet (1.6.0) for face recognition, but it unexpectedly reported the following error after 2 epochs of otherwise normal training:
 ```
Traceback (most recent call last):
  File "train_0723.py", line 455, in <module>
    main()
  File "train_0723.py", line 451, in main
    train_net(args)
  File "train_0723.py", line 445, in train_net
    epoch_end_callback=epoch_cb)
  File "/home/user1/recognition/parall_module_local_v1_gluon_group.py", line 573, in fit
    self.update()
  File "/home/user1/recognition/parall_module_local_v1_gluon_group.py", line 406, in update
    mx.nd.waitall()
  File "/home/user1/miniconda3/lib/python3.7/site-packages/mxnet/ndarray/ndarray.py", line 200, in waitall
    check_call(_LIB.MXNDArrayWaitAll())
  File "/home/user1/miniconda3/lib/python3.7/site-packages/mxnet/base.py", line 255, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [03:32:38] /home/ubuntu/mxnet-distro/mxnet-build/3rdparty/mshadow/mshadow/./stream_gpu-inl.h:62: Check failed: e == cudaSuccess: CUDA: an illegal memory access was encountered
Stack trace:
  [bt] (0) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x6b41eb) [0x7f76131a51eb]
  [bt] (1) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37b2742) [0x7f76162a3742]
  [bt] (2) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37e3515) [0x7f76162d4515]
  [bt] (3) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37bf6d1) [0x7f76162b06d1]
  [bt] (4) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37c2c10) [0x7f76162b3c10]
  [bt] (5) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37c2ea6) [0x7f76162b3ea6]
  [bt] (6) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37bde84) [0x7f76162aee84]
  [bt] (7) /home/user1/miniconda3/bin/../lib/libstdc++.so.6(+0xc8421) [0x7f76aca9d421]
  [bt] (8) /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f76bb1f0609]
```

 
I haven't found any clue to solving this error after googling. The only change I have made is decreasing my batch_size from 400 to 360, and I am not sure whether the error will occur again... I'm still worried about that :frowning:





---
[Visit Topic](https://discuss.mxnet.io/t/an-illegal-memory-access/6461/1) or reply to this email to respond.


[MXNet Forum] [Gluon] An illegal memory access

Posted by Triston via MXNet Forum <mx...@discoursemail.com.INVALID>.

@Karl Do you have a repro script?
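
While putting a repro together, it may help to narrow down where the failure comes from. MXNet runs operators asynchronously, which is likely why the traceback above only surfaces at `mx.nd.waitall()` rather than at the operator that failed. Below is a minimal sketch, not a confirmed fix, that re-runs a training script with MXNet's `NaiveEngine` (via `MXNET_ENGINE_TYPE`) and with `CUDA_LAUNCH_BLOCKING=1`, so that the error is more likely to be reported at the offending call. The helper names `synchronous_debug_env` and `rerun_synchronously` are hypothetical and made up for illustration.

```python
import os
import subprocess
import sys


def synchronous_debug_env(base_env=None):
    """Return a copy of the environment with synchronous-execution settings.

    Hypothetical helper for illustration; it does not modify os.environ.
    """
    env = dict(os.environ if base_env is None else base_env)
    # NaiveEngine makes MXNet run operators synchronously, so an error is
    # raised near the operator that caused it instead of at a later wait.
    env["MXNET_ENGINE_TYPE"] = "NaiveEngine"
    # Makes CUDA kernel launches blocking, so a failing kernel is reported
    # at launch time rather than at a later synchronization point.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_synchronously(script, script_args=()):
    """Re-run a training script in a child process with the debug settings."""
    cmd = [sys.executable, script, *script_args]
    return subprocess.run(cmd, env=synchronous_debug_env()).returncode
```

For the script in the traceback this would be called as `rerun_synchronously("train_0723.py")`, with any command-line arguments passed as the second parameter. Expect training to be much slower in this mode, so it is only meant for locating the failing operator, not for regular runs.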





---
[Visit Topic](https://discuss.mxnet.io/t/an-illegal-memory-access/6461/2) or reply to this email to respond.
