Posted to discuss-archive@mxnet.apache.org by Karl via MXNet Forum <mx...@discoursemail.com.INVALID> on 2020/08/04 03:30:10 UTC
[MXNet Forum] [Gluon] An illegal memory access
I have been using MXNet (1.6.0) for face recognition, but it unexpectedly reported an error after 2 epochs of otherwise normal training:
```
Traceback (most recent call last):
File "train_0723.py", line 455, in <module>
main()
File "train_0723.py", line 451, in main
train_net(args)
File "train_0723.py", line 445, in train_net
epoch_end_callback=epoch_cb)
File "/home/user1/recognition/parall_module_local_v1_gluon_group.py", line 573, in fit
self.update()
File "/home/user1/recognition/parall_module_local_v1_gluon_group.py", line 406, in update
mx.nd.waitall()
File "/home/user1/miniconda3/lib/python3.7/site-packages/mxnet/ndarray/ndarray.py", line 200, in waitall
check_call(_LIB.MXNDArrayWaitAll())
File "/home/user1/miniconda3/lib/python3.7/site-packages/mxnet/base.py", line 255, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [03:32:38] /home/ubuntu/mxnet-distro/mxnet-build/3rdparty/mshadow/mshadow/./stream_gpu-inl.h:62: Check failed: e == cudaSuccess: CUDA: an illegal memory access was encountered
Stack trace:
[bt] (0) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x6b41eb) [0x7f76131a51eb]
[bt] (1) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37b2742) [0x7f76162a3742]
[bt] (2) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37e3515) [0x7f76162d4515]
[bt] (3) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37bf6d1) [0x7f76162b06d1]
[bt] (4) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37c2c10) [0x7f76162b3c10]
[bt] (5) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37c2ea6) [0x7f76162b3ea6]
[bt] (6) /home/user1/miniconda3/lib/python3.7/site-packages/mxnet/libmxnet.so(+0x37bde84) [0x7f76162aee84]
[bt] (7) /home/user1/miniconda3/bin/../lib/libstdc++.so.6(+0xc8421) [0x7f76aca9d421]
[bt] (8) /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f76bb1f0609]
```
I haven't found any clue for solving this error after googling. The only change I made was to decrease my batch_size from 400 to 360, and I'm not sure whether the error will occur again... I'm still worried about that :frowning:
---
[Visit Topic](https://discuss.mxnet.io/t/an-illegal-memory-access/6461/1) or reply to this email to respond.
You are receiving this because you enabled mailing list mode.
To unsubscribe from these emails, [click here](https://discuss.mxnet.io/email/unsubscribe/c6cfb36eec3e0673a9c5007fd1e32510ca31bd6f1db6cf0280979ba588b3eeb8).
[MXNet Forum] [Gluon] An illegal memory access
Posted by Triston via MXNet Forum <mx...@discoursemail.com.INVALID>.
@Karl Do you have a repro script?
---
[Visit Topic](https://discuss.mxnet.io/t/an-illegal-memory-access/6461/2) or reply to this email to respond.