Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/01/20 14:39:03 UTC
[GitHub] OElesin opened a new issue #13942: Error using executing MXNet DataLoader
URL: https://github.com/apache/incubator-mxnet/issues/13942
When iterating over the DataLoader I get the error shown below. The same code works on my MacBook and on AWS SageMaker, but fails when I run it on AWS Batch (which runs jobs in Docker containers). See the code below:
```python
import multiprocessing
import time

from mxnet.gluon.data import DataLoader

# dataset, net, ctx, features, BATCH_SIZE, n_print and tick are defined earlier in the script.
data_loader = DataLoader(
    dataset, batch_size=BATCH_SIZE, last_batch='keep',
    shuffle=False, num_workers=multiprocessing.cpu_count()
)
for i, (data, label) in enumerate(data_loader):
    data = data.as_in_context(ctx)
    # Report throughput every n_print batches, then restart the timer.
    if i % n_print == 0 and i > 0:
        print(
            "{0} batches, {1} images, {2:.3f} img/sec".format(
                i, i*BATCH_SIZE, BATCH_SIZE*n_print/(time.time()-tick)
            )
        )
        tick = time.time()
    output = net(data)
    features[i * BATCH_SIZE:(i+1)*max(BATCH_SIZE, len(output)), :] = output.asnumpy().squeeze()
```
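A side note on the last line of the loop, separate from the error reported here: with `last_batch='keep'` the final batch can be shorter than `BATCH_SIZE`, and since no batch is longer than `BATCH_SIZE`, `max(BATCH_SIZE, len(output))` always evaluates to `BATCH_SIZE`. The slice end therefore overshoots on the final batch and, assuming `features` is a NumPy array sized to the dataset, only works because slices are clipped to the array length. A minimal sketch of a more explicit row range follows; `batch_rows` is a hypothetical helper name, not part of MXNet.

```python
def batch_rows(i, batch_size, n_out):
    """Row range written by batch `i` (hypothetical helper).

    Every batch before `i` holds exactly `batch_size` rows, so batch `i`
    starts at `i * batch_size` and covers the `n_out` rows it actually produced.
    """
    start = i * batch_size
    return slice(start, start + n_out)


# Illustration with a plain list standing in for the feature matrix:
rows = list(range(10))                          # 10 samples, batch size 4
last = rows[batch_rows(2, 4, 2)]                # third batch holds only 2 samples
```

In the loop above this would be used as `features[batch_rows(i, BATCH_SIZE, len(output)), :] = ...`.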
Error message:
```python
    save(x)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/multiprocessing/forking.py", line 66, in dispatcher
    rv = reduce(obj)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/gluon/data/dataloader.py", line 43, in reduce_ndarray
    return rebuild_ndarray, data._to_shared_mem()
  File "/usr/local/lib/python2.7/dist-packages/mxnet/ndarray/ndarray.py", line 200, in _to_shared_mem
    self.handle, ctypes.byref(shared_pid), ctypes.byref(shared_id)))
  File "/usr/local/lib/python2.7/dist-packages/mxnet/base.py", line 149, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
MXNetError: [14:48:14] src/operator/tensor/../tensor/elemwise_unary_op.h:301: Check failed: inputs[0].dptr_ == outputs[0].dptr_ (0x7fe0beffc040 vs. 0x7fe0bf001600)
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x17ec9d) [0x7fe11ec74c9d]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x17f068) [0x7fe11ec75068]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x8f7034) [0x7fe11f3ed034]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2825020) [0x7fe12131b020]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x27a3ad8) [0x7fe121299ad8]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x27a3b13) [0x7fe121299b13]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x27ab954) [0x7fe1212a1954]
[bt] (7) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x27af461) [0x7fe1212a5461]
[bt] (8) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x27ac01b) [0x7fe1212a201b]
[bt] (9) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7fe130197c80]
```
Anyone with any ideas?
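One way to narrow this down, offered as a sketch and not as a confirmed diagnosis: the traceback fails inside `_to_shared_mem` while a worker process pickles an NDArray, and Docker containers get a 64 MB `/dev/shm` by default unless `--shm-size` is set. Whether that is the cause here is an assumption. The snippet below uses only the standard library to report how much shared memory the container has; `shared_memory_report` and `pick_num_workers` are hypothetical helper names, and the fallback threshold is left for the caller to choose.

```python
import os


def shared_memory_report(path="/dev/shm"):
    """Return (total_bytes, free_bytes) for the filesystem at `path`,
    or None if the path does not exist (for example on non-Linux hosts)."""
    if not os.path.exists(path):
        return None
    stats = os.statvfs(path)
    total = stats.f_frsize * stats.f_blocks
    free = stats.f_frsize * stats.f_bavail
    return total, free


def pick_num_workers(requested, min_free_bytes, path="/dev/shm"):
    """Hypothetical helper: return 0 workers when free shared memory is
    below `min_free_bytes`, otherwise return `requested`.

    With num_workers=0 the DataLoader loads batches in the main process,
    so no NDArray is passed between processes through shared memory.
    """
    report = shared_memory_report(path)
    if report is None:
        # Nothing to measure; keep the caller's setting.
        return requested
    _, free = report
    return 0 if free < min_free_bytes else requested
```

Printing `shared_memory_report()` inside the AWS Batch container and comparing it with the SageMaker environment would show whether the two differ. Running once with `num_workers=0` is also a quick check: if the error disappears, the problem is in the multiprocessing transfer path and not in the dataset or the network.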