Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/21 01:30:55 UTC
[GitHub] [incubator-mxnet] ChaiBapchya opened a new issue #18764: Horovod issue with stable PyPi mxnet versions 1.6.0cu102
ChaiBapchya opened a new issue #18764:
URL: https://github.com/apache/incubator-mxnet/issues/18764
## Description
Importing horovod fails with an undefined symbol error when using the stable release of mxnet from PyPI.
Related to https://github.com/apache/incubator-mxnet/issues/16193
### Error Message
```
python example/distributed_training-horovod/resnet50_imagenet.py
Traceback (most recent call last):
  File "example/distributed_training-horovod/resnet50_imagenet.py", line 25, in <module>
    import horovod.mxnet as hvd
  File "/home/ubuntu/incubator-mxnet/mx_stable_pypi_cu102/lib/python3.7/site-packages/horovod/mxnet/__init__.py", line 25, in <module>
    from horovod.mxnet.mpi_ops import allgather
  File "/home/ubuntu/incubator-mxnet/mx_stable_pypi_cu102/lib/python3.7/site-packages/horovod/mxnet/mpi_ops.py", line 29, in <module>
    _basics = _HorovodBasics(__file__, 'mpi_lib')
  File "/home/ubuntu/incubator-mxnet/mx_stable_pypi_cu102/lib/python3.7/site-packages/horovod/common/basics.py", line 27, in __init__
    self.MPI_LIB_CTYPES = ctypes.CDLL(full_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/ubuntu/anaconda3/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/ubuntu/incubator-mxnet/mx_stable_pypi_cu102/lib/python3.7/site-packages/horovod/mxnet/mpi_lib.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN5mxnet10CopyFromToERKNS_7NDArrayEPS1_i
```
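The `OSError` in the traceback above has the usual dynamic-loader shape: the shared object that failed to load, then the mangled C++ name it could not resolve (here a name in the `mxnet` namespace, `CopyFromTo`). A minimal sketch of how one might pull those two pieces apart and then ask whether a given library exports the symbol is shown below. The helper names `parse_undefined_symbol` and `exports_symbol` are hypothetical and not part of horovod or mxnet; only the standard-library `re` and `ctypes` modules are used.

```python
import ctypes
import re

# Matches the loader message format seen in the traceback:
#   "<path to shared object>: undefined symbol: <mangled name>"
_UNDEFINED = re.compile(r"^(?P<library>.+): undefined symbol: (?P<symbol>\S+)$")


def parse_undefined_symbol(message):
    """Return (library_path, symbol) from a loader error message, or None."""
    match = _UNDEFINED.match(message.strip())
    if match is None:
        return None
    return match.group("library"), match.group("symbol")


def exports_symbol(library_path, symbol):
    """Report whether the library (or one of its dependencies) provides symbol.

    Hypothetical helper: ctypes.CDLL raises OSError if the library itself
    cannot be loaded, and attribute lookup on the handle fails with
    AttributeError when the symbol cannot be resolved.
    """
    lib = ctypes.CDLL(library_path)
    return hasattr(lib, symbol)
```

With the symbol extracted from the message, `exports_symbol` could be pointed at the `libmxnet.so` shipped in the installed wheel. Its location is an assumption to verify locally; `mxnet.libinfo.find_lib_path()` may report it. A `False` result would suggest that the installed mxnet binary does not provide the function horovod's extension was built against.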
### Steps to reproduce
```
virtualenv -p python3 mx16cu101
source mx16cu101/bin/activate
pip install mxnet-cu101==1.6.0
pip install gluoncv
pip install horovod
cd incubator-mxnet/
python example/distributed_training-horovod/resnet50_imagenet.py
```
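The steps above install `mxnet-cu101`, while the paths in the traceback mention cu102, so it may help to record which distributions the active virtualenv actually contains. The following is only a sketch: `installed_versions` is a hypothetical helper name, and it assumes Python 3.8+ for `importlib.metadata` (on Python 3.7 the `importlib_metadata` backport offers the same interface).

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names taken from the pip commands in the steps above.
    report = installed_versions(["mxnet-cu101", "horovod", "gluoncv"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output alongside the traceback would show which mxnet build and which horovod version were combined when the import failed.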
## What have you tried to solve it?
1. Tried nightly releases from PyPI
2. Tried releases [stable & nightly] from repo.mxnet.io
3. Tried non-mkl & mkl releases from PyPI as well as repo.mxnet.io
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-mxnet] szha commented on issue #18764: Horovod issue with stable PyPi mxnet versions 1.6.0cu102
Posted by GitBox <gi...@apache.org>.
szha commented on issue #18764:
URL: https://github.com/apache/incubator-mxnet/issues/18764#issuecomment-661557773
> Tried nightly releases from PyPi
> Tried releases [stable & nightly] from repo.mxnet.io
> Tried non mkl & mkl releases for PyPi as well as repo.mxnet.io
What are the results?
[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #18764: Horovod issue with stable PyPi mxnet versions 1.6.0cu102
Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on issue #18764:
URL: https://github.com/apache/incubator-mxnet/issues/18764#issuecomment-662262347
Same error
```
OSError: /home/ubuntu/incubator-mxnet/mx_stable_pypi_cu102/lib/python3.7/site-packages/horovod/mxnet/mpi_lib.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN5mxnet10CopyFromToERKNS_7NDArrayEPS1_i
```
[GitHub] [incubator-mxnet] szha commented on issue #18764: Horovod issue with stable PyPi mxnet versions 1.6.0cu102
Posted by GitBox <gi...@apache.org>.
szha commented on issue #18764:
URL: https://github.com/apache/incubator-mxnet/issues/18764#issuecomment-662832752
It still seems to be the case in 2.0; see #18772.
[GitHub] [incubator-mxnet] leezu commented on issue #18764: Horovod issue with stable PyPi mxnet versions 1.6.0cu102
Posted by GitBox <gi...@apache.org>.
leezu commented on issue #18764:
URL: https://github.com/apache/incubator-mxnet/issues/18764#issuecomment-662839367
That's a separate problem. @eric-haibin-lin mentioned that the problem does not apply to the 1.x nightly build.