Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/08/08 02:56:02 UTC
[GitHub] [incubator-mxnet] beew opened a new issue #18879: mxnet python gpu not work
beew opened a new issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879
## Description
When trying to create an NDArray with the GPU context, Python appears to hang.
## To Reproduce
Just create an array:
```
import mxnet as mx
a = mx.nd.ones((2, 3), mx.gpu())
```
Then everything stops as if Python hangs: the process keeps running in the background but produces no output. Without the GPU context, the call returns immediately.
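A minimal sketch of one way to tell a hang apart from a crash, assuming the problem reproduces in a fresh interpreter: run the snippet in a child process and kill it after a timeout. The helper name `probe` and the timeout value in the usage comment are hypothetical and not part of MXNet.

```python
import subprocess
import sys


def probe(code, timeout_s):
    """Run `code` in a fresh Python process and classify the outcome.

    Returns "ok" (exit code 0), "error" (non-zero exit code) or
    "timeout" (still running after `timeout_s` seconds, so it was killed).
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


# The two cases from the report: same array, with and without the GPU context.
GPU_SNIPPET = "import mxnet as mx; a = mx.nd.ones((2, 3), mx.gpu()); print(a)"
CPU_SNIPPET = "import mxnet as mx; a = mx.nd.ones((2, 3)); print(a)"

# Hypothetical usage (requires MXNet to be installed):
#   print(probe(CPU_SNIPPET, 60), probe(GPU_SNIPPET, 60))
```

A "timeout" for the GPU snippet together with "ok" for the CPU snippet would match the behaviour described above; an "error" would point to a different problem, such as a failed import.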
## Environment
We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:
```
curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python
# paste outputs here
```
```
----------Python Info----------
Version : 3.7.8
Compiler : GCC 5.4.0 20160609
Build : ('default', 'Jul 28 2020 15:58:46')
Arch : ('64bit', 'ELF')
------------Pip Info-----------
Version : 20.2.1
Directory : /home/bernard/opt/python37/lib/python3.7/site-packages/pip
----------MXNet Info-----------
Version : 1.7.0
Directory : /home/bernard/opt/python37/lib/python3.7/site-packages/mxnet
```
This is all I have, since the script apparently takes forever to run; it may be hanging too.
System: Ubuntu Linux
GPU: Nvidia GTX 1070
CUDA 10.0, libcudnn 7.6
cmake 3.17
MXNet built from source as follows:
```
source /opt/intel/bin/compilervars.sh intel64
git clone --recursive https://github.com/apache/incubator-mxnet.git
cd incubator-mxnet && git checkout -b v1.7 origin/v1.7.x
git submodule sync
git submodule update --init --recursive
make -j8 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=0 CUDA_ARCH=-gencode=arch=compute_61,code=sm_61 USE_JEMALLOC=1
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-mxnet] sfraczek commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
sfraczek commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-874192033
Hi.
I hope it worked out for you on your main machine. Is this issue still relevant or can it be closed?
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For additional commands, e-mail: issues-help@mxnet.apache.org
[GitHub] [incubator-mxnet] anko-intel commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
anko-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-686552643
Hi @beew, @gzuchow will try to help you with the second issue.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675014980
@anko-intel
Hi,
1. I am sticking to MKL 2018.2.199 because of this issue: https://forums.developer.nvidia.com/t/nvblas-numpy-intelmkl-2018-3-not-work/66090 Not sure if there is any solution at the moment.
2. nvidia-smi doesn't tell you which CUDA version is actually used, but rather the highest CUDA version the driver supports (in this case 10.1). The change from CUDA 10.0 to CUDA 10.1 breaks a lot of things because it changes the names of the lib files and the folder structure: https://github.com/tensorflow/tensorflow/issues/26289 I would have to compile a lot of stuff from scratch and test it, which would take me several days if not weeks, considering that it takes 3 hours just to compile TensorFlow and that I would probably run into other compatibility issues along the way. I don't think it is feasible for me right now; I will wait until next year to upgrade the OS to 20.04. At the moment, using MXNet with OpenBLAS works for me.
Thanks for the help.
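As a rough sketch of how the toolkit version (as opposed to the driver's maximum supported version) could be read, assuming the toolkit under `/usr/local/cuda` ships a `version.txt` whose text looks like `CUDA Version 10.0.130`. Both the file format and the helper name `toolkit_version` are assumptions; some toolkit releases may not ship this file at all.

```python
import re


def toolkit_version(version_txt):
    """Extract (major, minor) from the text of a CUDA version.txt file.

    Returns None when the text does not contain the assumed
    "CUDA Version X.Y..." pattern.
    """
    match = re.search(r"CUDA Version (\d+)\.(\d+)", version_txt)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical usage on a machine with the toolkit installed:
#   with open("/usr/local/cuda/version.txt") as handle:
#       print(toolkit_version(handle.read()))
```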
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675014980
@anko-intel
Hi,
1. I am sticking to MKL 2018.2.199 because of this issue: https://forums.developer.nvidia.com/t/nvblas-numpy-intelmkl-2018-3-not-work/66090 Not sure if there is any solution at the moment.
2. nvidia-smi doesn't tell you which CUDA version is actually used, but rather the highest CUDA version the driver supports (in this case 10.1). The change from CUDA 10 to CUDA 10.1 breaks a lot of things because it changes the names of the lib files and the folder structure: https://github.com/tensorflow/tensorflow/issues/26289 I would have to compile a lot of stuff from scratch and test it, which would take me several days, considering that it takes 3 hours just to compile TensorFlow. I don't think it is feasible for me right now; I will wait until next year to upgrade the OS to 20.04. At the moment, using MXNet with OpenBLAS works for me.
Thanks for the help.
[GitHub] [incubator-mxnet] szha commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
szha commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670988014
@beew thanks for testing these settings, it's really helpful. The mxnet-cu100mkl package doesn't use MKL as its BLAS library; it only includes MKL-DNN. The BLAS library used for that build is OpenBLAS.
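A small sketch of how one might check which BLAS backend an installed MXNet was built with, assuming an MXNet version that provides `mxnet.runtime.Features` with feature names such as `BLAS_MKL`, `BLAS_OPEN` and `MKLDNN`. The helper names `blas_backends` and `installed_blas_backends` are hypothetical.

```python
def blas_backends(is_enabled, names=("BLAS_MKL", "BLAS_OPEN")):
    """Return the BLAS feature names for which `is_enabled(name)` is true."""
    return [name for name in names if is_enabled(name)]


def installed_blas_backends():
    """Query the installed MXNet build (requires MXNet to be importable)."""
    # Imported lazily so that the helper above works without MXNet.
    from mxnet.runtime import Features

    features = Features()
    return {
        "blas": blas_backends(features.is_enabled),
        "mkldnn": features.is_enabled("MKLDNN"),
    }


# Hypothetical usage:
#   print(installed_blas_backends())
```

For a build like the one described above, one would expect `BLAS_OPEN` to be listed and `mkldnn` to be true, but that is an expectation to verify rather than a guaranteed output.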
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675564235
@anko-intel
Yes, I can also set up update-alternatives, just as I do to switch between MKL and OpenBLAS. I will give it another shot in a few days and report back. Thanks.
[GitHub] [incubator-mxnet] szha commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
szha commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670830449
@beew hi Bernard. Sorry to hear that you are facing this issue. Looks like you have the right CUDA_ARCH for your GPU card. Could you try a couple of things to rule out factors such as GPU, compiler, and dependency library? I'd like to know:
- if the pre-compiled binary works on your platform. You can install a nightly build on the v1.x branch from https://dist.mxnet.io/python. You can use this command: `pip3 install --pre 'mxnet-cu100<2' -f https://dist.mxnet.io/python`
- if it works on your build from source with JEMALLOC turned off.
- if it works on your build from source without `source /opt/intel/bin/compilervars.sh intel64`
For the latter two, try varying **only** that option to rule out these specific causes.
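The "vary only one option" idea can be sketched as follows. This is only an illustration: the baseline mirrors a subset of the make flags from the original report, and the helper names `one_factor_variants` and `make_command` are hypothetical.

```python
# Subset of the make options from the original report (paths omitted).
BASELINE = {
    "USE_OPENCV": "1",
    "USE_BLAS": "mkl",
    "USE_CUDA": "1",
    "USE_CUDNN": "0",
    "USE_JEMALLOC": "1",
}


def one_factor_variants(baseline, alternatives):
    """Yield (changed_option, options) pairs that differ from the
    baseline in exactly one option each."""
    for name, value in alternatives.items():
        variant = dict(baseline)
        variant[name] = value
        yield name, variant


def make_command(options, jobs=8):
    """Render an options dict as a make command line."""
    flags = " ".join("{}={}".format(k, v) for k, v in options.items())
    return "make -j{} {}".format(jobs, flags)


if __name__ == "__main__":
    for changed, options in one_factor_variants(BASELINE, {"USE_JEMALLOC": "0"}):
        print(changed, "->", make_command(options))
```

Whether `compilervars.sh` is sourced is an environment change rather than a make flag, so it would be varied in the shell that runs the command.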
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670987830
@szha
Hi,
1. Without `source /opt/intel/bin/compilervars.sh intel64`, building with OpenBLAS instead of MKL (since the MKL root and headers can't be found),
the Python GPU examples work (from here: https://mxnet.apache.org/versions/1.6/get_started/validate_mxnet.html).
2. With `source /opt/intel/bin/compilervars.sh intel64` and building with MKL, the GPU examples hang as reported here, regardless of whether JEMALLOC is turned on or off. If it is turned off, I get an additional warning: "Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages"
So it seems that MKL somehow doesn't play well with CUDA...
I haven't tried the nightly build, as it seems that the prebuilt MXNet Python wheels only support CUDA > 10.0 after version 1.5.1. Instead I tried `pip install mxnet-cu100mkl==1.5.1` and was able to create the MXNet array in the GPU context with no problem.
[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
pengzhao-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-671165617
> cc @PatricZhao whose team likely has the knowledge to find out why MKL build doesn't work in GPU builds.
That sounds weird. We will investigate the issue.
[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via the deb repository) but no CUDA. I built MXNet with the options
```
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=1 USE_JEMALLOC=1
```
It failed with
```
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
```
But building with OpenBLAS was successful.
[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675564235
@anko-intel
Yes, I can also set up update-alternatives, just as I do to switch between MKL and OpenBLAS. I will give it another shot in a few days and report back. Thanks.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670987830
@szha
Hi,
1. Without `source /opt/intel/bin/compilervars.sh intel64`, building with OpenBLAS instead of MKL (since the MKL root and headers can't be found),
the Python GPU examples work (from here: https://mxnet.apache.org/versions/1.6/get_started/validate_mxnet.html).
2. With `source /opt/intel/bin/compilervars.sh intel64` and building with MKL, the GPU examples hang as reported here, regardless of whether JEMALLOC is turned on or off. If it is turned off, I get an additional warning: "Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages"
So it seems that MKL somehow doesn't play well with CUDA...
I haven't tried the nightly build, as it seems that the prebuilt MXNet Python wheels only support CUDA > 10.0 after version 1.5.1. Instead I tried `pip install mxnet-cu100mkl==1.5.1` and was able to create the MXNet array in the GPU context with no problem.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675014980
@anko-intel
Hi,
1. I am sticking to MKL 2018.2.199 because of this issue: https://forums.developer.nvidia.com/t/nvblas-numpy-intelmkl-2018-3-not-work/66090 Not sure if there is any solution at the moment.
2. nvidia-smi doesn't tell you which CUDA version is actually used, but rather the highest CUDA version the driver supports (in this case 10.1). The change from CUDA 10.0 to CUDA 10.1 breaks a lot of things because it changes the names of the lib files and the folder structure: https://github.com/tensorflow/tensorflow/issues/26289 I would have to compile a lot of stuff from scratch and test it, which would take me several days if not weeks, considering that it takes 3 hours just to compile TensorFlow and that I would probably run into other compatibility issues along the way. I don't think it makes sense at this point, since my system just works and is highly optimized and customized. I will wait until next year to upgrade the OS to 20.04. At the moment, using MXNet with OpenBLAS works for me.
Thanks for the help.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-690540912
> Using this tutorial: https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html by default mkl packages are installed in /opt/intel/ location.
No, they are not. It doesn't matter what the documentation says; it is probably outdated. I know where they are installed on my machine. ;) I don't know about intel-python, as I haven't installed that.
[GitHub] [incubator-mxnet] bgawrych commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
bgawrych commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-987627798
Closing due to lack of response and no reproduction. If this issue is still relevant, feel free to reopen.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via the deb repository) and Ubuntu 20.04, but no CUDA. I built MXNet with the options
```
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1 USE_JEMALLOC=1
```
It failed with
```
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
```
But building with USE_BLAS=openblas (keeping all other options unchanged) was successful.
[GitHub] [incubator-mxnet] anko-intel commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
anko-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-673610353
Hi @beew,
I have tried to reproduce the issue locally on similar hardware (GeForce GTX 1060 6GB, Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz) and software (Ubuntu 18.04, Kernel: 4.15.0-88, Nvidia driver 410.48 , CUDA 10.0.130) but unfortunately I couldn't (the test works for me).
Could you paste the MXNet commit SHA and MKL version or (better) attach the info.txt file produced by the following script? Run it from the MXNet home directory, in a shell where /opt/intel/bin/compilervars.sh has not been sourced yet:
```
INFO_FILE=info.txt
git rev-parse --verify HEAD >${INFO_FILE}
env >>${INFO_FILE}
echo -------------------------------- >>${INFO_FILE}
source /opt/intel/bin/compilervars.sh intel64
env >>${INFO_FILE}
nvidia-smi -L >>${INFO_FILE}
nvidia-smi >>${INFO_FILE}
cat /usr/local/cuda/version.txt >>${INFO_FILE}
```
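Since the script dumps the environment once before and once after sourcing `compilervars.sh`, comparing the two dumps shows what that script changes. A minimal sketch, assuming each dump is plain `env` output with one `NAME=value` pair per line (multi-line values are not handled); the helper names `parse_env` and `env_changes` are hypothetical.

```python
import re

_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*$")


def parse_env(text):
    """Parse `env` output into a dict, skipping lines that are not NAME=value
    (for example the commit hash or nvidia-smi table rows)."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and _NAME.match(name):
            env[name] = value
    return env


def env_changes(before, after):
    """Map each variable that differs to its (old, new) value.
    None stands for "not set"."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        if before.get(name) != after.get(name):
            changes[name] = (before.get(name), after.get(name))
    return changes
```

The two halves of info.txt would be passed to `parse_env` separately, and `env_changes` then lists, for instance, any library paths that were added.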
[GitHub] [incubator-mxnet] github-actions[bot] commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670813705
Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue.
Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly.
If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on [contributing to MXNet](https://mxnet.apache.org/community/contribute) and our [development guides wiki](https://cwiki.apache.org/confluence/display/MXNET/Developments).
[GitHub] [incubator-mxnet] gzuchow commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
gzuchow commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-687088080
Hi @beew,
Sorry to hear that you still have a problem with MXNet.
I have tried to reproduce the second issue you encountered, using Ubuntu 18.04 and 20.04 and the flags you used:
> make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1 USE_JEMALLOC=1
The build ran smoothly, and it also completed with the cmake method.
Could you tell us which versions of MXNet and the other libraries you are using?
pax-utils will be needed for this step:
`apt install pax-utils`
Then run the following and post the resulting info.txt file.
```
INFO_FILE=info.txt
git rev-parse --verify HEAD >${INFO_FILE}
source /opt/intel/bin/compilervars.sh intel64
env >>${INFO_FILE}
scanelf -l -s cblas_ddot |grep dot >>${INFO_FILE}
```
You can set up the MXNet build environment by following this [tutorial](https://mxnet.apache.org/get_started/build_from_source).
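As a complementary check to `scanelf`, here is a hedged sketch that asks the dynamic loader whether a given shared library exports a symbol such as `cblas_ddot`. It assumes a POSIX system; note that loading a library runs its initializers, so this is only suitable for libraries you trust. The helper name `exports_symbol` is hypothetical.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if `library` can be loaded and exports `symbol`.

    `library` is a path or soname; None means the current process itself.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # The library could not be found or loaded.
        return False
    # ctypes resolves the symbol on attribute access and raises
    # AttributeError when it is missing, which hasattr turns into False.
    return hasattr(handle, symbol)


# Hypothetical usage, with the library path taken from the linker error:
#   print(exports_symbol("/lib/x86_64-linux-gnu/libblas.so.3", "cblas_ddot"))
```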
[GitHub] [incubator-mxnet] anko-intel commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
anko-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675441896
Hi @beew,
Thanks for your answer.
You probably already know this, but you can have several MKL versions installed and switch between them.
That way you can switch to MKL 2019 just before compiling MXNet:
```
ls -la /opt/intel/compilers_and_libraries
cd /opt/intel
sudo rm compilers_and_libraries
sudo ln -s compilers_and_libraries_2019.1.144 compilers_and_libraries
```
After compiling MXNet, you can restore the compilers_and_libraries link to the default MKL version used by your other software.
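The switch-and-restore step can be sketched in Python as well. This is an illustration under the assumption that `compilers_and_libraries` is a symlink, as in the commands above; the helper name `repoint_symlink` is hypothetical, and on a real system it would need the same privileges as the `sudo` commands.

```python
import os


def repoint_symlink(link_path, new_target):
    """Point the symlink at `new_target` and return its previous target,
    so that the caller can restore it later."""
    previous = os.readlink(link_path)  # raises OSError if not a symlink
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(new_target, tmp_path)
    # Renaming over the old link swaps it in a single step.
    os.replace(tmp_path, link_path)
    return previous


# Hypothetical usage around a build:
#   old = repoint_symlink("/opt/intel/compilers_and_libraries",
#                         "compilers_and_libraries_2019.1.144")
#   ... build MXNet ...
#   repoint_symlink("/opt/intel/compilers_and_libraries", old)
```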
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via Intel's deb repository) and Ubuntu 20.04, but no CUDA. I built MXNet with the options
```
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1 USE_JEMALLOC=1
```
It failed with
```
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
```
But building with USE_BLAS=openblas (keeping all other options unchanged) was successful.
[GitHub] [incubator-mxnet] gzuchow commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
gzuchow commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-690100751
Using this tutorial: https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html by default mkl packages are installed in /opt/intel/ location.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675014980
@anko-intel
Hi,
1.I am sticking to MKL 2018.2.199 because of this issue https://forums.developer.nvidia.com/t/nvblas-numpy-intelmkl-2018-3-not-work/66090 Not sure if there is any solution at the moment.
2. Nvidia-smi doesn't tell you which cuda is actually used, rather what is the highest cuda version the driver supports (in this case it is 10.1) Since the change from Cuda 10 to Cuda 10.1 breaks a lot of things because it changes the names of the lib files and the folder structures. https://github.com/tensorflow/tensorflow/issues/26289 I would have to compile a lot of stuffs from scratch and test them, it will takes me several days if not weeks considering that it takes 3 hours just to compile tensorflow and probably run into other compatibility issues along the way, I don't think it is feasible for me right now, I would wait til next year to upgrade the OS to 20.04. At the moment using mxnet with openblas works for me.
Thanks for the help.
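The distinction in point 2 can be checked mechanically. Below is a minimal sketch, assuming the two version strings are pasted in by hand from `nvidia-smi` and `/usr/local/cuda/version.txt`; the helper names are made up for illustration and are not part of any NVIDIA tool.

```python
import re

def parse_cuda_version(text):
    """Return (major, minor) from the first 'CUDA Version' string found in text."""
    match = re.search(r"CUDA Version:?\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no CUDA version found")
    return int(match.group(1)), int(match.group(2))

def driver_covers_toolkit(smi_header, version_txt):
    """True if the driver's maximum supported CUDA version is >= the toolkit's."""
    return parse_cuda_version(smi_header) >= parse_cuda_version(version_txt)

# Strings copied from the outputs posted in this thread.
smi_header = "| NVIDIA-SMI 418.56 Driver Version: 418.56 CUDA Version: 10.1 |"
version_txt = "CUDA Version 10.0.130"
print(driver_covers_toolkit(smi_header, version_txt))  # prints True
```

On version numbers alone, a driver that supports up to CUDA 10.1 covers a 10.0 toolkit, so this check by itself does not point to the mismatch as the cause of the hang.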
[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-674360922
@anko-intel
Here it is
```
3143aabb60038b555db2960d712fb2806b16d581
XDG_VTNR=7
XDG_SESSION_ID=c2
CLUTTER_IM_MODULE=xim
XDG_GREETER_DATA_DIR=/var/lib/lightdm-data/bernard
GPG_AGENT_INFO=/home/bernard/.gnupg/S.gpg-agent:0:1
SHELL=/bin/bash
VTE_VERSION=4205
TERM=xterm-256color
MKL_THREADING_LAYER=TBB
QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1
LD_PRELOAD=/usr/lib/libblas.so /usr/lib/liblapack.so
WINDOWID=123731978
GNOME_KEYRING_CONTROL=
UPSTART_SESSION=unix:abstract=/com/ubuntu/upstart-session/1000/1492
NVBLAS_CONFIG_FILE=/home/bernard/.config/nvblas.conf
GTK_MODULES=gail:atk-bridge:unity-gtk-module
PYTHONUSERBASE=/home/bernard/opt/python37
USER=bernard
LD_LIBRARY_PATH=/home/bernard/opt/python37/lib:/home/bernard/opt/libglvnd/lib:/usr/local/lib/petsc:/opt/openmpi-cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/tensorrt/lib:/home/bernard/opt/libQGLViewer/lib::/usr/local/lib:/home/bernard/opt/opencv/lib:/home/bernard/opt/latte/lib
QT_ACCESSIBILITY=1
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session0
XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0
CPATH=/usr/local/include/petsc::/home/bernard/opt/opencv/include
SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
DEFAULTS_PATH=/usr/share/gconf/ubuntu.default.path
XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/usr/share/upstart/xdg:/etc/xdg
PATH=/home/bernard/opt/python37/bin:/opt/openmpi-cuda/bin:/usr/local/cuda/bin:/home/bernard/bin:/home/bernard/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/bernard/opt/qt5/bin:/home/bernard/opt/opencv/bin:/home/bernard/opt/latte/bin
DESKTOP_SESSION=ubuntu
QT_QPA_PLATFORMTHEME=appmenu-qt5
QT_IM_MODULE=ibus
JOB=gnome-session
PWD=/home/bernard/opt/incubator-mxnet
XDG_SESSION_TYPE=x11
XMODIFIERS=@im=ibus
LANG=en_CA.UTF-8
GNOME_KEYRING_PID=
MANDATORY_PATH=/usr/share/gconf/ubuntu.mandatory.path
GDM_LANG=en_CA
IM_CONFIG_PHASE=1
COMPIZ_CONFIG_PROFILE=ubuntu
GDMSESSION=ubuntu
GTK2_MODULES=overlay-scrollbar
SESSIONTYPE=gnome-session
XDG_SEAT=seat0
HOME=/home/bernard/opt/python37
SHLVL=2
LANGUAGE=en_CA:en
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
UPSTART_INSTANCE=
SLEPC_DIR=/home/bernard/opt/slepc
LOGNAME=bernard
XDG_SESSION_DESKTOP=ubuntu
UPSTART_EVENTS=started starting
PYTHONPATH=/home/bernard/opt/python37/lib/python3.7/site-packages:
PREFIX=/home/bernard/opt/python37
QT4_IM_MODULE=xim
XDG_DATA_DIRS=/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share/:/usr/share/ubuntu:/usr/share/gnome:/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-NJkEMOa0FZ
PKG_CONFIG_PATH=:/home/bernard/opt/opencv/lib/pkgconfig
LESSOPEN=| /usr/bin/lesspipe %s
UPSTART_JOB=unity-settings-daemon
INSTANCE=Unity
DISPLAY=:0
XDG_RUNTIME_DIR=/run/user/1000
GTK_IM_MODULE=ibus
XDG_CURRENT_DESKTOP=Unity
PETSC_DIR=/home/bernard/opt/petsc
LESSCLOSE=/usr/bin/lesspipe %s %s
XAUTHORITY=/home/bernard/.Xauthority
_=/usr/bin/env
--------------------------------
MKLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl
XDG_VTNR=7
MANPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/man:/home/bernard/opt/python37/man:/opt/openmpi-cuda/share/man:/home/bernard/.local/share/man:/usr/local/man:/usr/local/share/man:/usr/share/man
XDG_SESSION_ID=c2
CLUTTER_IM_MODULE=xim
XDG_GREETER_DATA_DIR=/var/lib/lightdm-data/bernard
INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2018.2.199/linux/licenses:/opt/intel/licenses:/home/bernard/opt/python37/intel/licenses
IPPROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp
GPG_AGENT_INFO=/home/bernard/.gnupg/S.gpg-agent:0:1
SHELL=/bin/bash
VTE_VERSION=4205
TERM=xterm-256color
MKL_THREADING_LAYER=TBB
LIBRARY_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/../tbb/lib/intel64_lin/gcc4.4
QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1
LD_PRELOAD=/usr/lib/libblas.so /usr/lib/liblapack.so
WINDOWID=123731978
GNOME_KEYRING_CONTROL=
UPSTART_SESSION=unix:abstract=/com/ubuntu/upstart-session/1000/1492
NVBLAS_CONFIG_FILE=/home/bernard/.config/nvblas.conf
GTK_MODULES=gail:atk-bridge:unity-gtk-module
PYTHONUSERBASE=/home/bernard/opt/python37
USER=bernard
LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/home/bernard/opt/python37/lib:/home/bernard/opt/libglvnd/lib:/usr/local/lib/petsc:/opt/openmpi-cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/tensorrt/lib:/home/bernard/opt/libQGLViewer/lib::/usr/local/lib:/home/bernard/opt/opencv/lib:/home/bernard/opt/latte/lib
QT_ACCESSIBILITY=1
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
PSTLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/pstl
XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session0
XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0
CPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/pstl/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/include:/usr/local/include/petsc::/home/bernard/opt/opencv/include
SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
DEFAULTS_PATH=/usr/share/gconf/ubuntu.default.path
XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/usr/share/upstart/xdg:/etc/xdg
NLSPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/locale/%l_%t/%N
PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/bin/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/bin:/home/bernard/opt/python37/bin:/opt/openmpi-cuda/bin:/usr/local/cuda/bin:/home/bernard/bin:/home/bernard/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/bernard/opt/qt5/bin:/home/bernard/opt/opencv/bin:/home/bernard/opt/latte/bin
DESKTOP_SESSION=ubuntu
QT_QPA_PLATFORMTHEME=appmenu-qt5
QT_IM_MODULE=ibus
TBBROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb
JOB=gnome-session
PWD=/home/bernard/opt/incubator-mxnet
XDG_SESSION_TYPE=x11
XMODIFIERS=@im=ibus
LANG=en_CA.UTF-8
GNOME_KEYRING_PID=
MANDATORY_PATH=/usr/share/gconf/ubuntu.mandatory.path
GDM_LANG=en_CA
IM_CONFIG_PHASE=1
COMPIZ_CONFIG_PROFILE=ubuntu
DAALROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/daal
GDMSESSION=ubuntu
GTK2_MODULES=overlay-scrollbar
SESSIONTYPE=gnome-session
XDG_SEAT=seat0
HOME=/home/bernard/opt/python37
SHLVL=2
LANGUAGE=en_CA:en
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
UPSTART_INSTANCE=
SLEPC_DIR=/home/bernard/opt/slepc
LOGNAME=bernard
XDG_SESSION_DESKTOP=ubuntu
UPSTART_EVENTS=started starting
PYTHONPATH=/home/bernard/opt/python37/lib/python3.7/site-packages:
PREFIX=/home/bernard/opt/python37
CLASSPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/daal.jar
QT4_IM_MODULE=xim
XDG_DATA_DIRS=/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share/:/usr/share/ubuntu:/usr/share/gnome:/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-NJkEMOa0FZ
PKG_CONFIG_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/bin/pkgconfig::/home/bernard/opt/opencv/lib/pkgconfig
LESSOPEN=| /usr/bin/lesspipe %s
UPSTART_JOB=unity-settings-daemon
INSTANCE=Unity
DISPLAY=:0
XDG_RUNTIME_DIR=/run/user/1000
GTK_IM_MODULE=ibus
XDG_CURRENT_DESKTOP=Unity
PETSC_DIR=/home/bernard/opt/petsc
LESSCLOSE=/usr/bin/lesspipe %s %s
I_MPI_ROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi
XAUTHORITY=/home/bernard/.Xauthority
_=/usr/bin/env
GPU 0: GeForce GTX 1070 (UUID: GPU-98bf2a5b-636b-5cbe-cc66-cdf024b8a920)
Sat Aug 15 03:11:15 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56 Driver Version: 418.56 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1070 Off | 00000000:01:00.0 On | N/A |
| N/A 42C P8 6W / N/A | 408MiB / 8111MiB | 3% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1142 G /usr/lib/xorg/Xorg 211MiB |
| 0 2056 G compiz 192MiB |
| 0 3329 G /usr/lib/firefox/firefox 2MiB |
+-----------------------------------------------------------------------------+
CUDA Version 10.0.130
CUDA Patch Version 10.0.130.1
```
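The dump above contains two `env` listings, one before and one after sourcing `compilervars.sh`, separated by a dashed line. Comparing them by eye is tedious, so here is a minimal sketch of a diff helper. The function names and the short sample strings are made up for illustration; real input would be the two halves of `info.txt`.

```python
def parse_env(text):
    """Parse `env` output into a dict; lines that are not NAME=value are skipped."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        # isidentifier() filters out wrapped continuation lines and table borders.
        if sep and key.isidentifier():
            env[key] = value
    return env

def diff_env(before, after):
    """Return (added, changed) variable names between two `env` dumps."""
    old, new = parse_env(before), parse_env(after)
    added = sorted(k for k in new if k not in old)
    changed = sorted(k for k in new if k in old and new[k] != old[k])
    return added, changed

# Shortened, hypothetical sample values standing in for the real dumps.
before = "SHELL=/bin/bash\nPATH=/usr/bin\n"
after = "SHELL=/bin/bash\nPATH=/opt/intel/bin:/usr/bin\nMKLROOT=/opt/intel/mkl\n"
added, changed = diff_env(before, after)
print("added:", added)      # added: ['MKLROOT']
print("changed:", changed)  # changed: ['PATH']
```

Variables that appear only in the second listing, or whose value grew, are the ones the Intel script touched and are the natural place to start looking.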
[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-690540912
> Using this tutorial: https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html by default mkl packages are installed in /opt/intel/ location.
No, they are not. It doesn't matter what the documentation says; it is probably outdated. I know where they are installed on my machine. I don't know about intel-python; I haven't installed that.
[GitHub] [incubator-mxnet] szha commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
szha commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670988062
cc @PatricZhao whose team likely has the knowledge to find out why MKL build doesn't work in GPU builds.
[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670987830
@szha
Hi,
1. Without source /opt/intel/bin/compilervars.sh intel64, building with openblas instead of mkl (since the build can't find the mkl root and headers):
the python gpu examples work (from here https://mxnet.apache.org/versions/1.6/get_started/validate_mxnet.html)
2. With /opt/intel/bin/compilervars.sh intel64 and building with mkl, the gpu examples hang as reported here, regardless of whether JEMALLOC is turned on or off. If it is turned off, I get an additional warning: "Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages"
So it seems that mkl somehow doesn't play well with cuda...
I haven't tried the nightly branch, as it seems that the prebuilt mxnet python wheels only support cuda > 10.0 after version 1.5.1. Instead I tried pip install mxnet-cu100mkl==1.5.1 and was able to create the mxnet array in the gpu context with no problem.
[GitHub] [incubator-mxnet] anko-intel commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
anko-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675004193
Hi @beew,
Thanks for your answer. I checked out mxnet sha 3143aabb6 and installed MKL version 2019.1.144, but unfortunately I still cannot reproduce the issue.
I cannot properly install MKL 2018.2.199, which you use, as it does not support Ubuntu 18.04. When I force the installation, python3 hangs during “import mxnet as mx”, but that could be a different issue than yours.
I also noticed that the CUDA version reported for your driver, 10.1 (nvidia-smi), differs from that of the installed toolkit, 10.0.130 (cat /usr/local/cuda/version.txt).
So could I ask you to try:
1. Upgrade MKL to the 2019 version or newer (you can find it here: https://registrationcenter.intel.com/en/products/download/3685/ )
2. Upgrade your CUDA to version 10.1 to make it consistent with the driver
(you can have several CUDA versions on the system; check what the link /usr/local/cuda points to)
Please let me know if it helps.
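To see which toolkit directory `/usr/local/cuda` actually points to, something like the following can be used. This is a small sketch; `describe_cuda_link` is a hypothetical helper name, and the default path is simply the one mentioned above.

```python
import os

def describe_cuda_link(path="/usr/local/cuda"):
    """Report whether path exists and which directory it resolves to."""
    if not os.path.lexists(path):
        return f"{path}: not found"
    # realpath follows every symlink in the chain to the final target.
    target = os.path.realpath(path)
    kind = "symlink" if os.path.islink(path) else "directory"
    return f"{path}: {kind} -> {target}"

print(describe_cuda_link())
```

If several toolkits are installed side by side, the resolved target shows which one a build with `USE_CUDA_PATH=/usr/local/cuda` would pick up.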
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via the deb repository) and Ubuntu 20.04, but no cuda. I built mxnet with the options
```
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1 USE_JEMALLOC=1
```
It failed with
```
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
```
But building with USE_BLAS=openblas was successful.
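The linker message names `libblas.so.3` as a library that defines `cblas_ddot` but was not listed on the link line. One way to see which installed BLAS libraries export that symbol is to load them and look the symbol up. Here is a minimal sketch using `ctypes`; the helper name is hypothetical and the list of library names is only a guess at what might be installed.

```python
import ctypes
import ctypes.util

def exports_symbol(library, symbol):
    """True if the shared library can be loaded and exports `symbol`."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # Library missing or not loadable on this system.
        return False
    # ctypes raises AttributeError on lookup when the symbol is absent.
    return hasattr(handle, symbol)

# Candidate names; libblas.so.3 comes from the error message above,
# the other two are assumptions about common MKL/OpenBLAS file names.
for name in ["libblas.so.3", "libmkl_rt.so", "libopenblas.so.0"]:
    print(name, exports_symbol(name, "cblas_ddot"))
```

A library that reports `False` here is either not installed under that name or does not provide the CBLAS interface, which narrows down what the build should be linking against.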
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via the deb repository) and Ubuntu 20.04, but no cuda. I built mxnet with the options
```
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=1 USE_JEMALLOC=1
```
It failed with
```
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
```
But building with USE_BLAS=openblas was successful.
[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-687674781
>
> INFO_FILE=info.txt
> git rev-parse --verify HEAD >${INFO_FILE}
> source /opt/intel/bin/compilervars.sh intel64
> env >>${INFO_FILE}
> scanelf -l -s cblas_ddot |grep dot >>${INFO_FILE}
Actually, there is no /opt/intel/bin/compilervars.sh on this machine.
Intel MKL on this machine was installed from the deb repository:
https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html
So in this case the libraries should already be in standard locations (e.g. /usr/lib/x86_64-linux-gnu).
I appreciate the help, but it probably isn't worth the trouble. I cannot test very much on this laptop, since it took me ~7-8 hours to build mxnet from source here, and my main machine is currently not accessible. Thanks again; I will try when I get my main machine back in a few weeks.
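Where the MKL libraries ended up can be settled by searching for them. The sketch below looks for `libmkl_rt.so` under the two locations mentioned in this thread; `find_libs` is a hypothetical helper, and both the file name pattern and the directories may need adjusting for other installs.

```python
import glob
import os

def find_libs(pattern, directories):
    """Return sorted paths under the directories whose file name matches pattern."""
    hits = []
    for directory in directories:
        # "**" with recursive=True also matches files directly in `directory`.
        hits.extend(glob.glob(os.path.join(directory, "**", pattern), recursive=True))
    return sorted(hits)

# Both locations are mentioned in this thread; adjust for your system.
candidates = ["/opt/intel", "/usr/lib/x86_64-linux-gnu"]
for path in find_libs("libmkl_rt.so*", candidates):
    print(path)
```

No output means the library is in neither place under that name, in which case the search pattern or directory list needs widening.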
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via the deb repository) and Ubuntu 20.04, but no cuda. I built mxnet with the options
```
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=1 USE_JEMALLOC=1
```
It failed with
```
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
```
But building with openblas was successful.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via the deb repository) and Ubuntu 20.04, but no cuda. I built mxnet with the options
```
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_MKLDNN=1 USE_JEMALLOC=1
```
It failed with
```
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
```
But building with USE_BLAS=openblas was successful.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662
@anko-intel
I have no access to my machine for a month, so I tried with another machine. This one has MKL 2020.0.166.1 (the latest, installed via the deb repository) and Ubuntu 20.04, but no cuda. I built mxnet with the options
```
make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=1 USE_JEMALLOC=1
```
It failed with
```
/usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
/usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:647: bin/im2rec] Error 1
```
But building with USE_BLAS=openblas was successful.
[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670987830
@szha
Hi,
1. Without source /opt/intel/bin/compilervars.sh intel64, building with openblas instead of mkl (since the build can't find the mkl root and headers):
the python gpu examples work (from here https://mxnet.apache.org/versions/1.6/get_started/validate_mxnet.html)
2. With source /opt/intel/bin/compilervars.sh intel64 and building with mkl, the gpu examples hang as reported here, regardless of whether JEMALLOC is turned on or off. If it is turned off, I get an additional warning: "Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages"
So it seems that mkl somehow doesn't play well with cuda...
I haven't tried the nightly branch, as it seems that the prebuilt mxnet python wheels only support cuda > 10.0 after version 1.5.1. Instead I tried pip install mxnet-cu100mkl==1.5.1 and was able to create the mxnet array in the gpu context with no problem.
P.S. Except for the prebuilt mxnet wheels, all other tests were done with mxnet 1.7 built from source ("git checkout -b v1.7 origin/v1.7.x").
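Since the failure mode is a hang rather than an error, comparing builds is easier when the GPU check runs in a child process with a timeout. This is a minimal sketch; `run_with_timeout` is a hypothetical helper, and the 60-second limit in the example call is an arbitrary assumption.

```python
import subprocess
import sys

def run_with_timeout(code, seconds):
    """Run `code` in a fresh Python process; return 'ok', 'failed', or 'timeout'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"

# The reproduction from the issue description, run in a child process.
GPU_CHECK = "import mxnet as mx; a = mx.nd.ones((2, 3), mx.gpu()); print(a.asnumpy())"
# Example call (requires a GPU build of MXNet): run_with_timeout(GPU_CHECK, 60)
```

Running the same check against the openblas build, the mkl build, and the prebuilt wheel gives a three-way result without leaving a stuck interpreter behind.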
[GitHub] [incubator-mxnet] bgawrych closed issue #18879: mxnet python gpu not work
Posted by GitBox <gi...@apache.org>.
bgawrych closed issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879