
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/08/08 02:56:02 UTC

[GitHub] [incubator-mxnet] beew opened a new issue #18879: mxnet python gpu not work

beew opened a new issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879


   ## Description
   When I try to create an NDArray in the GPU context, Python appears to hang.
   
   ## To Reproduce
   Just create an array:
   
   ```
   import mxnet as mx
   
   a = mx.nd.ones((2, 3), mx.gpu())
   
   ```
   
   Then everything stops as if hung: Python keeps running in the background but produces no output. Without the GPU context, the call returns immediately.
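
A minimal sketch (not part of the original report) for telling a true hang from a merely slow call: run the snippet in a child interpreter with a time limit. The helper name `run_with_timeout` and the 120-second limit are illustrative assumptions.

```python
import subprocess
import sys


def run_with_timeout(snippet, timeout_s):
    """Run a Python snippet in a child interpreter.

    Returns "ok" if it exits with status 0, "error" for any other exit
    status, and "timeout" if it is still running after timeout_s seconds.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,  # keep the child's output off our terminal
            text=True,
            timeout=timeout_s,    # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


# The reproduction from the report, forced to materialize the array.
GPU_SNIPPET = (
    "import mxnet as mx; "
    "a = mx.nd.ones((2, 3), mx.gpu()); "
    "print(a.asnumpy())"
)

# Example use (the limit is an arbitrary choice):
#   print(run_with_timeout(GPU_SNIPPET, timeout_s=120))
```

Running the CPU variant of the same snippet through the helper gives a baseline to compare against.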
   
   ## Environment
   
   We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:
   ```
   curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python
   
   # paste outputs here
   ```
   ```
   ----------Python Info----------
   Version      : 3.7.8
   Compiler     : GCC 5.4.0 20160609
   Build        : ('default', 'Jul 28 2020 15:58:46')
   Arch         : ('64bit', 'ELF')
   ------------Pip Info-----------
   Version      : 20.2.1
   Directory    : /home/bernard/opt/python37/lib/python3.7/site-packages/pip
   ----------MXNet Info-----------
   Version      : 1.7.0
   Directory    : /home/bernard/opt/python37/lib/python3.7/site-packages/mxnet
   ```
   
   This is all I have, since the script apparently takes forever to run; it may be hanging too.
   
   System: Ubuntu Linux
   
   GPU: Nvidia GTX 1070
   
   CUDA 10.0, libcudnn 7.6
   cmake 3.17
   
   MXNet was built from source as follows:
   
   ```
   source /opt/intel/bin/compilervars.sh intel64
   
   git clone --recursive https://github.com/apache/incubator-mxnet.git
   
   cd incubator-mxnet && git checkout -b v1.7 origin/v1.7.x
   
   git submodule sync
   
   git submodule update --init --recursive
   
   make -j8 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=0 CUDA_ARCH=-gencode=arch=compute_61,code=sm_61 USE_JEMALLOC=1
   
   ```
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] sfraczek commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

sfraczek commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-874192033


   Hi. 
   I hope it worked out for you on your main machine. Is this issue still relevant or can it be closed?



[GitHub] [incubator-mxnet] anko-intel commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

anko-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-686552643


   Hi @beew, @gzuchow will try to help you with the second issue.
   



[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675014980


   @anko-intel 
   
   Hi,
   
   1. I am sticking with MKL 2018.2.199 because of the issue described at https://forums.developer.nvidia.com/t/nvblas-numpy-intelmkl-2018-3-not-work/66090 and I'm not sure whether there is a solution at the moment.
   
   2. nvidia-smi doesn't tell you which CUDA version is actually in use, only the highest CUDA version the driver supports (in this case 10.1). The change from CUDA 10.0 to CUDA 10.1 breaks a lot of things because it changes the names of the library files and the folder structure; see https://github.com/tensorflow/tensorflow/issues/26289 for details. I would have to compile a lot of things from scratch and test them, which would take me several days if not weeks, considering that it takes 3 hours just to compile TensorFlow and that I would probably run into other compatibility issues along the way. I don't think that is feasible for me right now, so I will wait until next year to upgrade the OS to 20.04. At the moment, using MXNet with OpenBLAS works for me.
   
   Thanks for the help.
   
   
   



[GitHub] [incubator-mxnet] szha commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

szha commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670988014


   @beew thanks for testing these settings, it's really helpful. The mxnet-cu100mkl package doesn't use MKL as its BLAS library; it only includes MKL-DNN. The BLAS library used for that build is OpenBLAS.



[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675564235


   @anko-intel 
   
   Yes, I can also set up update-alternatives, just as I do to switch between MKL and OpenBLAS. I will give it another shot in a few days and report back. Thanks.



[GitHub] [incubator-mxnet] szha commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

szha commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670830449


   @beew hi Bernard. Sorry to hear that you are facing this issue. Looks like you have the right CUDA_ARCH for your GPU card. Could you try a couple of things to rule out factors such as GPU, compiler, and dependency library? I'd like to know:
   - if the pre-compiled binary works on your platform. You can install a nightly build on the v1.x branch from https://dist.mxnet.io/python. You can use this command: `pip3 install --pre 'mxnet-cu100<2' -f https://dist.mxnet.io/python`
   - if it works on your build from source with JEMALLOC turned off.
   - if it works on your build from source without `source /opt/intel/bin/compilervars.sh intel64`
   
   For the latter two, try varying **only** that option to rule out these specific causes.
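
The "vary only one option" approach can be sketched as a small generator of build commands. This is an illustration, not a script from the MXNet project: the baseline mirrors the `make` flags from the report, while `one_factor_variants`, `make_command`, and the choice of alternatives are hypothetical.

```python
def one_factor_variants(baseline, alternatives):
    """Return (label, options) pairs; each differs from baseline in one option."""
    variants = [("baseline", dict(baseline))]
    for name, value in alternatives.items():
        changed = dict(baseline)   # copy, so the baseline stays untouched
        changed[name] = value      # change exactly one option
        variants.append((f"{name}={value}", changed))
    return variants


def make_command(options, jobs=8):
    """Format the options as a make invocation."""
    flags = " ".join(f"{key}={value}" for key, value in options.items())
    return f"make -j{jobs} {flags}"


# Flags taken from the build command in the report.
BASELINE = {
    "USE_OPENCV": 1,
    "USE_BLAS": "mkl",
    "USE_CUDA": 1,
    "USE_CUDA_PATH": "/usr/local/cuda",
    "USE_CUDNN": 0,
    "CUDA_ARCH": "-gencode=arch=compute_61,code=sm_61",
    "USE_JEMALLOC": 1,
}

# Illustrative single-option changes to test one at a time.
ALTERNATIVES = {"USE_JEMALLOC": 0, "USE_BLAS": "openblas"}

for label, options in one_factor_variants(BASELINE, ALTERNATIVES):
    print(f"# {label}")
    print(make_command(options))
```

Skipping `source /opt/intel/bin/compilervars.sh intel64` is an environment change rather than a make flag, so it would need its own clean shell for each run.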



[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

pengzhao-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-671165617


   > cc @PatricZhao whose team likely has the knowledge to find out why MKL build doesn't work in GPU builds.
   
   It sounds weird. We will investigate the issue.



[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662


   @anko-intel 
   
   I have no access to my machine for a month, so I tried another machine. This one has MKL 2020.0.166.1 (the latest, installed via the deb repository) but no CUDA. I built MXNet with the options
   ```
   make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=1 USE_JEMALLOC=1
   ```
   It failed with
   
   ```
   /usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
   /usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
   collect2: error: ld returned 1 exit status
   make: *** [Makefile:647: bin/im2rec] Error 1
   ```
   But building with openblas was successful.



[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670987830


   @szha 
   Hi, 
   
   1. Without `source /opt/intel/bin/compilervars.sh intel64`, I built with OpenBLAS instead of MKL (since the build can't find the MKL root and headers).
   The Python GPU examples work (from here: https://mxnet.apache.org/versions/1.6/get_started/validate_mxnet.html).
   
   2. With `source /opt/intel/bin/compilervars.sh intel64` and a build with MKL, the GPU examples hang as reported here, regardless of whether JEMALLOC is turned on or off. If it is turned off, I get an additional warning: "Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages"
   
   So it seems that MKL somehow doesn't play well with CUDA...
   
   I haven't tried the nightly build, as it seems that the prebuilt MXNet Python wheels only support CUDA > 10.0 after version 1.5.1. Instead I tried `pip install mxnet-cu100mkl==1.5.1` and was able to create the MXNet array in the GPU context with no problem.
   
   



[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675014980

@anko-intel

Hi,

1. I am sticking with MKL 2018.2.199 because of the issue described at https://forums.developer.nvidia.com/t/nvblas-numpy-intelmkl-2018-3-not-work/66090 and I'm not sure whether there is a solution at the moment.

2. nvidia-smi doesn't tell you which CUDA version is actually in use, only the highest CUDA version the driver supports (in this case 10.1). The change from CUDA 10.0 to CUDA 10.1 breaks a lot of things because it changes the names of the library files and the folder structure; see https://github.com/tensorflow/tensorflow/issues/26289 for details. I would have to compile a lot of things from scratch and test them, which would take me several days if not weeks, considering that it takes 3 hours just to compile TensorFlow and that I would probably run into other compatibility issues along the way. I don't think it makes sense at this point, since my system just works and is highly optimized and customized. I will wait until next year to upgrade the OS to 20.04. At the moment, using MXNet with OpenBLAS works for me.

Thanks for the help.
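
The distinction in point 2 can be checked mechanically. A sketch, assuming the CUDA 10.x toolkit writes a line such as `CUDA Version 10.0.130` to `/usr/local/cuda/version.txt` and that the `nvidia-smi` header contains `CUDA Version: 10.1`; the helper names are hypothetical and the sample strings are modelled on the versions mentioned in this thread.

```python
import re


def parse_cuda_version(text):
    """Extract (major, minor) from text containing 'CUDA Version[:] X.Y'."""
    match = re.search(r"CUDA Version:?\s*(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_within_driver_limit(toolkit_text, driver_text):
    """Compare the installed toolkit version with the driver's maximum.

    Returns None when either version cannot be parsed.
    """
    toolkit = parse_cuda_version(toolkit_text)
    driver_max = parse_cuda_version(driver_text)
    if toolkit is None or driver_max is None:
        return None
    return toolkit <= driver_max


# Assumed sample inputs, not captured output from the reporter's machine.
toolkit_line = "CUDA Version 10.0.130"   # style of /usr/local/cuda/version.txt
driver_line = "CUDA Version: 10.1"       # style of the nvidia-smi header

print(parse_cuda_version(toolkit_line))
print(parse_cuda_version(driver_line))
print(toolkit_within_driver_limit(toolkit_line, driver_line))
```

The two numbers answer different questions: the first is what the build compiles against, the second is the ceiling the installed driver can run.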


[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-690540912


   > Using this tutorial: https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html by default mkl packages are installed in /opt/intel/ location.
   
   No, they are not. It doesn't matter what the documentation says; it is probably outdated. I know where they are installed on my machine. ;) I don't know about intel-python, as I haven't installed it.



[GitHub] [incubator-mxnet] bgawrych commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

bgawrych commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-987627798


   Closing due to lack of response and no reproduction. If this issue is still relevant, feel free to reopen.



[GitHub] [incubator-mxnet] anko-intel commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

anko-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-673610353


   Hi @beew,
   
   I have tried to reproduce the issue locally on similar hardware (GeForce GTX 1060 6GB, Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz) and software (Ubuntu 18.04, kernel 4.15.0-88, Nvidia driver 410.48, CUDA 10.0.130), but unfortunately I couldn't: the test works for me.
   Could you paste the MXNet commit SHA and MKL version or, better, attach the info.txt file produced by the following script (run it from the MXNet home directory, without sourcing /opt/intel/bin/compilervars.sh beforehand)?
   
   ```
   INFO_FILE=info.txt
   git rev-parse --verify HEAD >${INFO_FILE}
   env >>${INFO_FILE}
   echo -------------------------------- >>${INFO_FILE}
   source /opt/intel/bin/compilervars.sh intel64
   env >>${INFO_FILE}
   nvidia-smi -L >>${INFO_FILE}
   nvidia-smi >>${INFO_FILE}
   cat /usr/local/cuda/version.txt >>${INFO_FILE}
   ```
   
   



[GitHub] [incubator-mxnet] github-actions[bot] commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670813705


   Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue.
   Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly.
   If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on [contributing to MXNet](https://mxnet.apache.org/community/contribute) and our [development guides wiki](https://cwiki.apache.org/confluence/display/MXNET/Developments).



[GitHub] [incubator-mxnet] gzuchow commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

gzuchow commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-687088080


   Hi @beew, 
   
   Sorry to hear that you still have a problem with MXNet.
   I have tried to reproduce the second issue you encountered, using Ubuntu 18.04 and 20.04 and the flags you used:
   
   > make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1 USE_JEMALLOC=1
   
   The build ran smoothly, and it also completed with the cmake method.
   
   I would like to ask which versions of MXNet and the other libraries you used.
   pax-utils will be needed for this step:
   `apt install pax-utils`
   Then run this and post the resulting info.txt file.
   ```
   INFO_FILE=info.txt
   git rev-parse --verify HEAD >${INFO_FILE}
   source /opt/intel/bin/compilervars.sh intel64
   env >>${INFO_FILE}
   scanelf -l -s cblas_ddot |grep dot >>${INFO_FILE}
   ```
   
   You can set up the MXNet environment by following this [tutorial](https://mxnet.apache.org/get_started/build_from_source).



[GitHub] [incubator-mxnet] anko-intel commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

anko-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675441896


   Hi @beew,
   Thanks for your answer.
   You probably already know this, but you can also keep several MKL versions installed and switch between them.
   That way you can switch to MKL 2019 just before compiling MXNet:
   ```
   ls -la /opt/intel/compilers_and_libraries
   cd /opt/intel
   sudo rm compilers_and_libraries
   sudo ln -s compilers_and_libraries_2019.1.144 compilers_and_libraries
   ```
   After compiling MXNet, you can restore the compilers_and_libraries link to the default MKL version used by your other software.
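
The switch-and-restore pattern can also be wrapped so the link is put back even if the build fails. A sketch only: `temporary_symlink_target` is a hypothetical helper, the demo uses throwaway directories rather than `/opt/intel`, and on a real install the link change would need root privileges.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_symlink_target(link_path, new_target):
    """Point link_path at new_target inside the block, then restore it."""
    original_target = os.readlink(link_path)
    os.remove(link_path)               # removes the link, not its target
    os.symlink(new_target, link_path)
    try:
        yield
    finally:
        os.remove(link_path)
        os.symlink(original_target, link_path)


def demo():
    """Exercise the helper in a temporary directory.

    Returns the link target names seen inside and after the block.
    """
    with tempfile.TemporaryDirectory() as root:
        default_dir = os.path.join(root, "mkl_default")
        other_dir = os.path.join(root, "mkl_other")
        os.mkdir(default_dir)
        os.mkdir(other_dir)
        link = os.path.join(root, "compilers_and_libraries")
        os.symlink(default_dir, link)
        with temporary_symlink_target(link, other_dir):
            # A build step would run here while the link points elsewhere.
            inside = os.path.basename(os.readlink(link))
        after = os.path.basename(os.readlink(link))
    return inside, after


print(demo())
```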
   



[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-683290662


   @anko-intel 
   
   I have no access to my machine for a month, so I tried another machine. This one has MKL 2020.0.166.1 (the latest, installed via Intel's deb repository) and Ubuntu 20.04, but no CUDA. I built MXNet with the options
   ```
   make -j4 USE_OPENCV=1 USE_BLAS=mkl USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1 USE_JEMALLOC=1
   ```
   It failed with
   
   ```
   /usr/bin/ld: build/src/operator/tensor/dot.o: undefined reference to symbol 'cblas_ddot'
   /usr/bin/ld: /lib/x86_64-linux-gnu/libblas.so.3: error adding symbols: DSO missing from command line
   collect2: error: ld returned 1 exit status
   make: *** [Makefile:647: bin/im2rec] Error 1
   ```
   But building with USE_BLAS=openblas (keeping all other options unchanged) was successful.
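
One way to check which shared library actually provides `cblas_ddot` is to load candidates and look the symbol up. A sketch, assuming Linux; `exports_symbol` is a hypothetical helper, `/lib/x86_64-linux-gnu/libblas.so.3` is the path from the error message above, and `libmkl_rt.so` is an assumption about which MKL library the loader can find on a given machine.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if the library loads and exposes the named symbol.

    Passing None as the library looks the symbol up in the current process.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # The library is missing or cannot be loaded.
        return False
    # ctypes raises AttributeError for unknown symbols, which hasattr
    # turns into False.
    return hasattr(handle, symbol)


# Example use (paths are assumptions, adjust to the machine at hand):
#   exports_symbol("/lib/x86_64-linux-gnu/libblas.so.3", "cblas_ddot")
#   exports_symbol("libmkl_rt.so", "cblas_ddot")
```

If only the system `libblas.so.3` reports the symbol, that would be consistent with the MKL libraries not being on the linker's search path, though this check alone does not prove it.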



[GitHub] [incubator-mxnet] gzuchow commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

gzuchow commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-690100751


   Using this tutorial: https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html by default mkl packages are installed in /opt/intel/ location.



[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675014980


   @anko-intel 
   
   Hi,
   
   1. I am sticking with MKL 2018.2.199 because of this issue: https://forums.developer.nvidia.com/t/nvblas-numpy-intelmkl-2018-3-not-work/66090 (I am not sure whether a fix exists at the moment).
   
   2. nvidia-smi doesn't report which CUDA version is actually in use; it reports the highest CUDA version the driver supports (10.1 in this case). Moving from CUDA 10.0 to CUDA 10.1 breaks a lot of things because it changes the library file names and the folder structure (see https://github.com/tensorflow/tensorflow/issues/26289 ). I would have to compile and test a lot of software from scratch, which would take several days if not weeks, given that compiling TensorFlow alone takes 3 hours and that I would probably run into other compatibility issues along the way. I don't think that is feasible for me right now, so I will wait until next year, when I upgrade the OS to 20.04. For now, MXNet with OpenBLAS works for me.
   
   Thanks for the help.
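
The distinction in point 2 can be checked mechanically. The following is a minimal sketch, not part of the original comment: it parses the "CUDA Version" field of the nvidia-smi header (the highest version the driver supports) and the first line of /usr/local/cuda/version.txt (the installed toolkit), using strings copied from the output posted in this thread. The function names are hypothetical.

```python
import re

def driver_cuda_version(nvidia_smi_output):
    """Return (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None

def toolkit_cuda_version(version_txt):
    """Return (major, minor) from a version.txt line such as 'CUDA Version 10.0.130'."""
    match = re.search(r"CUDA Version\s+(\d+)\.(\d+)", version_txt)
    return (int(match.group(1)), int(match.group(2))) if match else None

# Sample strings copied from the output posted in this thread.
smi_header = "| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |"
version_txt = "CUDA Version 10.0.130"

driver_max = driver_cuda_version(smi_header)
toolkit = toolkit_cuda_version(version_txt)
print("driver supports up to:", driver_max)
print("installed toolkit:    ", toolkit)
# Tuples compare element by element, so (10, 0) <= (10, 1) holds.
print("toolkit within driver limit:", toolkit <= driver_max)
```

A toolkit version at or below the driver's maximum is the normal situation, so by itself the 10.0 / 10.1 difference does not indicate a broken setup.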
   
   
   



[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-674360922


   @anko-intel 
   
   Here it is
   
   ```
   3143aabb60038b555db2960d712fb2806b16d581
   XDG_VTNR=7
   XDG_SESSION_ID=c2
   CLUTTER_IM_MODULE=xim
   XDG_GREETER_DATA_DIR=/var/lib/lightdm-data/bernard
   GPG_AGENT_INFO=/home/bernard/.gnupg/S.gpg-agent:0:1
   SHELL=/bin/bash
   VTE_VERSION=4205
   TERM=xterm-256color
   MKL_THREADING_LAYER=TBB
   QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1
   LD_PRELOAD=/usr/lib/libblas.so /usr/lib/liblapack.so
   WINDOWID=123731978
   GNOME_KEYRING_CONTROL=
   UPSTART_SESSION=unix:abstract=/com/ubuntu/upstart-session/1000/1492
   NVBLAS_CONFIG_FILE=/home/bernard/.config/nvblas.conf
   GTK_MODULES=gail:atk-bridge:unity-gtk-module
   PYTHONUSERBASE=/home/bernard/opt/python37
   USER=bernard
   LD_LIBRARY_PATH=/home/bernard/opt/python37/lib:/home/bernard/opt/libglvnd/lib:/usr/local/lib/petsc:/opt/openmpi-cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/tensorrt/lib:/home/bernard/opt/libQGLViewer/lib::/usr/local/lib:/home/bernard/opt/opencv/lib:/home/bernard/opt/latte/lib
   QT_ACCESSIBILITY=1
   LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
   XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session0
   XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0
   CPATH=/usr/local/include/petsc::/home/bernard/opt/opencv/include
   SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
   DEFAULTS_PATH=/usr/share/gconf/ubuntu.default.path
   XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/usr/share/upstart/xdg:/etc/xdg
   PATH=/home/bernard/opt/python37/bin:/opt/openmpi-cuda/bin:/usr/local/cuda/bin:/home/bernard/bin:/home/bernard/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/bernard/opt/qt5/bin:/home/bernard/opt/opencv/bin:/home/bernard/opt/latte/bin
   DESKTOP_SESSION=ubuntu
   QT_QPA_PLATFORMTHEME=appmenu-qt5
   QT_IM_MODULE=ibus
   JOB=gnome-session
   PWD=/home/bernard/opt/incubator-mxnet
   XDG_SESSION_TYPE=x11
   XMODIFIERS=@im=ibus
   LANG=en_CA.UTF-8
   GNOME_KEYRING_PID=
   MANDATORY_PATH=/usr/share/gconf/ubuntu.mandatory.path
   GDM_LANG=en_CA
   IM_CONFIG_PHASE=1
   COMPIZ_CONFIG_PROFILE=ubuntu
   GDMSESSION=ubuntu
   GTK2_MODULES=overlay-scrollbar
   SESSIONTYPE=gnome-session
   XDG_SEAT=seat0
   HOME=/home/bernard/opt/python37
   SHLVL=2
   LANGUAGE=en_CA:en
   GNOME_DESKTOP_SESSION_ID=this-is-deprecated
   UPSTART_INSTANCE=
   SLEPC_DIR=/home/bernard/opt/slepc
   LOGNAME=bernard
   XDG_SESSION_DESKTOP=ubuntu
   UPSTART_EVENTS=started starting
   PYTHONPATH=/home/bernard/opt/python37/lib/python3.7/site-packages:
   PREFIX=/home/bernard/opt/python37
   QT4_IM_MODULE=xim
   XDG_DATA_DIRS=/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share/:/usr/share/ubuntu:/usr/share/gnome:/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share
   DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-NJkEMOa0FZ
   PKG_CONFIG_PATH=:/home/bernard/opt/opencv/lib/pkgconfig
   LESSOPEN=| /usr/bin/lesspipe %s
   UPSTART_JOB=unity-settings-daemon
   INSTANCE=Unity
   DISPLAY=:0
   XDG_RUNTIME_DIR=/run/user/1000
   GTK_IM_MODULE=ibus
   XDG_CURRENT_DESKTOP=Unity
   PETSC_DIR=/home/bernard/opt/petsc
   LESSCLOSE=/usr/bin/lesspipe %s %s
   XAUTHORITY=/home/bernard/.Xauthority
   _=/usr/bin/env
   --------------------------------
   MKLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl
   XDG_VTNR=7
   MANPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/man:/home/bernard/opt/python37/man:/opt/openmpi-cuda/share/man:/home/bernard/.local/share/man:/usr/local/man:/usr/local/share/man:/usr/share/man
   XDG_SESSION_ID=c2
   CLUTTER_IM_MODULE=xim
   XDG_GREETER_DATA_DIR=/var/lib/lightdm-data/bernard
   INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2018.2.199/linux/licenses:/opt/intel/licenses:/home/bernard/opt/python37/intel/licenses
   IPPROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp
   GPG_AGENT_INFO=/home/bernard/.gnupg/S.gpg-agent:0:1
   SHELL=/bin/bash
   VTE_VERSION=4205
   TERM=xterm-256color
   MKL_THREADING_LAYER=TBB
   LIBRARY_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/../tbb/lib/intel64_lin/gcc4.4
   QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1
   LD_PRELOAD=/usr/lib/libblas.so /usr/lib/liblapack.so
   WINDOWID=123731978
   GNOME_KEYRING_CONTROL=
   UPSTART_SESSION=unix:abstract=/com/ubuntu/upstart-session/1000/1492
   NVBLAS_CONFIG_FILE=/home/bernard/.config/nvblas.conf
   GTK_MODULES=gail:atk-bridge:unity-gtk-module
   PYTHONUSERBASE=/home/bernard/opt/python37
   USER=bernard
   LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/home/bernard/opt/python37/lib:/home/bernard/opt/libglvnd/lib:/usr/local/lib/petsc:/opt/openmpi-cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/tensorrt/lib:/home/bernard/opt/libQGLViewer/lib::/usr/local/lib:/home/bernard/opt/opencv/lib:/home/bernard/opt/latte/lib
   QT_ACCESSIBILITY=1
   LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
   PSTLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/pstl
   XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session0
   XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0
   CPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/ipp/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/pstl/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb/include:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/include:/usr/local/include/petsc::/home/bernard/opt/opencv/include
   SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
   DEFAULTS_PATH=/usr/share/gconf/ubuntu.default.path
   XDG_CONFIG_DIRS=/etc/xdg/xdg-ubuntu:/usr/share/upstart/xdg:/etc/xdg
   NLSPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/locale/%l_%t/%N
   PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/bin/intel64:/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/bin:/home/bernard/opt/python37/bin:/opt/openmpi-cuda/bin:/usr/local/cuda/bin:/home/bernard/bin:/home/bernard/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/bernard/opt/qt5/bin:/home/bernard/opt/opencv/bin:/home/bernard/opt/latte/bin
   DESKTOP_SESSION=ubuntu
   QT_QPA_PLATFORMTHEME=appmenu-qt5
   QT_IM_MODULE=ibus
   TBBROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/tbb
   JOB=gnome-session
   PWD=/home/bernard/opt/incubator-mxnet
   XDG_SESSION_TYPE=x11
   XMODIFIERS=@im=ibus
   LANG=en_CA.UTF-8
   GNOME_KEYRING_PID=
   MANDATORY_PATH=/usr/share/gconf/ubuntu.mandatory.path
   GDM_LANG=en_CA
   IM_CONFIG_PHASE=1
   COMPIZ_CONFIG_PROFILE=ubuntu
   DAALROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/daal
   GDMSESSION=ubuntu
   GTK2_MODULES=overlay-scrollbar
   SESSIONTYPE=gnome-session
   XDG_SEAT=seat0
   HOME=/home/bernard/opt/python37
   SHLVL=2
   LANGUAGE=en_CA:en
   GNOME_DESKTOP_SESSION_ID=this-is-deprecated
   UPSTART_INSTANCE=
   SLEPC_DIR=/home/bernard/opt/slepc
   LOGNAME=bernard
   XDG_SESSION_DESKTOP=ubuntu
   UPSTART_EVENTS=started starting
   PYTHONPATH=/home/bernard/opt/python37/lib/python3.7/site-packages:
   PREFIX=/home/bernard/opt/python37
   CLASSPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2018.2.199/linux/daal/lib/daal.jar
   QT4_IM_MODULE=xim
   XDG_DATA_DIRS=/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share/:/usr/share/ubuntu:/usr/share/gnome:/home/bernard/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share
   DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-NJkEMOa0FZ
   PKG_CONFIG_PATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/bin/pkgconfig::/home/bernard/opt/opencv/lib/pkgconfig
   LESSOPEN=| /usr/bin/lesspipe %s
   UPSTART_JOB=unity-settings-daemon
   INSTANCE=Unity
   DISPLAY=:0
   XDG_RUNTIME_DIR=/run/user/1000
   GTK_IM_MODULE=ibus
   XDG_CURRENT_DESKTOP=Unity
   PETSC_DIR=/home/bernard/opt/petsc
   LESSCLOSE=/usr/bin/lesspipe %s %s
   I_MPI_ROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mpi
   XAUTHORITY=/home/bernard/.Xauthority
   _=/usr/bin/env
   GPU 0: GeForce GTX 1070 (UUID: GPU-98bf2a5b-636b-5cbe-cc66-cdf024b8a920)
   Sat Aug 15 03:11:15 2020       
   +-----------------------------------------------------------------------------+
   | NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
   |-------------------------------+----------------------+----------------------+
   | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
   | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
   |===============================+======================+======================|
   |   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
   | N/A   42C    P8     6W /  N/A |    408MiB /  8111MiB |      3%      Default |
   +-------------------------------+----------------------+----------------------+
                                                                                  
   +-----------------------------------------------------------------------------+
   | Processes:                                                       GPU Memory |
   |  GPU       PID   Type   Process name                             Usage      |
   |=============================================================================|
   |    0      1142      G   /usr/lib/xorg/Xorg                           211MiB |
   |    0      2056      G   compiz                                       192MiB |
   |    0      3329      G   /usr/lib/firefox/firefox                       2MiB |
   +-----------------------------------------------------------------------------+
   CUDA Version 10.0.130
   CUDA Patch Version 10.0.130.1
   
   ```
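
The dump above appears to contain two `env` listings separated by a dashed line, the second with the Intel variables (MKLROOT and so on) set. As a rough aid for reading it, here is a sketch that reports which variables were added or changed between two such listings. The helper names are hypothetical and the sample input is an abridged excerpt of the dump, not the full listing.

```python
def parse_env(text):
    """Parse `env` output (KEY=VALUE per line) into a dict; other lines are skipped."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        # Keep only lines whose left-hand side looks like a variable name.
        if sep and key.replace("_", "").isalnum():
            env[key] = value
    return env

def diff_env(before, after):
    """Return (added, changed) variable names between two parsed environments."""
    added = sorted(k for k in after if k not in before)
    changed = sorted(k for k in after if k in before and before[k] != after[k])
    return added, changed

# Abridged excerpts of the two listings in the dump above.
before = parse_env("""
MKL_THREADING_LAYER=TBB
LD_PRELOAD=/usr/lib/libblas.so /usr/lib/liblapack.so
CPATH=/usr/local/include/petsc::/home/bernard/opt/opencv/include
""")
after = parse_env("""
MKLROOT=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl
MKL_THREADING_LAYER=TBB
LD_PRELOAD=/usr/lib/libblas.so /usr/lib/liblapack.so
CPATH=/opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/include:/usr/local/include/petsc::/home/bernard/opt/opencv/include
""")

added, changed = diff_env(before, after)
print("added:", added)
print("changed:", changed)
```

Variables that are identical in both listings, such as LD_PRELOAD and MKL_THREADING_LAYER here, are not reported, which makes it easier to see that they were already set before the Intel script was sourced.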



[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-690540912


   > Using this tutorial: https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html by default mkl packages are installed in /opt/intel/ location.
   
   No, they are not. It doesn't matter what the documentation says; it is probably outdated. I know where they are installed on my machine. I don't know about intel-python, since I haven't installed it.



[GitHub] [incubator-mxnet] szha commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

szha commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670988062


   cc @PatricZhao, whose team likely has the knowledge to find out why the MKL build doesn't work in GPU builds.



[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670987830


   @szha 
   Hi, 
   
   1. Without source /opt/intel/bin/compilervars.sh intel64, I built with OpenBLAS instead of MKL (since the MKL root and headers can't be found), 
   and the Python GPU examples work (from here https://mxnet.apache.org/versions/1.6/get_started/validate_mxnet.html)
   
   2. With source /opt/intel/bin/compilervars.sh intel64 and a build with MKL, the GPU examples hang as reported here, regardless of whether JEMALLOC is turned on or off. If it is turned off, I get an additional warning: "Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages"
   
   So it seems that MKL somehow doesn't play well with CUDA...
   
   I haven't tried the nightly branch, as it seems that after version 1.5.1 the prebuilt MXNet Python wheels only support CUDA > 10.0. Instead I tried pip install mxnet-cu100mkl==1.5.1 and was able to create the MXNet array in the GPU context with no problem.
   
   



[GitHub] [incubator-mxnet] anko-intel commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

anko-intel commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-675004193


   Hi @beew,
   
   Thanks for your answer. I checked out mxnet sha 3143aabb6 and installed MKL version 2019.1.144, but unfortunately I still cannot reproduce the issue.
   I could not install MKL 2018.2.199, the version you use, because it does not support Ubuntu 18.04. When I forced the installation, python3 hung during “import mxnet as mx”, but that could be a different issue from yours.
   I also noticed that the CUDA version reported for your driver, 10.1 (nvidia-smi), differs from that of the installed toolkit, 10.0.130 (cat /usr/local/cuda/version.txt).
   So could I ask you to try the following:
   1. Upgrade MKL to version 2019 or newer (you can find it here: https://registrationcenter.intel.com/en/products/download/3685/ )
   2. Upgrade CUDA to version 10.1 so that it is consistent with the driver
   (you can have several CUDA versions on the system; check which one the link /usr/local/cuda points to)
   
   Please let me know if it helps.
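
Checking what the /usr/local/cuda link points to can be sketched as follows. This is an illustration rather than anything run in the thread: the helper names are hypothetical, and it assumes the toolkit ships a version.txt file, as the CUDA 10.0 install mentioned above does.

```python
import os

def resolve_cuda_link(link="/usr/local/cuda"):
    """Return the real directory behind the CUDA symlink, or None if it is absent."""
    if not os.path.exists(link):
        return None
    return os.path.realpath(link)

def read_toolkit_version(cuda_root):
    """Return the first line of <cuda_root>/version.txt, or None if the file is missing."""
    path = os.path.join(cuda_root, "version.txt")
    if not os.path.isfile(path):
        return None
    with open(path) as handle:
        return handle.readline().strip()

root = resolve_cuda_link()
print("CUDA root:", root)
if root is not None:
    print("toolkit:", read_toolkit_version(root))
```

If several toolkits are installed side by side, the resolved path shows which one builds will pick up through /usr/local/cuda.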



[GitHub] [incubator-mxnet] beew commented on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew commented on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-687674781


   > 
   > INFO_FILE=info.txt
   > git rev-parse --verify HEAD >${INFO_FILE}
   > source /opt/intel/bin/compilervars.sh intel64
   > env >>${INFO_FILE}
   > scanelf -l -s cblas_ddot |grep dot >>${INFO_FILE}
   
   Actually, there is no /opt/intel/bin/compilervars.sh on this machine.
   Intel MKL on this machine was installed from the deb repository:
   https://software.intel.com/content/www/us/en/develop/articles/installing-intel-free-libs-and-python-apt-repo.html
   
   So in this case the libraries should already be in standard locations (e.g. /usr/lib/x86_64-linux-gnu).
   
   I appreciate the help, but it probably isn't worth the trouble. I cannot test very much on this laptop, since it took me ~7-8 hours to build MXNet from source here, and my main machine is currently not accessible. Thanks again; I will try when I get my main machine back in a few weeks.
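
When scanelf is not available, a rough substitute for the symbol lookup quoted above is to ask the dynamic loader directly. The sketch below uses Python's ctypes to test whether a shared library exports cblas_ddot. The function name is hypothetical, and the candidate library names ("blas", "openblas", "mkl_rt") are assumptions that may not match what is installed on a given system.

```python
import ctypes
import ctypes.util

def exports_symbol(library, symbol):
    """Return True if `library` loads and exports `symbol`, False otherwise."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # The library could not be found or loaded.
        return False
    # Attribute lookup on a CDLL raises AttributeError for missing symbols,
    # which hasattr() turns into False.
    return hasattr(handle, symbol)

# Candidate names are assumptions; adjust them to the libraries on your system.
for name in ("blas", "openblas", "mkl_rt"):
    path = ctypes.util.find_library(name)
    if path is None:
        print(f"{name}: not found")
    else:
        print(f"{name}: {path} exports cblas_ddot = {exports_symbol(path, 'cblas_ddot')}")
```

This only shows which installed libraries provide the symbol at run time. The "DSO missing from command line" error itself is a link-time complaint that the library providing the symbol was not listed explicitly on the linker command line.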



[GitHub] [incubator-mxnet] beew edited a comment on issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

beew edited a comment on issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879#issuecomment-670987830


   @szha 
   Hi, 
   
   1. Without source /opt/intel/bin/compilervars.sh intel64, I built with OpenBLAS instead of MKL (since the MKL root and headers can't be found), 
   and the Python GPU examples work (from here https://mxnet.apache.org/versions/1.6/get_started/validate_mxnet.html)
   
   2. With source /opt/intel/bin/compilervars.sh intel64 and a build with MKL, the GPU examples hang as reported here, regardless of whether JEMALLOC is turned on or off. If it is turned off, I get an additional warning: "Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages"
   
   So it seems that MKL somehow doesn't play well with CUDA...
   
   I haven't tried the nightly branch, as it seems that after version 1.5.1 the prebuilt MXNet Python wheels only support CUDA > 10.0. Instead I tried pip install mxnet-cu100mkl==1.5.1 and was able to create the MXNet array in the GPU context with no problem.
   
   P.S. Except for the prebuilt MXNet wheels, all other tests were done with MXNet 1.7 built from source ("git checkout -b v1.7 origin/v1.7.x").
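
To compare builds without leaving a hung interpreter behind, the GPU array creation from the issue description can be run in a child process with a time limit. This is a sketch under assumptions: the helper name and the 120-second limit are arbitrary choices, and `wait_to_read()` is added so that the asynchronous MXNet call actually completes before the child exits.

```python
import subprocess
import sys

# The snippet from the issue description, plus wait_to_read() to force the
# asynchronous MXNet engine to finish before the child process exits.
GPU_SNIPPET = "import mxnet as mx; a = mx.nd.ones((2, 3), mx.gpu()); a.wait_to_read(); print(a)"

def run_with_timeout(code, timeout_s):
    """Run `code` in a fresh interpreter; return 'ok', 'error', or 'timeout'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child when the timeout expires.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"

if __name__ == "__main__":
    # 120 seconds is an arbitrary limit; the first GPU call can be slow.
    print("gpu array creation:", run_with_timeout(GPU_SNIPPET, 120))
```

A result of "timeout" would correspond to the hang described in this issue, while "error" covers an import failure or a CUDA error that exits with a non-zero status.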
   
   



[GitHub] [incubator-mxnet] bgawrych closed issue #18879: mxnet python gpu not work

Posted by GitBox <gi...@apache.org>.

bgawrych closed issue #18879:
URL: https://github.com/apache/incubator-mxnet/issues/18879


   

