You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/17 23:16:23 UTC

[GitHub] [incubator-mxnet] apeforest opened a new issue #17366: Coredump running unit test with MKL Blas

apeforest opened a new issue #17366: Coredump running unit test with MKL Blas
URL: https://github.com/apache/incubator-mxnet/issues/17366
 
 
   ## Description
   I build mxnet with MKL blas and MKLDNN. The build was successful, however, when I ran unit test, I got the following core dump. Note: I am using master branch so there is no local changes from me. 
   ### Error Message
   test_mkldnn.test_convolution ... [23:12:00] ../src/executor/graph_executor.cc:2062: Subgraph backend MKLDNN is activated.
   [23:12:00] ../src/executor/../operator/../common/utils.h:472:
   Storage type fallback detected:
   operator = Convolution
   input storage types = [row_sparse, row_sparse, row_sparse, ]
   output storage types = [default, ]
   params = {"num_filter" : 4, "kernel" : (3,), "stride" : 2, }
   context.dev_mask = cpu
   The operator with default storage type will be dispatched for execution. You're seeing this warning message because the operator above is unable to process the given ndarrays with specified storage types, context and parameter. Temporary dense ndarrays are generated in order to execute the operator. This does not affect the correctness of the programme. You can set environment variable MXNET_STORAGE_FALLBACK_LOG_VERBOSE to 0 to suppress this warning.
   [23:12:00] ../src/executor/../operator/../common/utils.h:472:
   Storage type fallback detected:
   operator = _backward_Convolution
   input storage types = [default, row_sparse, row_sparse, row_sparse, ]
   output storage types = [default, default, default, ]
   params = {"num_filter" : 4, "kernel" : (3,), "stride" : 2, }
   context.dev_mask = cpu
   The operator with default storage type will be dispatched for execution. You're seeing this warning message because the operator above is unable to process the given ndarrays with specified storage types, context and parameter. Temporary dense ndarrays are generated in order to execute the operator. This does not affect the correctness of the programme. You can set environment variable MXNET_STORAGE_FALLBACK_LOG_VERBOSE to 0 to suppress this warning.
   OMP: Error #15: Initializing libiomp5.so, but found libomp.so already initialized.
   OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
   Aborted (core dumped)
   
   ## To Reproduce
   1.
   ```
   rm -rf build
   mkdir -p build && cd build
   cmake -GNinja \
       -DUSE_CUDA=OFF \
       -DUSE_MKL_IF_AVAILABLE=ON \
       -DCMAKE_BUILD_TYPE=Release \
   ..
   ninja
   ```
   2.
   ```
   cd ..
   pip install -e python
   ```
   3.
   ```
   nosetests -v tests/python/mkl
   ```
   
   ## Environment
   
   We recommend using our script for collecting the diagnositc information. Run the following command and paste the outputs below:
   ```
   
   ----------Python Info----------
   Version      : 3.6.6
   Compiler     : GCC 7.2.0
   Build        : ('default', 'Jun 28 2018 17:14:51')
   Arch         : ('64bit', '')
   ------------Pip Info-----------
   Version      : 19.3.1
   Directory    : /home/ubuntu/.virtualenvs/mxnet/lib/python3.6/site-packages/pip
   ----------MXNet Info-----------
   Version      : 1.6.0
   Directory    : /home/ubuntu/src/mxnet/python/mxnet
   Num GPUs     : 0
   Hashtag not found. Not installed from pre-built package.
   ----------System Info----------
   Platform     : Linux-4.4.0-1100-aws-x86_64-with-debian-stretch-sid
   system       : Linux
   node         : ip-172-31-45-221
   release      : 4.4.0-1100-aws
   version      : #111-Ubuntu SMP Wed Dec 4 12:20:15 UTC 2019
   ----------Hardware Info----------
   machine      : x86_64
   processor    : x86_64
   Architecture:          x86_64
   CPU op-mode(s):        32-bit, 64-bit
   Byte Order:            Little Endian
   CPU(s):                36
   On-line CPU(s) list:   0-35
   Thread(s) per core:    2
   Core(s) per socket:    18
   Socket(s):             1
   NUMA node(s):          1
   Vendor ID:             GenuineIntel
   CPU family:            6
   Model:                 85
   Model name:            Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
   Stepping:              4
   CPU MHz:               3000.000
   BogoMIPS:              6000.00
   Hypervisor vendor:     KVM
   Virtualization type:   full
   L1d cache:             32K
   L1i cache:             32K
   L2 cache:              1024K
   L3 cache:              25344K
   NUMA node0 CPU(s):     0-35
   Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc aperfmperf tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f rdseed adx smap clflushopt clwb avx512cd xsaveopt xsavec xgetbv1 ida arat pku
   ----------Network Test----------
   Setting timeout: 10
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0014 sec, LOAD: 0.5697 sec.
   Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0005 sec, LOAD: 0.7626 sec.
   Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 0.1701 sec, LOAD: 0.6120 sec.
   Timing for D2L: http://d2l.ai, DNS: 0.0159 sec, LOAD: 0.0476 sec.
   Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0120 sec, LOAD: 0.2423 sec.
   Timing for FashionMNIST: https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0208 sec, LOAD: 0.5693 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0012 sec, LOAD: 0.4202 sec.
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0092 sec, LOAD: 0.0551 sec.
   ```
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on issue #17366: Coredump running unit test with MKL Blas

Posted by GitBox <gi...@apache.org>.
apeforest commented on issue #17366: Coredump running unit test with MKL Blas
URL: https://github.com/apache/incubator-mxnet/issues/17366#issuecomment-575833134
 
 
   @PatricZhao @TaoLv I will appreciate if you guys could provide input to this problem. I am working on a PR https://github.com/apache/incubator-mxnet/pull/16735 recently that requires me to build with MKL blas and MKLDNN. However, I found it not a good experience using MKL. Please see my other issue opened: https://github.com/apache/incubator-mxnet/issues/13881.
   
   Since our man page encourages users to build mxnet with MKL blas, it's better to provide a seamless solution to them.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest commented on issue #17366: Coredump running unit test with MKL Blas

Posted by GitBox <gi...@apache.org>.
apeforest commented on issue #17366: Coredump running unit test with MKL Blas
URL: https://github.com/apache/incubator-mxnet/issues/17366#issuecomment-584486048
 
 
   export KMP_DUPLICATE_LIB_OK=TRUE works. Closing this issue.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest edited a comment on issue #17366: Coredump running unit test with MKL Blas

Posted by GitBox <gi...@apache.org>.
apeforest edited a comment on issue #17366: Coredump running unit test with MKL Blas
URL: https://github.com/apache/incubator-mxnet/issues/17366#issuecomment-575833134
 
 
   @PatricZhao @TaoLv I will appreciate if you guys could provide input to this problem. I am working on a PR https://github.com/apache/incubator-mxnet/pull/16735 recently that requires me to build with MKL blas and MKLDNN. However, I found it not a good experience using MKL. Please see my other issue opened: https://github.com/apache/incubator-mxnet/issues/13881.
   
   Since our man page encourages users to build mxnet with MKL blas, it would be better if they could use this feature seamlessly.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] TaoLv commented on issue #17366: Coredump running unit test with MKL Blas

Posted by GitBox <gi...@apache.org>.
TaoLv commented on issue #17366: Coredump running unit test with MKL Blas
URL: https://github.com/apache/incubator-mxnet/issues/17366#issuecomment-575897376
 
 
   Have you ever tried the suggestion in the error message to `export KMP_DUPLICATE_LIB_OK=TRUE` before running the test? This is also the interoperability problem of different omp runtimes I mentioned in https://github.com/apache/incubator-mxnet/issues/16891.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] apeforest closed issue #17366: Coredump running unit test with MKL Blas

Posted by GitBox <gi...@apache.org>.
apeforest closed issue #17366: Coredump running unit test with MKL Blas
URL: https://github.com/apache/incubator-mxnet/issues/17366
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services