Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/05/04 09:55:17 UTC

[GitHub] dwSun opened a new issue #10809: Check failed: format != mkl_mem_->GetFormat() (5 vs. 5)

URL: https://github.com/apache/incubator-mxnet/issues/10809
 
 
   ## Description
   MXNet crashes while training a model.
   
   Using code from [this tutorial](http://mxnet.incubator.apache.org/tutorials/gluon/datasets.html), I tried to train my own model with MobileNetV2. It crashes with mxnet-mkl-1.2.0b20180503 from PyPI.
   With mxnet-mkl-1.1.0 from PyPI, the same code works.
   
   Batch sizes 32 and 16 reproduce this error; others, like 8 or 32, do not seem to. A smaller network does not reproduce it either.
   I am not sure whether this error is related to PR #10317.
   
   This may be the same error as issue #10807.
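
Since the crash depends on batch size, a small harness can make the reproduction table explicit. The sketch below is only an illustration: `sweep_batch_sizes` and `run_once` are hypothetical names, and `run_once` stands in for one short training run of `fashion.py` at a given batch size (the original script is not structured this way). It assumes the failure surfaces as a Python exception, as it does in the traceback in this report.

```python
from typing import Callable, Dict, Iterable, Optional


def sweep_batch_sizes(
    run_once: Callable[[int], None],
    batch_sizes: Iterable[int],
) -> Dict[int, Optional[str]]:
    """Run `run_once` per batch size and record the error message, if any.

    `run_once` is a hypothetical stand-in for one short training run;
    it is not part of the original fashion.py.
    """
    results: Dict[int, Optional[str]] = {}
    for batch_size in batch_sizes:
        try:
            run_once(batch_size)
            results[batch_size] = None  # finished without raising
        except Exception as exc:  # MXNetError is an Exception subclass
            results[batch_size] = str(exc)
    return results


def fake_run(batch_size: int) -> None:
    # Placeholder failure condition, for demonstration only.
    if batch_size == 16:
        raise RuntimeError(
            "Check failed: format != mkl_mem_->GetFormat() (5 vs. 5)"
        )


report = sweep_batch_sizes(fake_run, [8, 16])
for size, error in report.items():
    print(size, "ok" if error is None else "FAILED: " + error)
```

In a real run, `fake_run` would be replaced by a function that builds the data loader with the given batch size and trains for an epoch or two. If the process aborted instead of raising, each run would need its own subprocess.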
   
   ## Environment info (Required)
   The code to reproduce the crash:
   [crash.zip](https://github.com/apache/incubator-mxnet/files/1973878/crash.zip)
   Run it with:
   ```sh
   python3 fashion.py
   ```
   
   Package used (Python/R/Scala/Julia):
   ```
   % pip3 list
   Package         Version       
   --------------- --------------
   certifi         2018.4.16     
   chardet         3.0.4         
   graphviz        0.8.3         
   idna            2.6           
   mxnet-mkl       1.2.0b20180503
   numpy           1.14.3        
   pandas          0.22.0        
   pip             10.0.1        
   pkg-resources   0.0.0         
   python-dateutil 2.7.2         
   pytz            2018.4        
   requests        2.18.4        
   setuptools      39.1.0        
   six             1.11.0        
   urllib3         1.22          
   wheel           0.31.0        
   
   ```
   ## Error Message:
   ```
   % python3 fashion.py 
   [17:28:49] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 57344 bytes with malloc directly
   [17:28:49] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 4096 bytes with malloc directly
   [17:28:49] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 172032 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 57344 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 4096 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 172032 bytes with malloc directly
   Epoch 0, training loss: 2.55, validation loss: 2.31
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 57344 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 172032 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 1638400 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 1638400 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 57344 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 4096 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 172032 bytes with malloc directly
   Epoch 1, training loss: 2.56, validation loss: 2.35
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 57344 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 172032 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 1638400 bytes with malloc directly
   [17:28:50] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 1638400 bytes with malloc directly
   [17:28:51] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 57344 bytes with malloc directly
   [17:28:51] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 4096 bytes with malloc directly
   [17:28:51] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 172032 bytes with malloc directly
   Traceback (most recent call last):
     File "fashion.py", line 71, in <module>
       valid_loss = cumulative_valid_loss.asscalar()/valid_samples
     File "/home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py", line 1894, in asscalar
       return self.asnumpy()[0]
     File "/home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/ndarray/ndarray.py", line 1876, in asnumpy
       ctypes.c_size_t(data.size)))
     File "/home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/base.py", line 149, in check_call
       raise MXNetError(py_str(_LIB.MXGetLastError()))
   mxnet.base.MXNetError: [17:28:51] src/ndarray/ndarray.cc:351: Check failed: format != mkl_mem_->GetFormat() (5 vs. 5) 
   
   Stack trace returned 10 entries:
   [bt] (0) /home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x17009d) [0x7fba25e2f09d]
   [bt] (1) /home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x170468) [0x7fba25e2f468]
   [bt] (2) /home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x2a4a1b8) [0x7fba287091b8]
   [bt] (3) /home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x2a4a29e) [0x7fba2870929e]
   [bt] (4) /home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x2899644) [0x7fba28558644]
   [bt] (5) /home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x289d151) [0x7fba2855c151]
   [bt] (6) /home/david/.virtualenvs/mkl-dnn/local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x2899d0b) [0x7fba28558d0b]
   [bt] (7) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbbc90) [0x7fba1ba04c90]
   [bt] (8) /lib/x86_64-linux-gnu/libpthread.so.0(+0x75aa) [0x7fba37df35aa]
   [bt] (9) /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fba36f3ecbf]
   
   ```
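
The failing line appears to follow the usual `Check failed: <expression> (<lhs> vs. <rhs>)` pattern, so here the check expected the two formats to differ and both printed as 5. As a rough aid for triaging logs like this one, the sketch below pulls those parts out of the message. `CheckFailure` and `parse_check_failure` are hypothetical helper names, and the regular expression is an assumption fitted to the single message shown in this report, not a general parser for MXNet logs.

```python
import re
from typing import NamedTuple, Optional


class CheckFailure(NamedTuple):
    source: str      # file:line that raised the check
    expression: str  # condition that was expected to hold
    lhs: str         # left operand value as printed
    rhs: str         # right operand value as printed


# Assumed layout: "<file>:<line>: Check failed: <expr> (<lhs> vs. <rhs>)"
_PATTERN = re.compile(
    r"(?P<source>\S+:\d+): Check failed: "
    r"(?P<expr>.+?) \((?P<lhs>\S+) vs\. (?P<rhs>\S+)\)"
)


def parse_check_failure(message: str) -> Optional[CheckFailure]:
    """Return the parsed check failure, or None if the pattern is absent."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return CheckFailure(
        match["source"], match["expr"], match["lhs"], match["rhs"]
    )


line = (
    "mxnet.base.MXNetError: [17:28:51] src/ndarray/ndarray.cc:351: "
    "Check failed: format != mkl_mem_->GetFormat() (5 vs. 5) "
)
failure = parse_check_failure(line)
print(failure)
```

Applied to the message above, this reports `src/ndarray/ndarray.cc:351` as the source and `5` for both operands.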
   
