Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/07/09 14:53:33 UTC
[GitHub] [incubator-mxnet] maybeLee commented on issue #20416: MXNetError: Check failed: assign(&dattr, vec.at(i)): Incompatible attr in node at 0-th output: expected [256], got [1]
maybeLee commented on issue #20416:
URL: https://github.com/apache/incubator-mxnet/issues/20416#issuecomment-877246558
Hi, to make this bug easier to reproduce and understand, I simplified the triggering model into a very simple three-layer randomly generated model.
The model contains three layers: one softmax, one max pooling, and one batch normalization. I create these models using Keras, and the weights of the batch normalization layer are randomly generated.
You can reproduce the bug by running the following code with MXNet version 1.8.0; _you don't need to use any of the other trained models:_
```python
import os
import argparse
import sys
import warnings

# Select the Keras backend before importing keras
parse = argparse.ArgumentParser()
parse.add_argument("--bk", type=str, default="mxnet", help="the name of the backend")
flags, _ = parse.parse_known_args(sys.argv[1:])
os.environ["KERAS_BACKEND"] = flags.bk

import keras
from keras import layers
import numpy as np

warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=UserWarning)

# Three-layer model: softmax -> max pooling -> batch normalization
model_1 = keras.models.Sequential()
model_1.add(layers.Softmax())
model_1.add(layers.MaxPooling2D())
model_1.add(layers.BatchNormalization())

x = np.random.rand(1, 3, 3, 256)
pred = model_1.predict(x)
print(pred)
```
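For reference, the expected shape at each layer can be worked out by hand (a minimal sketch; the pooling arithmetic assumes Keras's `MaxPooling2D` defaults of `pool_size=(2, 2)`, `strides=(2, 2)`, and `padding='valid'`):

```python
def maxpool2d_out(h, w, pool=2, stride=2):
    """Output spatial dims for 'valid' max pooling (Keras defaults)."""
    return (h - pool) // stride + 1, (w - pool) // stride + 1

# Input to the toy model: (batch=1, height=3, width=3, channels=256)
n, h, w, c = 1, 3, 3, 256

# Softmax is elementwise over the last axis: shape unchanged -> (1, 3, 3, 256)
# MaxPooling2D with a 2x2 pool and stride 2: floor((3 - 2) / 2) + 1 = 1 per spatial dim
ph, pw = maxpool2d_out(h, w)
print((n, ph, pw, c))  # (1, 1, 1, 256)

# BatchNormalization (channels_last) keeps the shape (1, 1, 1, 256); its
# gamma/beta/moving-mean/moving-variance parameters each have shape (256,),
# matching the "[256]" in the error message below.
```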
Assuming you saved the above toy program as `try.py`, run the following command:
- `python try.py --bk mxnet`
You will hit a crash with the same symptom I mentioned before:
```
Traceback (most recent call last):
  File "try.py", line 23, in <module>
    pred = model_1.predict(x)
  File "/root/anaconda3/envs/diffcu_mxnet/lib/python3.6/site-packages/keras/engine/training.py", line 1184, in predict
    steps=steps)
  File "/root/anaconda3/envs/diffcu_mxnet/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 295, in predict_loop
    batch_outs = f(ins_batch)
  File "/root/anaconda3/envs/diffcu_mxnet/lib/python3.6/site-packages/keras/backend/mxnet_backend.py", line 5645, in predict_function
    data, label, _, data_shapes, label_shapes = self._adjust_module(inputs, 'pred')
  File "/root/anaconda3/envs/diffcu_mxnet/lib/python3.6/site-packages/keras/backend/mxnet_backend.py", line 5525, in _adjust_module
    self._set_weights()
  File "/root/anaconda3/envs/diffcu_mxnet/lib/python3.6/site-packages/keras/backend/mxnet_backend.py", line 5573, in _set_weights
    allow_missing=True)
  File "/mxnet/incubator-mxnet/python/mxnet/module/bucketing_module.py", line 220, in set_params
    force_init=force_init, allow_extra=allow_extra)
  File "/mxnet/incubator-mxnet/python/mxnet/module/module.py", line 358, in set_params
    self._exec_group.set_params(arg_params, aux_params, allow_extra=allow_extra)
  File "/mxnet/incubator-mxnet/python/mxnet/module/executor_group.py", line 422, in set_params
    exec_.copy_params_from(arg_params, aux_params, allow_extra_params=allow_extra)
  File "/mxnet/incubator-mxnet/python/mxnet/executor.py", line 367, in copy_params_from
    array.astype(dst.dtype).copyto(dst)
  File "/mxnet/incubator-mxnet/python/mxnet/ndarray/ndarray.py", line 2663, in copyto
    return _internal._copyto(self, out=other)
  File "<string>", line 27, in _copyto
  File "/mxnet/incubator-mxnet/python/mxnet/_ctypes/ndarray.py", line 91, in _imperative_invoke
    ctypes.byref(out_stypes)))
  File "/mxnet/incubator-mxnet/python/mxnet/base.py", line 246, in check_call
    raise get_last_ffi_error()
mxnet.base.MXNetError: Traceback (most recent call last):
  File "src/operator/numpy/linalg/./../../tensor/../elemwise_op_common.h", line 135
MXNetError: Check failed: assign(&dattr, vec.at(i)): Incompatible attr in node at 0-th output: expected [256], got [1]
```
But if you run the same program with CNTK as the backend (`python try.py --bk cntk`), everything works fine.
One interesting observation: if I delete any one of the three layers (`softmax`, `batch normalization`, or `max pooling`), no crash happens.
Further, after some investigation, I suspect this bug is caused by wrong shape inference in MXNet.
When I change the input shape to `x = np.random.rand(1, 3, 3, 1)`, `x = np.random.rand(1, 8, 8, 4)`, or `x = np.random.rand(1, 5, 3, 1)`, everything works fine and MXNet does not crash.
**But if I set the input shape to `x = np.random.rand(1, 3, 3, 10)`, where the `-1`-th dimension does not match the `-2`-th dimension after max pooling, MXNet crashes and reports the check-failed error above.**
Therefore, I assume that some code logic inside the `elemwise_op_common.h` file expects the `-1`-th dimension to be consistent with the `-2`-th dimension.
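The pass/fail pattern above can be encoded as a quick sanity check (a sketch of my hypothesis only, not MXNet's actual inference logic; `predicts_crash` is a hypothetical helper, and the pooling arithmetic assumes Keras's default `MaxPooling2D` settings):

```python
def pooled_shape(shape, pool=2, stride=2):
    """Shape after Keras MaxPooling2D (channels_last, 'valid' padding, defaults)."""
    n, h, w, c = shape
    return (n, (h - pool) // stride + 1, (w - pool) // stride + 1, c)

def predicts_crash(shape):
    """Hypothesis: the crash occurs exactly when, after pooling, the -1-th
    (channel) dimension differs from the -2-th (width) dimension."""
    out = pooled_shape(shape)
    return out[-1] != out[-2]

# Cases observed above:
for shape, crashes in [((1, 3, 3, 256), True),   # reported crash
                       ((1, 3, 3, 1),  False),   # works
                       ((1, 8, 8, 4),  False),   # works
                       ((1, 5, 3, 1),  False),   # works
                       ((1, 3, 3, 10), True)]:   # crashes
    assert predicts_crash(shape) == crashes
print("hypothesis matches all observed cases")
```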
Can you help check whether this is a real problem, and what the root cause of the issue is?
Many thanks for your help.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org