Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/05/01 16:40:52 UTC
[GitHub] ThomasDelteil opened a new issue #10766: Bug: Cannot save/load params with Gluon model
URL: https://github.com/apache/incubator-mxnet/issues/10766
See the issue reported here:
https://discuss.mxnet.io/t/problem-when-loading-saved-params/982/11
```
self.model.load_params(self.model_path, ctx=ctx)
  File "/home/name/virtualEnv/local/lib/python2.7/site-packages/mxnet/gluon/block.py", line 317, in load_params
    self.prefix)
  File "/home/name/virtualEnv/local/lib/python2.7/site-packages/mxnet/gluon/parameter.py", line 676, in load
    self[name]._load_init(arg_dict[name], ctx)
  File "/home/name/virtualEnv/local/lib/python2.7/site-packages/mxnet/gluon/parameter.py", line 209, in _load_init
    assert set(ctx) == set(self._deferred_init[1]),
IndexError: tuple index out of range
```
Code [here](https://github.com/ZhengzheYang/Reading-Comprehension/blob/19c591c767f7022730e8ec3aee6502f91f613266/model.py#L232)
To reproduce the bug:
```python
import mxnet as mx
from mxnet import nd

# RCv1 is the model class defined in the repository linked above
embedding_size = 400
vocab_size = 100
batch_size = 1
ctx = mx.gpu()
# Create first model
test = RCv1(nd.ones((vocab_size, embedding_size)), vocab_size=vocab_size, batch_size=batch_size)
# Initialize params
test.model.collect_params().initialize(mx.init.Xavier(magnitude=2.24, rnd_type='gaussian'), ctx=ctx)
# Run one batch
print('Model 1', test.model(nd.ones((batch_size, 25, 94, embedding_size), ctx), nd.ones((batch_size, 126), ctx)))
# Save params
test.model.save_params('test.params')
# Create second model
test2 = RCv1(nd.ones((vocab_size, embedding_size)), vocab_size=vocab_size, batch_size=batch_size)
# Load params -- raises IndexError: tuple index out of range
test2.model.load_params('test.params')
```
I was able to reproduce this. Initializing the parameters of the newly created network before loading the saved parameters worked around the issue. I couldn't figure out exactly what is wrong; it may be related to the `embedding` layer being shared between the LSTM and the CNN.
@piiswrong any ideas?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services