Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/30 12:30:37 UTC
[GitHub] [incubator-mxnet] RuRo edited a comment on issue #14373: Passing parameters to HybridBlocks and not using them
URL: https://github.com/apache/incubator-mxnet/issues/14373#issuecomment-580229041
There is a similar problem when there are unused parameters.
For example, you can have a model like this:
```python
class Test(mx.gluon.nn.HybridBlock):
    def __init__(self, mode, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.mode = mode
        with self.name_scope():
            self.d1 = mx.gluon.nn.Dense(2)
            self.d2 = mx.gluon.nn.Dense(3)

    def hybrid_forward(self, F, x, *args, **kwargs):
        o1 = self.d1(x)
        o2 = self.d2(x)
        if self.mode:
            return o1  # the o2 output path is not used
        else:
            return o1, o2
```
Currently, this model will not hybridize successfully when `mode == True`, because the weights along the `o2` path are "unused".
<details>
<summary>Full traceback</summary>
```python
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py:694: UserWarning: Parameter test4_dense1_weight, test4_dense1_bias is not used by any computation. Is this intended?
out = self.forward(*args)
---------------------------------------------------------------------------
DeferredInitializationError Traceback (most recent call last)
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in _call_cached_op(self, *args)
1012 try:
-> 1013 cargs = [args_without_none[i] if is_arg else i.data()
1014 for is_arg, i in self._cached_op_args]
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in <listcomp>(.0)
1012 try:
-> 1013 cargs = [args_without_none[i] if is_arg else i.data()
1014 for is_arg, i in self._cached_op_args]
/usr/lib/python3.8/site-packages/mxnet/gluon/parameter.py in data(self, ctx)
564 "instead." % (self.name, str(ctx), self._stype))
--> 565 return self._check_and_get(self._data, ctx)
566
/usr/lib/python3.8/site-packages/mxnet/gluon/parameter.py in _check_and_get(self, arr_list, ctx)
230 if self._deferred_init:
--> 231 raise DeferredInitializationError(
232 "Parameter '%s' has not been initialized yet because initialization was " \
DeferredInitializationError: Parameter 'test4_dense0_weight' has not been initialized yet because initialization was deferred. Actual initialization happens during the first forward pass. Please pass one batch of data through the network before accessing Parameters. You can also avoid deferred initialization by specifying in_units, num_features, etc., for network layers.
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in _deferred_infer_shape(self, *args)
973 try:
--> 974 self.infer_shape(*args)
975 except Exception as e:
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in infer_shape(self, *args)
1074 """Infers shape of Parameters from inputs."""
-> 1075 self._infer_attrs('infer_shape', 'shape', *args)
1076
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in _infer_attrs(self, infer_fn, attr, *args)
1070 for i in self.collect_params().values():
-> 1071 setattr(i, attr, sdict[i.name])
1072
KeyError: 'test4_dense1_weight'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-48-a18f0aa96b25> in <module>
----> 1 t(mx.nd.array([10]))
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in __call__(self, *args)
692 hook(self, args)
693
--> 694 out = self.forward(*args)
695
696 for hook in self._forward_hooks.values():
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in forward(self, x, *args)
1150 'Find all contexts = {}'.format(ctx_set))
1151 with ctx:
-> 1152 return self._call_cached_op(x, *args)
1153 with ctx:
1154 try:
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in _call_cached_op(self, *args)
1014 for is_arg, i in self._cached_op_args]
1015 except DeferredInitializationError:
-> 1016 self._deferred_infer_shape(*args)
1017 cargs = []
1018 for is_arg, i in self._cached_op_args:
/usr/lib/python3.8/site-packages/mxnet/gluon/block.py in _deferred_infer_shape(self, *args)
976 error_msg = "Deferred initialization failed because shape"\
977 " cannot be inferred. {}".format(e)
--> 978 raise ValueError(error_msg)
979
980 def _call_cached_op(self, *args):
ValueError: Deferred initialization failed because shape cannot be inferred. 'test4_dense1_weight'
```
</details>
Having unused parameters is useful, since you might want your pretraining/finetuning/evaluation networks to behave differently while remaining compatible with `.save_parameters` and `.load_parameters`, without needing `allow_missing` and `ignore_extra`.
I think this issue could be fixed without changing the inner workings too much by adding an `F.nodiscard(o2)` operator. It would be a no-op in `nd` mode and would somehow mark the output as a required computation in `sym` mode. I'm not sure how feasible something like that is.
My current workaround is something like:
```python
return F.broadcast_add(o1, F.sum(0.0 * o2))  # forces the unused o2 path into the graph
```
which is both really ugly and potentially inefficient, since it forces the unneeded computation.
If the `F.nodiscard` option is too hard to implement, something like
```python
o1 = F.depends_on(o1, o2)
```
could also work. It would basically be the same as `F.broadcast_add(o1, F.sum(0.0 * o2))`, but without performing any actual computation.