Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/04/15 22:58:35 UTC

[GitHub] [incubator-mxnet] leezu opened a new issue #18077: Parameter fusion support in Gluon

URL: https://github.com/apache/incubator-mxnet/issues/18077
 
 
   ## Description
   It's common that the parameters declared by a Block in Gluon don't exactly match the format expected by the operators in the backend. Thus we have examples where some parameters are concatenated on every forward pass (a simplified sketch follows the list):
   - *RNN*
     https://github.com/apache/incubator-mxnet/blob/c3b0baaa27e2215eae7ed7676009ea5f4bf49013/python/mxnet/gluon/rnn/rnn_layer.py#L278
   - *BERT*
     https://github.com/dmlc/gluon-nlp/pull/1136#discussion_r377480471
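
   To make the cost concrete, here is a minimal, self-contained sketch of the status-quo pattern. The names and shapes are made up (this is not the actual `rnn_layer.py` code): the unfused arrays are flattened and concatenated into the single flat blob that the fused backend kernel expects, and this work is repeated on every call.

   ```python
   import mxnet as mx

   # Unfused parameters (hypothetical toy shapes), kept separate so that e.g.
   # weights and biases can use different initializers.
   i2h_weight = mx.nd.random.normal(shape=(12, 8))
   h2h_weight = mx.nd.random.normal(shape=(12, 4))
   i2h_bias = mx.nd.zeros((12,))
   h2h_bias = mx.nd.zeros((12,))

   def concat_on_forward():
       # Re-executed on every forward pass, even though the layout never changes.
       return mx.nd.concat(*(p.reshape(-1) for p in
                             (i2h_weight, h2h_weight, i2h_bias, h2h_bias)), dim=0)

   fused = concat_on_forward()
   print(fused.shape)  # single flat vector handed to the fused backend kernel
   ```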
   
   A naive approach is to refactor the respective Gluon Blocks to directly declare the concatenated version of the parameters. This does not work in all cases, as we wish to initialize different parameters differently. For example, RNN biases should be initialized differently from RNN weights.
   
   The status quo, where in such cases concatenation/fusion has to happen on every forward pass, is not acceptable either.
   
   Proposed solution: Introduce `Block.fuse()` and `Block.unfuse()` APIs. By default, they are no-ops. Users can override `fuse` and `unfuse` to declare how to fuse the Block's parameters into a new set of parameters (or a single parameter). `fuse` is called prior to the first `forward`, after `infer_shape`.
   `export` will require fused parameters. Prior to `save_parameters` or `load_parameters`, the Block is unfused.
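
   A rough sketch of what overriding the proposed hooks might look like. Everything here is hypothetical: the class and parameter names are made up, and current Gluon does not call `fuse`/`unfuse` automatically; the sketch only illustrates the intended contract.

   ```python
   import mxnet as mx
   from mxnet import gluon

   class FusedBiasBlock(gluon.Block):
       def __init__(self, num_hidden):
           super().__init__()
           self.num_hidden = num_hidden
           # Unfused parameters: each can use its own initializer.
           self.i2h_bias = gluon.Parameter('i2h_bias', shape=(num_hidden,),
                                           init=mx.init.Zero())
           self.h2h_bias = gluon.Parameter('h2h_bias', shape=(num_hidden,),
                                           init=mx.init.One())
           # Fused parameter: what the backend operator (and the optimizer) sees.
           self.fused_bias = gluon.Parameter('fused_bias', shape=(2 * num_hidden,))

       def fuse(self):
           # Proposed hook, called once before the first forward: copy the
           # individually initialized unfused values into the fused buffer.
           self.fused_bias.set_data(
               mx.nd.concat(self.i2h_bias.data(), self.h2h_bias.data(), dim=0))

       def unfuse(self):
           # Proposed hook, called before save_parameters()/load_parameters():
           # split the fused buffer back into the unfused parameters.
           fused = self.fused_bias.data()
           self.i2h_bias.set_data(fused[:self.num_hidden])
           self.h2h_bias.set_data(fused[self.num_hidden:])

       def forward(self, x):
           # The operator consumes only the fused parameter.
           return x + self.fused_bias.data()
   ```

   Under the proposed semantics, something like `net.initialize(); net.fuse()` would effectively happen before the first `forward`, and `net.unfuse()` before `save_parameters`/`load_parameters`; in today's Gluon these calls would have to be made by hand.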


[GitHub] [incubator-mxnet] zixuanweeei commented on issue #18077: Parameter fusion support in Gluon

URL: https://github.com/apache/incubator-mxnet/issues/18077#issuecomment-616012849
 
 
   > What about optimizer? Would optimizers see fused or unfused params?
   
   From the view of the RNN operator, I think the optimizers will see the fused parameters. In both the forward and backward scenarios, only the fused parameter exists in the arguments dict, and we can use `Block.unfuse()` to overwrite the values of the unfused parameters.
   
   Both the `RNN` and `_backward_RNN` operators receive an NDArray holder for the fused parameter:
   + RNN (line 412)
   https://github.com/apache/incubator-mxnet/blob/dcada9b9c145d2e93d51790d234f0a2ddc7091df/src/operator/rnn.cc#L411-L416
   + _backward_RNN (line 213)
   https://github.com/apache/incubator-mxnet/blob/dcada9b9c145d2e93d51790d234f0a2ddc7091df/src/operator/rnn.cc#L207-L214
   
   If a model uses [`mx.rnn.FusedRNNCell`](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/rnn/rnn_cell.py#L535), the optimizer will apply the gradients to the fused parameter directly. But that's not true for the RNN layers [`mx.gluon.rnn.***`](https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/rnn/rnn_layer.py#L33): they always have a `_rnn_param_concat` operator prior to the fused parameter, so the optimizer or backward pass just delivers the fused gradients to the unfused parameters individually. There are several memcpy operations behind this.
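
   (A minimal, self-contained illustration of this gradient path, using a plain `concat` and toy shapes as a stand-in for `_rnn_param_concat`:)

   ```python
   import mxnet as mx

   i2h = mx.nd.ones((4,))
   h2h = mx.nd.ones((4,))
   i2h.attach_grad()
   h2h.attach_grad()
   with mx.autograd.record():
       fused = mx.nd.concat(i2h, h2h, dim=0)  # fused view fed to the RNN op
       loss = (fused * fused).sum()
   loss.backward()
   # Backward slices the fused gradient back into the unfused gradients
   # (the memcpy operations mentioned above).
   print(i2h.grad, h2h.grad)
   ```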
   
   That said, when recording the gradients w.r.t. a specific unfused parameter, the problem does arise. In any case, it's a very helpful feature for forward-pass performance.


[GitHub] [incubator-mxnet] eric-haibin-lin commented on issue #18077: Parameter fusion support in Gluon

URL: https://github.com/apache/incubator-mxnet/issues/18077#issuecomment-616030507
 
 
   Some optimizers, such as LARS or LAMB, may require global information and involve reduction ops. cc @szhengac


[GitHub] [incubator-mxnet] leezu commented on issue #18077: Parameter fusion support in Gluon

URL: https://github.com/apache/incubator-mxnet/issues/18077#issuecomment-616013055
 
 
   > What about optimizer? Would optimizers see fused or unfused params?
   
   The optimizer would see only fused parameters. In general, you can consider all existing parameters as "fused". Unfused parameters are a new "type" of parameter that is not returned by `collect_params()`, and we ensure in the Python frontend that users implement `fuse_parameters` and `unfuse_parameters` functions containing whatever logic is needed to combine the set of unfused parameters into a set of fused parameters. Both fused and unfused parameters need to be declared inside `__init__`.
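
   (For illustration only, under the draft semantics described above and with a made-up parameter name: only the fused parameter would be handed to the `Trainer`/optimizer.)

   ```python
   import mxnet as mx

   fused = mx.gluon.Parameter('fused_weight', shape=(8,))
   fused.initialize()
   # Only the fused parameter is passed to the Trainer; the unfused parameters
   # would live on the Block but not appear in collect_params() under the draft.
   trainer = mx.gluon.Trainer([fused], 'sgd', {'learning_rate': 0.1})
   # unfuse_parameters would later copy the trained fused values back into the
   # unfused parameters, e.g. before save_parameters().
   ```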


[GitHub] [incubator-mxnet] leezu commented on issue #18077: Parameter fusion support in Gluon

URL: https://github.com/apache/incubator-mxnet/issues/18077#issuecomment-616042128
 
 
   Cases that need to expose the unfused parameters to the optimizer wouldn't be supported by the proposed Gluon fusion API; they would require more extensive changes to the backend to support views on the fused array. If that's required, we can reconsider the plan. The fused parameter can be of any shape though, which should facilitate reduction ops etc.
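
   (Hypothetical illustration of the last point: with one flat fused parameter, a layer-wise norm of the kind LARS/LAMB-style optimizers compute is a single reduction over one array rather than several reductions over the unfused pieces.)

   ```python
   import mxnet as mx

   fused = mx.nd.random.normal(shape=(1024,))  # toy fused parameter
   norm = mx.nd.norm(fused)                    # one reduction over the fused array
   print(norm.asscalar())
   ```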


[GitHub] [incubator-mxnet] szhengac commented on issue #18077: Parameter fusion support in Gluon

URL: https://github.com/apache/incubator-mxnet/issues/18077#issuecomment-616068055
 
 
   That's not needed for either LAMB or LARS. At the moment, the new API allows a single step function to receive all the parameters, so that global information can be obtained. It is only needed in other optimizers such as L-BFGS. The fused parameter can help reduce the number of API calls and potentially accelerate the computation. But it is not urgent now, as we haven't implemented an L-BFGS-like method.
   


[GitHub] [incubator-mxnet] eric-haibin-lin commented on issue #18077: Parameter fusion support in Gluon

URL: https://github.com/apache/incubator-mxnet/issues/18077#issuecomment-616000988
 
 
   What about optimizer? Would optimizers see fused or unfused params? 


[GitHub] [incubator-mxnet] leezu edited a comment on issue #18077: Parameter fusion support in Gluon

URL: https://github.com/apache/incubator-mxnet/issues/18077#issuecomment-616013055
 
 
   > What about optimizer? Would optimizers see fused or unfused params?
   
   My current draft implementation is as follows: the optimizer would see only fused parameters. In general, you can consider all existing parameters as "fused". Unfused parameters are a new "type" of parameter that is not returned by `collect_params()`, and we ensure in the Python frontend that users implement `fuse_parameters` and `unfuse_parameters` functions containing whatever logic is needed to combine the set of unfused parameters into a set of fused parameters. Both fused and unfused parameters need to be declared inside `__init__`.
