Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/05/30 17:39:47 UTC

[GitHub] [incubator-mxnet] RuRo opened a new issue #15103: [Feature Request] Gluon SymbolBlock structured imports.

URL: https://github.com/apache/incubator-mxnet/issues/15103
 
 
   As far as I am aware, gluon provides only two ways to save/load models. Please correct me if I am wrong.
   
   1) using `save/load_parameters`
   
   The problem with this option is that it doesn't save the model architecture, so you have to construct the model before loading the parameters. You have to know **exactly** what the model was, which is really inconvenient.
   
   For example, if you are trying to do transfer learning, you have to know the exact number of classes the model was pretrained on and all the hyperparameters used during pretraining, which is really inelegant and annoying.
   
   Furthermore, when releasing to production, this means that you need to actually build the whole model and then load the parameters, instead of just calling `imports`. This essentially rules out this option unless you can ship the whole training code to production.
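   
   A minimal sketch of that round trip (hypothetical layer sizes, purely to illustrate that the loading side has to repeat the exact architecture by hand):
   
   ```python
   from mxnet import gluon, ndarray
   
   # build, initialize and run the model once, then save only the weights
   net = gluon.nn.HybridSequential()
   with net.name_scope():
       net.add(gluon.nn.Dense(128, activation='relu'))
       net.add(gluon.nn.Dense(10))       # e.g. 10 pretraining classes
   net.initialize()
   net(ndarray.zeros((1, 20)))           # forward pass so shapes get inferred
   net.save_parameters('net.params')
   
   # loading side: the exact same architecture has to be rebuilt by hand
   # before the parameters can be loaded back
   net2 = gluon.nn.HybridSequential()
   with net2.name_scope():
       net2.add(gluon.nn.Dense(128, activation='relu'))
       net2.add(gluon.nn.Dense(10))      # must match the pretrained layout exactly
   net2.load_parameters('net.params')
   ```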
   
   2) using `export/imports`
   
   This is a much saner option for exporting your model for inference. But it is (as far as I can tell) completely useless for experiments/training/transfer learning. That is because `export/imports` doesn't actually preserve the Block hierarchy and instead squashes the whole model into one block.
   
   For example:
   
   ```python
   from mxnet import gluon, ndarray
   M = gluon.nn.HybridSequential()
   with M.name_scope():
       M.add(gluon.nn.Dense(10))
       M.add(gluon.nn.Dense(1))
   M.initialize()
   M.hybridize()
   M(ndarray.arange(10))  # forward pass so the deferred shapes are inferred
   ```
   
   At this point, `M` is a `HybridSequential` block with the following structure:
   ```
   HybridSequential(
     (0): Dense(1 -> 10, linear)
     (1): Dense(10 -> 1, linear)
   )
   ```
   
   And I can access the various children by indexing (or by name, if it's not a sequential block).
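   
   For instance (a quick check, reusing the `M` built above):
   
   ```python
   print(M[0])                       # Dense(1 -> 10, linear)
   print(M[0].weight.data().shape)   # (10, 1)
   ```
   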
   However, after I do
   
   ```python
   M.export('model')
   M = gluon.nn.SymbolBlock.imports('model-symbol.json', ['data'], 'model-0000.params')
   ```
   
   `M` is a `SymbolBlock` without any internal structure, so if I wanted to access only the first dense layer, I couldn't:
   ```
   SymbolBlock(
   
   )
   ```
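   
   There is no way to get at the first dense layer any more. A quick check (poking at the internal `_children` attribute, so this relies on an implementation detail, not a public API) comes back empty:
   
   ```python
   print(list(M._children))                  # [] -- the hierarchy is gone
   print(sorted(M.collect_params().keys()))  # just a flat list of parameter names
   ```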
   
   ---
   
   In the end, I am left with two options:
   
   1) rebuild the model from scratch every time I want to load it (which is a complete nightmare if your model could change while you are experimenting with the architecture, and which requires you to ship training code to production)
   2) give up transfer learning
   
   Please correct me if I am wrong. Otherwise, this is a feature request to somehow preserve the HybridBlock structure when using `export/imports`.
   
   ---
   
   Temporary Workarounds:
   
   - use both (1) and (2): load from (1) during development, ship (2) to production (I'm currently doing this)
   - use (2) and write your own code to extract the weights from `M.params` for transfer learning (this sounds like a huge pain; a rough sketch follows below)
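   
   For what it's worth, a rough sketch of what that second workaround could look like (hypothetical layer sizes; it also assumes the exported parameter names still carry their original `hybridsequential0_`-style prefixes, which is fragile and not guaranteed):
   
   ```python
   from mxnet import gluon, ndarray
   
   # import the flattened model produced by export()
   loaded = gluon.nn.SymbolBlock.imports(
       'model-symbol.json', ['data'], 'model-0000.params')
   
   # build a fresh model for the transfer task
   new_net = gluon.nn.HybridSequential()
   with new_net.name_scope():
       new_net.add(gluon.nn.Dense(10))  # layer we hope to reuse
       new_net.add(gluon.nn.Dense(5))   # new head, e.g. a different class count
   new_net.initialize()
   new_net(ndarray.zeros((1, 1)))       # 1 input feature, matching the toy model above
   
   # copy every parameter whose name (minus the block prefix) and shape match
   def strip_prefix(name):
       return name.split('_', 1)[-1]
   
   source = {strip_prefix(n): p for n, p in loaded.collect_params().items()}
   for name, param in new_net.collect_params().items():
       key = strip_prefix(name)
       if key in source and source[key].shape == param.shape:
           param.set_data(source[key].data())
   ```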
