Posted to commits@mxnet.apache.org by th...@apache.org on 2019/07/05 10:43:47 UTC

[incubator-mxnet] branch master updated: [TUTORIAL] Revise Naming tutorial (#15365)

This is an automated email from the ASF dual-hosted git repository.

thomasdelteil pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
     new 74dbadb  [TUTORIAL] Revise Naming tutorial (#15365)
74dbadb is described below

commit 74dbadbc3fc56e8a9e91f9724d92c32afea1bb54
Author: Sergey Sokolov <Se...@gmail.com>
AuthorDate: Fri Jul 5 03:43:16 2019 -0700

    [TUTORIAL] Revise Naming tutorial (#15365)
    
    * Revise Naming tutorial
    
    * Code review fixes
    
    * Force build
---
 docs/tutorials/gluon/naming.md | 148 ++++++++++++-----------------------------
 1 file changed, 43 insertions(+), 105 deletions(-)

diff --git a/docs/tutorials/gluon/naming.md b/docs/tutorials/gluon/naming.md
index e667ad3..c2293a9 100644
--- a/docs/tutorials/gluon/naming.md
+++ b/docs/tutorials/gluon/naming.md
@@ -15,14 +15,12 @@
 <!--- specific language governing permissions and limitations -->
 <!--- under the License. -->
 
-
 # Naming of Gluon Parameter and Blocks
 
-In gluon, each Parameter or Block has a name (and prefix). Parameter names are specified by users and Block names can be either specified by users or automatically created.
+In Gluon, each [`Parameter`](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/gluon.html#mxnet.gluon.Parameter) has a name and each [`Block`](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/gluon.html#mxnet.gluon.Block) has a prefix. `Parameter` names are specified by users, and `Block` names can be either specified by users or automatically created.
 
 In this tutorial we talk about the best practices on naming. First, let's import MXNet and Gluon:
 
-
 ```python
 from __future__ import print_function
 import mxnet as mx
@@ -31,60 +29,49 @@ from mxnet import gluon
 
 ## Naming Blocks
 
-When creating a block, you can assign a prefix to it:
-
+When you create a `Block`, you can assign it a prefix:
 
 ```python
 mydense = gluon.nn.Dense(100, prefix='mydense_')
 print(mydense.prefix)
+print(mydense.name)
 ```
 
-    mydense_
-
-
-When no prefix is given, Gluon will automatically generate one:
+The prefix of the `Block` acts like its name. As you can see from the example above, the only difference is that the last `_` is removed from the `Block`'s name.
 
+When no prefix is given, Gluon will automatically generate one:
 
 ```python
 dense0 = gluon.nn.Dense(100)
-print(dense0.prefix)
+dense0.prefix
 ```
 
-    dense0_
-
-
-When you create more Blocks of the same kind, they will be named with incrementing suffixes to avoid collision:
-
+When you create more Blocks of the same kind, they will be named with incrementing indices to avoid naming collisions. To illustrate that, let's create a new `Dense` layer without specifying the prefix:
 
 ```python
 dense1 = gluon.nn.Dense(100)
-print(dense1.prefix)
+dense1.prefix
 ```
 
-    dense1_
-
+As we can see, the prefixes of the `dense0` and `dense1` blocks are different.
 
 ## Naming Parameters
 
-Parameters within a Block will be named by prepending the prefix of the Block to the name of the Parameter:
-
+Parameters within a `Block` will be named by prepending the prefix of the `Block` to the name of the `Parameter`:
 
 ```python
-print(dense0.collect_params())
+dense0.collect_params()
 ```
 
-    dense0_ (
-      Parameter dense0_weight (shape=(100, 0), dtype=<type 'numpy.float32'>)
-      Parameter dense0_bias (shape=(100,), dtype=<type 'numpy.float32'>)
-    )
+As we can see, both the `weight` and `bias` parameters of the `Dense` block have the same prefix, `dense0_`.
 
+If you create a new `Parameter`, for example, when you [create a custom block](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/custom_layer.html#parameters-of-a-custom-layer), you have to specify its name as Gluon cannot set a default one.
 
 ## Name scopes
 
-To manage the names of nested Blocks, each Block has a `name_scope` attached to it. All Blocks created within a name scope will have its parent Block's prefix prepended to its name.
-
-Let's demonstrate this by first defining a simple neural net:
+To manage the names of nested Blocks, each `Block` has a `name_scope` attached to it. All Blocks created within a name scope will have their parent `Block`'s prefix prepended to their names. We can retrieve the `name_scope` by using the [`.name_scope()`](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/gluon.html#mxnet.gluon.Block.name_scope) method of the `Block`.
 
+Let's demonstrate this by first defining a simple neural network:
 
 ```python
 class Model(gluon.Block):
@@ -101,12 +88,7 @@ class Model(gluon.Block):
         return mx.nd.relu(self.mydense(x))
 ```
 
-Now let's instantiate our neural net.
-
-- Note that `model0.dense0` is named as `model0_dense0_` instead of `dense0_`.
-
-- Also note that although we specified `mydense_` as prefix for `model.mydense`, its parent's prefix is automatically prepended to generate the prefix `model0_mydense_`.
-
+Here we defined three `Dense` layers, of which only the last one has an explicitly specified prefix. Now, let's create and initialize our network.
 
 ```python
 model0 = Model()
@@ -118,16 +100,13 @@ print(model0.dense1.prefix)
 print(model0.mydense.prefix)
 ```
 
-    model0_
-    model0_dense0_
-    model0_dense1_
-    model0_mydense_
-
+- Note that `model0.dense0` is named as `model0_dense0_` instead of `dense0_`.
 
-If we instantiate `Model` again, it will be given a different name like shown before for `Dense`.
+- As before, to avoid a naming collision, the next `Dense` layer got the automatic prefix `dense1_`, and its full prefix is `model0_dense1_`.
 
-- Note that `model1.dense0` is still named as `dense0_` instead of `dense2_`, following dense layers in previously created `model0`. This is because each instance of model's name scope is independent of each other.
+- Also note that although we specified `mydense_` as the prefix for `model0.mydense`, its parent's prefix is automatically prepended to generate the prefix `model0_mydense_`.
 
+If we create another instance of `Model`, it will be given a different name, as shown before in the example with the `Dense` layer.
 
 ```python
 model1 = Model()
@@ -137,16 +116,11 @@ print(model1.dense1.prefix)
 print(model1.mydense.prefix)
 ```
 
-    model1_
-    model1_dense0_
-    model1_dense1_
-    model1_mydense_
-
-
-**It is recommended that you manually specify a prefix for the top level Block, i.e. `model = Model(prefix='mymodel_')`, to avoid potential confusions in naming.**
+Note that `model1.dense0` is still numbered `dense0_` instead of `dense2_`; it does not continue the numbering of the `Dense` layers in the previously created `model0`. This is because each `Model` instance has its own name scope, independent of the others.
 
-The same principle also applies to container blocks like Sequential. `name_scope` can be used inside `__init__` as well as out side of `__init__`:
+**It is recommended that you manually specify a prefix for the top-level `Block`, e.g. `model = Model(prefix='mymodel_')`, to avoid potential confusion in naming.**
 
+The same principle applies to container blocks like [`Sequential`](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/gluon.html#mxnet.gluon.nn.Sequential). `name_scope` can be used inside as well as outside of `__init__`. It is recommended to use `name_scope` with container blocks because the names of the parameters inside them tend to be clearer.
 
 ```python
 net = gluon.nn.Sequential()
@@ -158,13 +132,7 @@ print(net[0].prefix)
 print(net[1].prefix)
 ```
 
-    sequential0_
-    sequential0_dense0_
-    sequential0_dense1_
-
-
-`gluon.model_zoo` also behaves similarly:
-
+Models loaded from [`gluon.model_zoo`](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/model_zoo.html#module-mxnet.gluon.model_zoo) behave in a similar way:
 
 ```python
 net = gluon.nn.Sequential()
@@ -174,40 +142,16 @@ with net.name_scope():
 print(net.prefix, net[0].prefix, net[1].prefix)
 ```
 
-    sequential1_ sequential1_alexnet0_ sequential1_alexnet1_
-
-
-## Saving and loading
-
-Because model0 and model1 have different prefixes, their parameters also have different names:
+## Saving and loading of parameters
 
+Because `model0` and `model1` have different prefixes, their parameters also have different names:
 
 ```python
 print(model0.collect_params(), '\n')
 print(model1.collect_params())
 ```
 
-    model0_ (
-      Parameter model0_dense0_weight (shape=(20L, 20L), dtype=<type 'numpy.float32'>)
-      Parameter model0_dense0_bias (shape=(20L,), dtype=<type 'numpy.float32'>)
-      Parameter model0_dense1_weight (shape=(20L, 20L), dtype=<type 'numpy.float32'>)
-      Parameter model0_dense1_bias (shape=(20L,), dtype=<type 'numpy.float32'>)
-      Parameter model0_mydense_weight (shape=(20L, 20L), dtype=<type 'numpy.float32'>)
-      Parameter model0_mydense_bias (shape=(20L,), dtype=<type 'numpy.float32'>)
-    ) 
-    
-    model1_ (
-      Parameter model1_dense0_weight (shape=(20, 0), dtype=<type 'numpy.float32'>)
-      Parameter model1_dense0_bias (shape=(20,), dtype=<type 'numpy.float32'>)
-      Parameter model1_dense1_weight (shape=(20, 0), dtype=<type 'numpy.float32'>)
-      Parameter model1_dense1_bias (shape=(20,), dtype=<type 'numpy.float32'>)
-      Parameter model1_mydense_weight (shape=(20, 0), dtype=<type 'numpy.float32'>)
-      Parameter model1_mydense_bias (shape=(20,), dtype=<type 'numpy.float32'>)
-    )
-
-
-As a result, if you try to save parameters from model0 and load it with model1, you'll get an error due to unmatching names:
-
+As a result, if you try to save the parameters of `model0` and then load them into `model1`, you'll get an error due to mismatched names:
 
 ```python
 model0.collect_params().save('model.params')
@@ -217,11 +161,7 @@ except Exception as e:
     print(e)
 ```
 
-    Parameter 'model1_dense0_weight' is missing in file 'model.params', which contains parameters: 'model0_mydense_weight', 'model0_dense1_bias', 'model0_dense1_weight', 'model0_dense0_weight', 'model0_dense0_bias', 'model0_mydense_bias'. Please make sure source and target networks have the same prefix.
-
-
-To solve this problem, we use `save_parameters`/`load_parameters` instead of `collect_params` and `save`/`load`. `save_parameters` uses model structure, instead of parameter name, to match parameters.
-
+To solve this problem, we use `save_parameters`/`load_parameters` instead of `collect_params` and `save`/`load`. The `save_parameters` method uses model structure instead of parameter names to match parameters.
 
 ```python
 model0.save_parameters('model.params')
@@ -229,20 +169,11 @@ model1.load_parameters('model.params')
 print(mx.nd.load('model.params').keys())
 ```
 
-    ['dense0.bias', 'mydense.bias', 'dense1.bias', 'dense1.weight', 'dense0.weight', 'mydense.weight']
-
-
-## Replacing Blocks from networks and fine-tuning
-
-Sometimes you may want to load a pretrained model, and replace certain Blocks in it for fine-tuning.
+## Replacing Blocks in networks and fine-tuning
 
-For example, the alexnet in model zoo has 1000 output dimensions, but maybe you only have 100 classes in your application.
-
-To see how to do this, we first load a pretrained AlexNet.
-
-- In Gluon model zoo, all image classification models follow the format where the feature extraction layers are named `features` while the output layer is named `output`.
-- Note that the output layer is a dense block with 1000 dimension outputs.
+Sometimes you may want to load a pretrained model and replace certain Blocks in it for fine-tuning. For example, the [`AlexNet`](https://mxnet.incubator.apache.org/versions/master/api/python/gluon/model_zoo.html#vision) model in the model zoo has 1000 output dimensions, but maybe you have only 100 classes in your application. Let's see how to change the number of output dimensions from 1000 to 100.
 
+The first step is to load a pretrained `AlexNet`:
 
 ```python
 alexnet = gluon.model_zoo.vision.alexnet(pretrained=True)
@@ -250,12 +181,13 @@ print(alexnet.output)
 print(alexnet.output.prefix)
 ```
 
-    Dense(4096 -> 1000, linear)
-    alexnet0_dense2_
+- In the Gluon Model Zoo, all image classification models follow the format where the feature extraction layers are named `features` while the output layer is named `output`.
 
+- Note that the output layer is a `Dense` block with a 1000-dimensional output.
 
-To change the output to 100 dimension, we replace it with a new block.
+- Some networks, including `AlexNet`, provide a way to directly specify the number of output classes by using the `classes` argument. For demonstration purposes, we will still show how to do it by replacing the block.
 
+To change the output to 100 dimensions, we replace it with a new `Block` that has only 100 units.
 
 ```python
 with alexnet.name_scope():
@@ -265,8 +197,14 @@ print(alexnet.output)
 print(alexnet.output.prefix)
 ```
 
-    Dense(None -> 100, linear)
-    alexnet0_dense3_
+## Conclusion
+
+Each `Block` has a `prefix` and each `Parameter` has a `name`. Gluon can automatically generate unique prefixes for blocks, but it is up to you to specify the `Parameter`'s name if you create a `Parameter` yourself. Prefixes act like namespaces and create a hierarchy following the nesting of blocks in your model. To save and load model parameters, it is better to use the `save_parameters`/`load_parameters` methods on the parent `Block`.
+
+## Recommended Next Steps
+
+- Learn more about [model serialization](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/save_load_params.html) in Gluon.
 
+- [Create custom blocks](https://mxnet.incubator.apache.org/versions/master/tutorials/gluon/custom_layer.html) with your own parameters.
 
 <!-- INSERT SOURCE DOWNLOAD BUTTONS -->