Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/09/27 10:17:10 UTC

[GitHub] [incubator-mxnet] shesung opened a new issue #16297: incorrect grad of gluon.nn.BatchNorm when scale=False

URL: https://github.com/apache/incubator-mxnet/issues/16297
 
 
   
   When using gluon.nn.BatchNorm(scale=False) on GPU, the computed grad for beta is incorrect. The grad of beta seems to be accumulated across iterations instead of being overwritten at each one.
   
   With scale=True, or when running on CPU, the grad is correct.
   
   This problem may make networks hard to converge during training.
   
   ## Environment info (Required)
   CentOS Linux release 7.2.1511 (Core)
   GTX 1080Ti
   Driver Version: 384.69
   CUDA Version 9.0.176
   
   installed with pip:
   numpy                              1.17.2
   mxnet-cu90                         1.5.0
   
   
   
   ## Code
   In this example, the grad of beta should be [1, 1, 1] at each iteration.
   
   ```python
   import mxnet as mx
   from mxnet import gluon, autograd
   
   # The problem only shows up on GPU; with mx.cpu() the grad is correct.
   ctx = mx.gpu()
   x = mx.nd.ones((1,3,1,1), ctx=ctx)
   
   # scale=False is what triggers the incorrect beta grad.
   net = gluon.nn.BatchNorm(scale=False, epsilon=2e-5, momentum=0.0)
   net.initialize(ctx=ctx)
   trainer = gluon.Trainer(params=net.collect_params(),
                           optimizer='sgd',
                           optimizer_params={'learning_rate': 0.01, 'wd': 0.0005, 'momentum': 0.9})
   net.hybridize()
   
   for i in range(10):
       with autograd.record():
           out = net(x)
       out.backward()
       trainer.step(x.shape[0])
       # Print the grad of beta after each step; it should stay at [1, 1, 1].
       for name, param in net.collect_params().items():
           if 'beta' in name:
               print(name, param.grad(ctx).asnumpy())
   ```
   output:
   
   ```
   batchnorm0_beta [1. 1. 1.]
   batchnorm0_beta [2. 2. 2.]
   batchnorm0_beta [3. 3. 3.]
   batchnorm0_beta [4. 4. 4.]
   batchnorm0_beta [5. 5. 5.]
   batchnorm0_beta [6. 6. 6.]
   batchnorm0_beta [7. 7. 7.]
   batchnorm0_beta [8. 8. 8.]
   batchnorm0_beta [9. 9. 9.]
   batchnorm0_beta [10. 10. 10.]
   ```
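
For reference, the expected value can be checked by hand. Below is a minimal sketch in plain Python (no MXNet), assuming `out.backward()` is called with a head gradient of ones, so the grad of beta is the sum of that head gradient over every axis except the channel axis. The helper names `expected_beta_grad` and `accumulated` are hypothetical. The second helper only models what the printed output looks like if the grad buffer is added to rather than overwritten; it does not claim to describe what the GPU kernel actually does.

```python
def expected_beta_grad(shape):
    """Grad of sum(out) w.r.t. beta for an NCHW input of the given shape.

    beta is broadcast over N, H and W, so with a head gradient of ones each
    channel's grad equals the number of elements in that channel.
    """
    n, c, h, w = shape
    return [float(n * h * w)] * c


def accumulated(grad, iterations):
    """Model a grad buffer that is added to instead of overwritten."""
    buf = [0.0] * len(grad)
    history = []
    for _ in range(iterations):
        buf = [b + g for b, g in zip(buf, grad)]
        history.append(list(buf))
    return history


grad = expected_beta_grad((1, 3, 1, 1))
history = accumulated(grad, 10)
print(grad)         # expected grad at every iteration
print(history[-1])  # value after 10 iterations if the buffer accumulates
```

Under these assumptions the expected grad is [1, 1, 1] at every iteration, and the k-th line of the output above is k times that value, which is consistent with the grad being accumulated.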
   
   
