Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/22 23:47:44 UTC

[GitHub] [incubator-mxnet] gilbertfrancois edited a comment on issue #18751: gluon.nn.BatchNorm seems to swap updated values of moving_mean and moving_var on GPU.

gilbertfrancois edited a comment on issue #18751:
URL: https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662750082


   Ok, I see that. But I guess it is the same intended behaviour as PyTorch's nn.BatchNorm1d for Dense layers, which takes input of shape (N, C). The normalization is done over the C features. E.g.:
   
   ```
   >>> import torch
   >>> import torch.nn as nn
   
   >>> bn = nn.BatchNorm1d(32)     # 32 features (C)
   >>> x = torch.randn(2, 32)      # batch of 2 samples, shape (N, C)
   >>> y = bn(x)                   # forward pass in training mode updates running stats
   
   >>> bn.running_mean.shape
   torch.Size([32])
   
   >>> bn.running_var.shape
   torch.Size([32])
   ```
   This shows that the BN layer tracks C running means and C running variances.
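   
   For comparison, here is a minimal sketch of the equivalent check with Gluon. This assumes `gluon.nn.BatchNorm` exposes its running statistics as the `running_mean`/`running_var` parameters (the underlying operator calls them `moving_mean`/`moving_var`), so treat it as illustrative rather than exact:
   
   ```
   >>> import mxnet as mx
   >>> from mxnet.gluon import nn as gnn
   
   >>> bn = gnn.BatchNorm()                 # in_channels inferred on the first forward pass
   >>> bn.initialize()
   >>> x = mx.nd.random.randn(2, 32)        # batch of 2 samples, 32 features
   >>> with mx.autograd.record():           # training mode, so the running stats get updated
   ...     y = bn(x)
   
   >>> bn.running_mean.data().shape
   (32,)
   
   >>> bn.running_var.data().shape
   (32,)
   ```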
   
   When defining `x = torch.randn(1, 32)` and doing a forward pass in training mode, it gives an error, as expected. Same as @TristonC mentioned in https://github.com/apache/incubator-mxnet/issues/18751#issuecomment-662730136.
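   
   A quick sketch of that failure case (hypothetical snippet; in training mode BatchNorm1d needs more than one value per channel, so a batch of 1 typically raises a ValueError):
   
   ```
   >>> import torch
   >>> import torch.nn as nn
   
   >>> bn = nn.BatchNorm1d(32)
   >>> x = torch.randn(1, 32)        # batch size 1
   >>> try:
   ...     y = bn(x)                 # bn is in training mode by default
   ... except ValueError as e:
   ...     print(e)                  # "Expected more than 1 value per channel when training, ..."
   ```
   
   In eval mode (`bn.eval()`) the same input goes through fine, since the running statistics are used instead of batch statistics.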

