Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/01/18 18:54:38 UTC

[GitHub] Bestehorn commented on issue #9478: Loss becomes NAN with larger, more layered network

URL: https://github.com/apache/incubator-mxnet/issues/9478#issuecomment-358745629
 
 
   Neither the initialization nor the rest of the setup has been changed from the MXNet tutorial [here](http://gluon.mxnet.io/chapter03_deep-neural-networks/mlp-gluon.html#Faster-modeling-with-gluon.nn.Sequential). Here is the corresponding code from the attached Jupyter notebook:
   ```
   net.collect_params().initialize(mx.init.Normal(sigma=.1), ctx=ctx)
   softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
   trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .01})
   ```
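
Whether a fixed `sigma=.1` behaves differently as layers get wider can be checked directly. The following is only a rough NumPy sketch, not code from the notebook: it assumes bias-free dense layers with ReLU activations, uses random values in [0, 1] as a stand-in for a normalized batch, and the helper name `relu_stack_output_scale` is made up for illustration.

```python
import numpy as np

def relu_stack_output_scale(width, depth, sigma=0.1, num_inputs=784, batch=64, seed=0):
    """Mean absolute activation after `depth` dense ReLU layers with N(0, sigma^2) weights."""
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for a normalized input batch (values in [0, 1]); not MNIST itself.
    x = rng.random((batch, num_inputs))
    fan_in = num_inputs
    for _ in range(depth):
        # Weights drawn the same way mx.init.Normal(sigma) would draw them: N(0, sigma^2).
        w = rng.normal(0.0, sigma, size=(fan_in, width))
        x = np.maximum(x @ w, 0.0)  # dense layer without bias, followed by ReLU
        fan_in = width
    return float(np.abs(x).mean())

# Compare the activation scale for a few hidden-layer widths (assumed values).
for width in (64, 256, 1024):
    print(width, relu_stack_output_scale(width, depth=4))
```

If the printed scale grows quickly with width or depth, the values reaching the softmax may become large enough to matter; if it stays moderate, the initialization is less likely to be involved.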
   As stated above, this initialization works fine when the network is "small" in terms of nodes and layers, so I do not think it is the real issue. The input is also normalized in the loading procedure used by the MXNet tutorial, which divides all feature values by 255 (the maximum 8-bit pixel value).
   ```
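   import mxnet as mx
   import numpy as np

   batch_size = 64
   num_inputs = 784
   num_outputs = 10
   num_examples = 60000
   def transform(data, label):
       # Scale pixel values from [0, 255] to [0, 1] and cast the label to float32
       return data.astype(np.float32)/255, label.astype(np.float32)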
   train_data = mx.gluon.data.DataLoader(mx.gluon.data.vision.MNIST(train=True, transform=transform),
                                         batch_size, shuffle=True)
   test_data = mx.gluon.data.DataLoader(mx.gluon.data.vision.MNIST(train=False, transform=transform),
                                        batch_size, shuffle=False)
   ```
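
The effect of the transform can be verified in isolation. This is a minimal sketch that assumes a synthetic 28x28 single-channel `uint8` array in place of a real MNIST sample, and plain NumPy in place of the MXNet arrays the loader actually supplies; `fake_image` and `fake_label` are illustrative names.

```python
import numpy as np

def transform(data, label):
    # Same scaling as in the snippet above: uint8 pixels in [0, 255] -> float32 in [0, 1].
    return data.astype(np.float32) / 255, label.astype(np.float32)

# Synthetic stand-in for one 28x28 single-channel image (not a real MNIST sample).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(28, 28, 1), dtype=np.uint8)
fake_image[0, 0, 0] = 0      # make sure both extremes are present
fake_image[0, 1, 0] = 255
fake_label = np.int32(7)

scaled, label = transform(fake_image, fake_label)

# The scaled pixels should be finite float32 values within [0, 1].
print(scaled.dtype, float(scaled.min()), float(scaled.max()), label.dtype)
print(bool(np.isfinite(scaled).all()))
```

Running the same range and finiteness checks on a real batch from `train_data` would confirm that the inputs themselves are not the source of the NaN values.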
