Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/12/05 22:36:36 UTC
[GitHub] CorneliusHagmeister opened a new issue #8959: Loss becomes nan when using correlation loss with MakeLoss on 2 images.
URL: https://github.com/apache/incubator-mxnet/issues/8959
I am trying to use image similarity as the loss function for my network. For some reason, though, the loss I get is always nan. I have already reduced my network to a minimal example, yet I still get the error.
## Environment info
```
python 3.4.5
mxnet 0.12.1
```
## Error Message:
INFO:root:Epoch[0] Train-loss=nan
INFO:root:Epoch[0] Time cost=0.759
INFO:root:Epoch[1] Train-loss=nan
INFO:root:Epoch[1] Time cost=0.785
INFO:root:Epoch[2] Train-loss=nan
INFO:root:Epoch[2] Time cost=0.798
INFO:root:Epoch[3] Train-loss=nan
INFO:root:Epoch[3] Time cost=0.763
INFO:root:Epoch[4] Train-loss=nan
INFO:root:Epoch[4] Time cost=0.773
INFO:root:Epoch[5] Train-loss=nan
INFO:root:Epoch[5] Time cost=0.887
INFO:root:Epoch[6] Train-loss=nan
INFO:root:Epoch[6] Time cost=0.917
INFO:root:Epoch[7] Train-loss=nan
INFO:root:Epoch[7] Time cost=0.801
INFO:root:Epoch[8] Train-loss=nan
INFO:root:Epoch[8] Time cost=0.860
INFO:root:Epoch[9] Train-loss=nan
INFO:root:Epoch[9] Time cost=1.064
```
## Minimum reproducible example
``` python
import mxnet as mx


def conv_net_regressor(image_shape, bn_mom=0.9):
    (nchannel, height, width) = image_shape
    # We have 2 data sources and concatenate them
    data_fixed = mx.sym.Variable(name='data_fixed')
    data_moving = mx.sym.Variable(name='data_moving')
    concat_data = mx.sym.concat(data_fixed, data_moving, dim=1)
    batched = mx.sym.BatchNorm(data=concat_data, fix_gamma=True, eps=2e-5,
                               momentum=bn_mom, name='bn_data')
    body = mx.sym.Convolution(data=concat_data, num_filter=20, kernel=(3, 3),
                              stride=(1, 1), pad=(0, 0),
                              no_bias=True, name="conv" + str(0))
    body = mx.sym.Activation(data=body, act_type='tanh', name='relu' + str(0))
    fc2 = mx.sym.FullyConnected(data=body, num_hidden=10)
    # fc2 = mx.sym.BlockGrad(fc2)
    cor = mx.sym.Correlation(data1=data_fixed, data2=data_moving)
    stnet = mx.sym.MakeLoss(cor, normalization='batch')
    return stnet


def get_symbol(image_shape):
    return conv_net_regressor(image_shape)


if __name__ == '__main__':
    mnist_shape = (1, 28, 28)
    iterators = get_mnist_data_iterator()
    net = get_symbol(mnist_shape)
    model = mx.mod.Module(symbol=net, context=mx.cpu(),
                          label_names=None,
                          data_names=['data_fixed', 'data_moving'])
    # a = mx.viz.plot_network(net)
    # a.render()
    model.fit(iterators[0],
              optimizer='sgd',
              optimizer_params={'learning_rate': 0.1},
              eval_metric=mx.metric.Loss(),
              num_epoch=10)
```
I assume there is some mistake in my usage of MakeLoss, or my iterator doesn't work the way I expect. Maybe someone has an idea what's causing this.
If you need more information, let me know.
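For comparison, here is a minimal NumPy sketch (not MXNet's `Correlation` op, just the analogous math) of a *normalized* cross-correlation similarity. Mean-centering and dividing by epsilon-guarded norms keeps the value bounded in [-1, 1], whereas a raw, unnormalized correlation on unscaled pixel values can overflow or divide by zero and produce nan. The function name `ncc_loss` and the epsilon value are my own illustrative choices:

```python
import numpy as np


def ncc_loss(fixed, moving, eps=1e-8):
    """Negative normalized cross-correlation between two images.

    The epsilon in the denominator guards against division by zero
    (e.g. a constant image with zero variance), so the result is
    always finite instead of nan.
    """
    f = fixed.ravel() - fixed.mean()
    m = moving.ravel() - moving.mean()
    denom = np.sqrt((f ** 2).sum() * (m ** 2).sum()) + eps
    return -float(np.dot(f, m) / denom)


rng = np.random.default_rng(0)
a = rng.random((28, 28))
print(ncc_loss(a, a))                    # identical images: close to -1.0
print(ncc_loss(a, np.zeros((28, 28))))   # constant image: finite, not nan
```

If the symbolic graph's correlation output starts finite and later turns nan, a learning rate of 0.1 on an unnormalized loss is also a plausible culprit; checking the first forward pass with `model.forward` before fitting would separate the two cases.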
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services