Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/11/25 17:04:14 UTC

[GitHub] altosaar opened a new issue #8816: Example of debugging gradient of op in HybridBlock

URL: https://github.com/apache/incubator-mxnet/issues/8816
 
 
   I have a custom LSTMCell and I get NaN gradients after a certain number of gradient updates. I used pdb to step through the forward pass and check for NaNs. TensorFlow has an operation that adds NaN checks to all ops in the graph, which would be nice to have here.
   
   How can I find out which op is causing the NaNs?
   
   After some searching, I found the following approach. Is it the recommended one, or is there a simpler way?
   
   1. Export the HybridBlock to JSON using: https://mxnet.incubator.apache.org/api/python/gluon/gluon.html?highlight=export#mxnet.gluon.HybridBlock.export
   
   2. Add a monitor to a symbol executor, which will print gradients of all ops: https://github.com/apache/incubator-mxnet/blob/master/example/python-howto/monitor_weights.py
   
   However, I'm not sure whether hybrid recurrent architectures can be exported.
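
   Whether or not export works, a cruder check may help narrow the search: after the backward pass, scan each parameter's gradient and report which ones contain NaN or inf. Below is a minimal sketch of that idea, not a recommended MXNet method. The helper name `find_nonfinite` and the parameter names in the sample data are hypothetical, and the commented Gluon lines assume `net` is the HybridBlock and that every parameter has a gradient attached.

```python
import math


def find_nonfinite(named_grads):
    """Return (name, index, value) for the first NaN/inf in each gradient.

    `named_grads` maps a parameter name to a flat sequence of floats.
    """
    bad = []
    for name, values in named_grads.items():
        for i, v in enumerate(values):
            if not math.isfinite(v):
                bad.append((name, i, float(v)))
                break  # one hit per parameter is enough to flag it
    return bad


# Hypothetical usage with Gluon (an assumption, not taken from the issue):
# run after loss.backward() has been called on a loss computed inside
# autograd.record().
#   grads = {name: p.grad().asnumpy().ravel()
#            for name, p in net.collect_params().items()}
#   print(find_nonfinite(grads))

# Made-up gradients standing in for a real model's parameters.
example = {
    "i2h_weight": [0.1, -0.2, 0.3],
    "h2h_weight": [0.5, float("nan"), 1.0],
    "h2h_bias": [float("inf"), 0.0],
}
report = find_nonfinite(example)
for name, index, value in report:
    print(name, index, value)
```

   This only identifies the parameters whose gradients went bad, not the op that produced the NaN, so it complements rather than replaces a per-op monitor.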
   
   Thanks in advance!
