Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/12/07 13:35:04 UTC
[GitHub] [incubator-mxnet] xidulu opened a new issue #17004: [RFC] [Gluon] Accumulating loss in the forward phase

xidulu opened a new issue #17004: [RFC] [Gluon] Accumulating loss in the forward phase
URL: https://github.com/apache/incubator-mxnet/issues/17004
 
 
   ## Description
   
   In `tf.keras`, users can call the `add_loss` method to register a non-standard loss (by non-standard, I mean a loss function that takes parameters other than `y_true` and `y_pred`), e.g. a loss that involves the input.
   
   https://www.tensorflow.org/api_docs/python/tf/keras/layers/Layer#add_loss
   
   A practical example would be Bayesian Neural Network:
   ```python
   model = tf.keras.Sequential([
         tfp.layers.DenseReparameterization(512, activation=tf.nn.relu),
         tfp.layers.DenseReparameterization(10),
     ])
   logits = model(features)
   neg_log_likelihood = tf.nn.softmax_cross_entropy_with_logits(
         labels=labels, logits=logits)
   kl = sum(model.losses)
   loss = neg_log_likelihood + kl
   train_op = tf.train.AdamOptimizer().minimize(loss)
   ```
   source: https://github.com/tensorflow/probability/blob/r0.8/tensorflow_probability/python/layers/dense_variational.py#L356
   
   In this case, the loss has two parts: the classification error (the negative log-likelihood) and the losses registered inside each `DenseReparameterization` layer, exposed as `model.losses`. Each of those is the KL divergence between the posterior and the prior over that layer's weights. The layers register these terms through the `add_loss` method.
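The collection mechanism can be illustrated without any framework. The following is a minimal sketch in plain Python, not the actual `tf.keras` implementation: `Layer`, `ToyRegularizedScale`, and `Sequential` are hypothetical stand-ins, and the quadratic penalty is only a placeholder for a real KL term. It shows how losses recorded by each layer during the forward pass can be gathered by the parent model.

```python
class Layer:
    """Hypothetical base class: records extra losses during the forward pass."""

    def __init__(self):
        self._losses = []

    def add_loss(self, loss):
        self._losses.append(loss)

    @property
    def losses(self):
        return list(self._losses)

    def __call__(self, x):
        # Clear stale terms so each forward pass starts fresh.
        self._losses = []
        return self.forward(x)


class ToyRegularizedScale(Layer):
    """Scales its input and registers a placeholder penalty on its weight."""

    def __init__(self, weight):
        super().__init__()
        self.weight = weight

    def forward(self, x):
        # Stand-in for a KL term: a quadratic penalty on the weight.
        self.add_loss(0.5 * self.weight ** 2)
        return self.weight * x


class Sequential(Layer):
    """Chains layers and exposes the losses of all of them."""

    def __init__(self, layers):
        super().__init__()
        self.layers = layers

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

    @property
    def losses(self):
        collected = list(self._losses)
        for layer in self.layers:
            collected.extend(layer.losses)
        return collected


model = Sequential([ToyRegularizedScale(2.0), ToyRegularizedScale(3.0)])
output = model(1.0)
data_loss = (output - 5.0) ** 2  # stand-in for the classification term
total_loss = data_loss + sum(model.losses)
```

As in the TensorFlow snippet, the training loss is the data term plus the sum of whatever the layers registered while computing the output.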
   
   _______________________________
   However, this feature is currently not supported by Gluon.
   
   In order to implement it, I tried the following code:
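   ```python
   # Assumed imports: the snippet relies on MXNet's NumPy-compatible interface.
   from mxnet import np, npx
   from mxnet.gluon import nn

   npx.set_np()  # switch Gluon to NumPy-compatible array semantics

   class StochasticBlock(nn.HybridBlock):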
     def __init__(self):
       super(StochasticBlock, self).__init__()
       self._losses = []
   
     def add_loss(self, loss):
       self._losses.append(loss)
   
     @property
     def losses(self):
       collected_losses = []
       collected_losses.extend(self._losses)
       for child in self._children.values():
         if hasattr(child, '_losses'):
           collected_losses.extend(getattr(child, '_losses'))
       return collected_losses
   
   class DiagGaussian(StochasticBlock):
     def __init__(self):
       super(DiagGaussian, self).__init__()
   
     def hybrid_forward(self, F, loc, scale):
       log_variance = F.np.log(1e-20 + scale ** 2)
       # KL(N(loc, scale^2) || N(0, 1)) = -0.5 * sum(1 + log(var) - loc^2 - var)
       KL = -0.5 * F.np.sum(1 + log_variance - loc ** 2 - F.np.exp(log_variance), axis=1)
       self.add_loss(KL)
       return F.np.random.normal(loc, scale)
   
   diagGaussian = DiagGaussian()
   loc = np.random.uniform(-10, 10, size=(2,2))
   scale = np.random.uniform(size=(2,2))
   diagGaussian.hybridize()
   print(diagGaussian(loc, scale))
   print(diagGaussian.losses[0])
   ```
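   This works as long as the block is not hybridized. After `hybridize()` is called, `losses[0]` becomes `<_Symbol diaggaussian0_multiply_scalar0>` instead of a concrete value, because hybridization traces `hybrid_forward` with symbolic inputs, so the object passed to `add_loss` is a graph node rather than an array.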
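The failure mode can be reproduced with a toy tracer. This is only a sketch of the general idea, not MXNet's internals: `Sym` and `ToyBlock` are hypothetical, and "running the cached graph" is faked with ordinary arithmetic. The point is that a Python side effect inside the forward function captures whatever object flows through it, which is a placeholder during tracing.

```python
class Sym:
    """Hypothetical placeholder standing in for a symbolic graph node."""

    def __init__(self, name):
        self.name = name

    def __mul__(self, scalar):
        # Record the operation by name instead of computing a number.
        return Sym(f"{self.name}_multiply_scalar")

    __rmul__ = __mul__

    def __repr__(self):
        return f"<Sym {self.name}>"


class ToyBlock:
    """Hypothetical block whose forward pass stores a loss as a side effect."""

    def __init__(self):
        self.losses = []
        self.hybridized = False
        self._traced = False

    def forward(self, x):
        # The side effect stores whatever `x` happens to be: number or Sym.
        self.losses.append(0.5 * x)
        return 2 * x

    def __call__(self, x):
        if not self.hybridized:
            return self.forward(x)
        if not self._traced:
            # Trace once with a placeholder; the Python body is not run again.
            self.forward(Sym("data"))
            self._traced = True
        return 2 * x  # stand-in for executing the cached graph


eager = ToyBlock()
eager_out = eager(4.0)

traced = ToyBlock()
traced.hybridized = True
traced_out = traced(4.0)
```

Here `eager.losses[0]` is a number, while `traced.losses[0]` is a `Sym` recorded during tracing. The output of the traced block is still concrete because it is returned from the graph rather than stashed on the side.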
   
   I am actively looking for other solutions to this problem. A potential workaround would be to force `losses` to be one of the block's outputs. I am not sure whether that would work in `Sequential`, and it is also far from elegant =_=
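A rough, framework-free sketch of that workaround is shown here. All names (`KLToStandardNormal`, `Shift`, `LossSequential`) are hypothetical and nothing in it is Gluon API. Every block returns an `(output, loss)` pair, and a custom container threads the first element through the chain while summing the second, which is the part a stock `Sequential` would not do on its own.

```python
import math


class KLToStandardNormal:
    """Hypothetical block: passes `loc` through and also returns a KL term."""

    def __init__(self, scale):
        self.scale = scale

    def __call__(self, loc):
        var = self.scale ** 2
        # Closed-form KL(N(loc, var) || N(0, 1)) for a scalar Gaussian.
        kl = -0.5 * (1.0 + math.log(var) - loc ** 2 - var)
        return loc, kl


class Shift:
    """Hypothetical block with no extra loss; it still returns a pair."""

    def __init__(self, offset):
        self.offset = offset

    def __call__(self, x):
        return x + self.offset, 0.0


class LossSequential:
    """Threads the first output through the chain and sums the second."""

    def __init__(self, *blocks):
        self.blocks = blocks

    def __call__(self, x):
        total = 0.0
        for block in self.blocks:
            x, loss = block(x)
            total += loss
        return x, total


net = LossSequential(KLToStandardNormal(scale=1.0), Shift(2.0))
out, extra_loss = net(0.0)
```

Because the loss is a return value rather than a side effect, it would presumably survive tracing as an ordinary graph output. The cost is that every block and every container has to follow the pair convention.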
   
   Having this feature would make it much more convenient to implement deep generative models (such as VAEs).
