Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/12/07 13:35:04 UTC
[GitHub] [incubator-mxnet] xidulu opened a new issue #17004: [RFC] [Gluon]
Accumulating loss in the forward phase
URL: https://github.com/apache/incubator-mxnet/issues/17004
## Description
In `tf.keras`, users can call the `add_loss` method to register a non-standard loss, e.g. a loss that involves the layer's input. (By "standard" I mean a loss function that takes only `y_true` and `y_pred` as parameters.)
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Layer#add_loss
A practical example would be Bayesian Neural Network:
```python
model = tf.keras.Sequential([
    tfp.layers.DenseReparameterization(512, activation=tf.nn.relu),
    tfp.layers.DenseReparameterization(10),
])
logits = model(features)
neg_log_likelihood = tf.nn.softmax_cross_entropy_with_logits(
    labels=labels, logits=logits)
kl = sum(model.losses)
loss = neg_log_likelihood + kl
train_op = tf.train.AdamOptimizer().minimize(loss)
```
source: https://github.com/tensorflow/probability/blob/r0.8/tensorflow_probability/python/layers/dense_variational.py#L356
In this case, the loss is composed of two parts: the classification error and the losses accumulated inside the `DenseReparameterization` layers (i.e. `model.losses`), each of which is the KL divergence between the posterior and the prior of that layer's weights. This is achieved through the `add_loss` method.
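To make the mechanism concrete without depending on TensorFlow, here is a minimal pure-Python sketch of the idea: each layer may register extra loss terms during its forward pass, and a `losses` property gathers them from the whole tree of sub-layers. `ToyLayer` and `L2Penalty` are hypothetical names invented for this illustration; they are not the Keras or Gluon API, and resetting the list on every call is an assumption of the sketch.

```python
class ToyLayer:
    """Hypothetical stand-in for a layer; not the Keras or Gluon API."""

    def __init__(self, *children):
        self.children = list(children)
        self._losses = []

    def add_loss(self, value):
        self._losses.append(value)

    def forward(self, x):
        return x

    def __call__(self, x):
        # Assumption of this sketch: losses from the previous call are dropped.
        self._losses = []
        for child in self.children:
            x = child(x)
        return self.forward(x)

    @property
    def losses(self):
        # Own losses first, then every descendant's losses (recursive).
        collected = list(self._losses)
        for child in self.children:
            collected.extend(child.losses)
        return collected


class L2Penalty(ToyLayer):
    """Identity layer that registers a penalty depending on its input."""

    def __init__(self, weight):
        super().__init__()
        self.weight = weight

    def forward(self, x):
        self.add_loss(self.weight * sum(v * v for v in x))
        return x


# One penalty at the top level, one nested a level deeper.
model = ToyLayer(L2Penalty(0.5), ToyLayer(L2Penalty(2.0)))
outputs = model([1.0, 2.0])
main_loss = 0.25  # placeholder for e.g. a negative log-likelihood
total_loss = main_loss + sum(model.losses)
print(model.losses)  # [2.5, 10.0]
print(total_loss)    # 12.75
```

The training loss is then the ordinary loss plus `sum(model.losses)`, mirroring the `neg_log_likelihood + kl` line above.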
_______________________________
However, Gluon does not currently support this feature.
To implement it, I tried the following code:
```python
from mxnet import np, npx
from mxnet.gluon import nn

npx.set_np()  # enable NumPy-compatible semantics for mxnet.np arrays


class StochasticBlock(nn.HybridBlock):
    """HybridBlock that can record auxiliary losses during the forward pass."""

    def __init__(self):
        super(StochasticBlock, self).__init__()
        self._losses = []

    def add_loss(self, loss):
        self._losses.append(loss)

    @property
    def losses(self):
        # Own losses first, then those of direct children (not recursive).
        collected_losses = []
        collected_losses.extend(self._losses)
        for child in self._children.values():
            if hasattr(child, '_losses'):
                collected_losses.extend(getattr(child, '_losses'))
        return collected_losses


class DiagGaussian(StochasticBlock):
    def __init__(self):
        super(DiagGaussian, self).__init__()

    def hybrid_forward(self, F, loc, scale):
        log_variance = F.np.log(1e-20 + scale ** 2)
        # KL(N(loc, scale^2) || N(0, I)) per sample; note the leading minus sign.
        KL = -0.5 * F.np.sum(1 + log_variance - loc ** 2 - F.np.exp(log_variance), axis=1)
        self.add_loss(KL)
        return F.np.random.normal(loc, scale)


diagGaussian = DiagGaussian()
loc = np.random.uniform(-10, 10, size=(2, 2))
scale = np.random.uniform(size=(2, 2))
diagGaussian.hybridize()
print(diagGaussian(loc, scale))
print(diagGaussian.losses[0])
```
It works well as long as the block is not hybridized. After calling `hybridize()`, however, `losses[0]` becomes `<_Symbol diaggaussian0_multiply_scalar0>` instead of a concrete value.
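A likely explanation is that a hybridized block runs its Python body once with symbolic placeholders to build a graph, and later calls execute that graph without re-running the Python code, so a side effect such as `add_loss` only ever sees the placeholder. The following pure-Python sketch imitates that behaviour under this assumption. `Sym` and `ToyBlock` are hypothetical names for illustration and are not MXNet classes.

```python
class Sym:
    """Hypothetical symbolic placeholder: records how to compute a value."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn  # maps the concrete input to this node's value

    def __mul__(self, k):
        return Sym(self.name + "_mul_scalar", lambda v: self.fn(v) * k)

    def __repr__(self):
        return "<Sym " + self.name + ">"


class ToyBlock:
    """Hypothetical block with an eager mode and a trace-once mode."""

    def __init__(self):
        self.losses = []
        self._hybrid = False
        self._graph = None

    def hybridize(self):
        self._hybrid = True

    def body(self, x):
        self.losses.append(x * 0.5)  # Python side effect, like add_loss
        return x * 2

    def __call__(self, x):
        if not self._hybrid:
            return self.body(x)  # eager: body sees concrete values
        if self._graph is None:
            # Trace once: body sees a placeholder, not data.
            self._graph = self.body(Sym("data", lambda v: v))
        return self._graph.fn(x)  # replay the graph; body is not re-run


eager = ToyBlock()
eager_out = eager(3.0)

hybrid = ToyBlock()
hybrid.hybridize()
first = hybrid(3.0)
second = hybrid(4.0)

print(eager.losses)   # [1.5]
print(hybrid.losses)  # [<Sym data_mul_scalar>]
```

In the traced mode the outputs are still correct (`6.0` and `8.0`), but the recorded loss is a single symbolic node that never receives a value, which matches the symptom described above.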
I am actively looking for other solutions to this problem. A potential workaround would be to force `losses` to be one of the block's outputs. I am not sure whether that would work in `Sequential`, and it's also really not elegant =_=
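The output-based workaround could look roughly like the pure-Python sketch below. Every block returns an `(output, loss)` pair, and a custom container unpacks the pair, forwards only the output, and gathers the losses. `ScaleWithPenalty` and `LossSequential` are hypothetical names for illustration; this is not Gluon code, and a container that simply feeds each block's return value into the next block would pass the whole tuple along instead.

```python
class ScaleWithPenalty:
    """Hypothetical block whose forward pass returns (output, loss)."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        out = [self.factor * v for v in x]
        loss = sum(v * v for v in out)  # loss is an output, not a side effect
        return out, loss


class LossSequential:
    """Hypothetical container that chains blocks and gathers their losses."""

    def __init__(self, *blocks):
        self.blocks = blocks

    def __call__(self, x):
        losses = []
        for block in self.blocks:
            x, loss = block(x)  # forward only the output to the next block
            losses.append(loss)
        return x, losses


net = LossSequential(ScaleWithPenalty(2.0), ScaleWithPenalty(0.5))
out, losses = net([1.0, 3.0])
print(out)     # [1.0, 3.0]
print(losses)  # [40.0, 10.0]
```

Because the losses travel through return values, they would be part of any traced graph, at the cost of changing every block's signature.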
Having this feature would make it much more convenient to implement deep generative models (such as VAEs).
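For reference, the KL term that a diagonal Gaussian layer such as `DiagGaussian` would register has a well-known closed form when the prior is a standard normal. The symbols below are chosen for this note: $\mu$ corresponds to `loc`, $\sigma$ to `scale`, and the sum runs over the feature axis (`axis=1`).

```latex
\mathrm{KL}\bigl(\mathcal{N}(\mu, \operatorname{diag}(\sigma^2)) \,\|\, \mathcal{N}(0, I)\bigr)
  = -\frac{1}{2} \sum_{j=1}^{d} \bigl(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\bigr)
```

The bracketed sum is never positive, so the leading factor of $-\frac{1}{2}$ is what makes the KL term non-negative and suitable for adding to a loss that is minimized.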