Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/02/22 01:47:25 UTC

[GitHub] AmigoCDT opened a new issue #14229: About the weighted softmax, the forward result is too small

URL: https://github.com/apache/incubator-mxnet/issues/14229
 
 
   I implemented a weighted softmax following `https://github.com/apache/incubator-mxnet/blob/v1.0.0/example/sparse/weighted_softmax_ce.py`.
   
   The following is my implementation:
   ```python
   import mxnet as mx


   class WeightedSoftmaxCrossEntropyLoss(mx.operator.CustomOp):
       """
       Softmax cross-entropy weighted loss, where the loss is adjusted by
       class_weight / sum_of_all_weights.
       """
       def __init__(self, class_weights):
           # parse the class weights from a comma-separated string, e.g. "1.0,2.0,0.5"
           self.class_weights = mx.nd.array([float(x) for x in class_weights.split(',')])
           self.class_scales = self.class_weights

       def forward(self, is_train, req, in_data, out_data, aux):
           """Implements forward computation.

           is_train : bool, whether forwarding for training or testing.
           req : list of {'null', 'write', 'inplace', 'add'}, how to assign to out_data. 'null' means skip assignment, etc.
           in_data : list of NDArray, input data.
           out_data : list of NDArray, pre-allocated output buffers.
           aux : list of NDArray, mutable auxiliary states. Usually not used.
           """
           data = in_data[0]
           label = in_data[1]
           # softmax over the class axis; the label is not used in the forward pass
           pred = mx.nd.SoftmaxOutput(data, label)
           print("pred is ", max(pred[0]))
           self.assign(out_data[0], req[0], pred)

       def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
           """Implements backward computation.

           req : list of {'null', 'write', 'inplace', 'add'}, how to assign to in_grad.
           out_grad : list of NDArray, gradient w.r.t. output data.
           in_grad : list of NDArray, gradient w.r.t. input data. This is the output buffer.
           """
           label = in_data[1]
           pred = out_data[0]
           print("outgrad[0] is ", out_grad[0][0])
           # move the class scales to the same context (CPU/GPU) as the labels
           class_scales = self.class_scales.as_in_context(label.context)
           # standard softmax cross-entropy gradient, chained with the incoming gradient
           dx = (pred - mx.nd.one_hot(label, class_scales.shape[0])) * out_grad[0]
           print("dx is ", dx[0])
           # scale each sample's gradient by the weight of its class
           scale_factor = (class_scales[label]).reshape((pred.shape[0], 1))
           rescaled_dx = scale_factor * dx
           self.assign(in_grad[0], req[0], rescaled_dx)
   ```
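   
   For context, this is roughly how I register and invoke the op (a minimal sketch: the `op_type` string, the fully-connected layer, and the shapes below are placeholders rather than my exact training code):
   
   ```python
   import mxnet as mx

   @mx.operator.register("weighted_softmax_ce")
   class WeightedSoftmaxCrossEntropyLossProp(mx.operator.CustomOpProp):
       def __init__(self, class_weights):
           # need_top_grad=True so that backward() receives a meaningful out_grad
           # (my backward multiplies by out_grad[0])
           super(WeightedSoftmaxCrossEntropyLossProp, self).__init__(need_top_grad=True)
           self.class_weights = class_weights

       def list_arguments(self):
           return ['data', 'label']

       def list_outputs(self):
           return ['output']

       def infer_shape(self, in_shape):
           data_shape = in_shape[0]
           label_shape = (in_shape[0][0],)
           output_shape = in_shape[0]
           return [data_shape, label_shape], [output_shape], []

       def create_operator(self, ctx, shapes, dtypes):
           return WeightedSoftmaxCrossEntropyLoss(self.class_weights)


   # placeholder symbol graph that ends in the custom loss
   data = mx.sym.Variable('data')
   label = mx.sym.Variable('label')
   fc = mx.sym.FullyConnected(data=data, num_hidden=3, name='fc')
   loss = mx.sym.Custom(data=fc, label=label, op_type='weighted_softmax_ce',
                        class_weights='1.0,2.0,0.5', name='weighted_ce')
   ```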
   
   
   But when I use this CustomOp as the loss, the net does not train at all: the pred printed in forward() is always around 2.05e-5 or 4.06e-29, which means my forward() does not behave as I expect.
   Where is the bug, and how can I fix it?
   Thank you all so much!
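   
   To narrow it down, I can call forward() directly on a toy batch and compare against a plain softmax (just a debugging sketch; the input values here are made up):
   
   ```python
   import mxnet as mx

   op = WeightedSoftmaxCrossEntropyLoss('1.0,2.0,0.5')

   data = mx.nd.array([[1.0, 2.0, 3.0],
                       [0.5, 0.5, 0.5]])
   label = mx.nd.array([2, 0])
   out = mx.nd.zeros_like(data)

   # run the custom forward(); req='write' stores the result in `out`
   op.forward(is_train=True, req=['write'], in_data=[data, label],
              out_data=[out], aux=[])

   print(out)                  # what my CustomOp produces
   print(mx.nd.softmax(data))  # reference softmax over the class axis
   ```
   
   If this standalone check matches `mx.nd.softmax` but the training-time pred is still tiny, the problem is probably in how the op is wired into the network rather than in forward() itself.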
