Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/07/16 23:57:16 UTC

[GitHub] [incubator-mxnet] barry-jin edited a comment on issue #20293: Wrong gradients with C API?

barry-jin edited a comment on issue #20293:
URL: https://github.com/apache/incubator-mxnet/issues/20293#issuecomment-881776555


   I don't think the root cause is in CachedOp. While debugging this issue, I found that elemwise_add uses [CloneGradient](https://github.com/apache/incubator-mxnet/blob/3480ba2c6df02bb907d3a975d354efa8697c4e71/src/operator/tensor/elemwise_binary_op_basic.cc#L111) as its gradient function, which copies ograds once for each input.
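
To make the "clone" idea concrete, here is a minimal toy sketch. It is not MXNet's actual code: `Entry` and `clone_gradient` are hypothetical stand-ins that do not use nnvm types. It only shows that, for z = a + b, every input is handed the same upstream gradient entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a graph edge (not nnvm::NodeEntry).
struct Entry {
  int node_id;  // id of the node that produces the value
  int index;    // which output of that node
};

// Toy "clone" gradient rule for z = a + b: since dz/da = dz/db = 1,
// every input is handed the same upstream gradient entry.
std::vector<Entry> clone_gradient(std::size_t input_count,
                                  const std::vector<Entry>& ograds) {
  assert(!ograds.empty());  // at least one output gradient is required
  std::vector<Entry> ret;
  ret.reserve(input_count);
  for (std::size_t i = 0; i < input_count; ++i) {
    ret.push_back(ograds[0]);
  }
  return ret;
}
```

Calling `clone_gradient(2, ograds)` in this toy model returns two entries that refer to the same producing node, so the gradients of a and b are tied to one source.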
   
   For cached_op, if static_alloc is on, it constructs the backward graph from the grad_graph outputs:
   https://github.com/apache/incubator-mxnet/blob/3480ba2c6df02bb907d3a975d354efa8697c4e71/src/imperative/cached_op.cc#L270-L281
   In the case of elemwise_add(a, b), the grad_graph looks like this:
   ![Screen Shot 2021-07-16 at 4 19 40 PM](https://user-images.githubusercontent.com/69359374/126018275-a3d1505f-69ee-43b3-9897-2dbb15428c7b.png)
   The gradient of b is a copy of the gradient of a. As a result, the new graph constructed from the grad_graph diverges between case 1 (a.grad_req = null, b.grad_req = write) and case 2 (a.grad_req = write, b.grad_req = null).
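
For reference, the behaviour one would expect from the two cases can be sketched as a small value-level model. This is an assumption-laden illustration, not MXNet code: `GradReq`, `AddGrads`, and `add_backward` are hypothetical names. The point is that, for z = a + b, whichever input requests a gradient should receive the output gradient unchanged, so the two cases should be symmetric.

```cpp
#include <vector>

// Hypothetical request flags, loosely modelled on grad_req = null / write.
enum class GradReq { kNull, kWrite };

// Gradients for the two inputs of z = a + b; an empty vector means
// "not requested".
struct AddGrads {
  std::vector<float> a;
  std::vector<float> b;
};

// Reference backward pass for elementwise addition: each requested input
// gradient equals the output gradient, independent of the other request.
AddGrads add_backward(const std::vector<float>& ograd,
                      GradReq req_a, GradReq req_b) {
  AddGrads grads;
  if (req_a == GradReq::kWrite) grads.a = ograd;
  if (req_b == GradReq::kWrite) grads.b = ograd;
  return grads;
}
```

In this model, the gradient of b in case 1 and the gradient of a in case 2 are identical for the same output gradient; a divergence between the two cases in the real graph would break that symmetry.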
   
   In my view, the fix for this bug is to change the elemwise_add gradient function to the following:
   ```
   .set_attr<nnvm::FGradient>("FGradient",
     [](const nnvm::ObjectPtr& n, const std::vector<nnvm::NodeEntry>& ograds) {
       // Note: ograds is not referenced here; each gradient is ones_like(input).
       std::vector<nnvm::NodeEntry> ret;
       const size_t input_count = n->inputs.size();
       ret.reserve(input_count);
       // Emit a separate ones_like node for each input of elemwise_add.
       for (size_t i = 0; i < input_count; ++i) {
         ret.emplace_back(MakeNode("ones_like", n->attrs.name + "_grad_ones", {n->inputs[i]}, nullptr, &n));
       }
       return ret;
   });
   ```
   @KexinFeng FYI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


