Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/10/08 21:56:36 UTC

[GitHub] aidan-plenert-macdonald commented on issue #10002: General support of OPs for second-order gradient

URL: https://github.com/apache/incubator-mxnet/issues/10002#issuecomment-427992178
 
 
   @JohnCalhoun I have experience doing things like this. Below is some info I put together when I first looked into it. Unfortunately, I have a lot on my plate at the moment, so I can't be the main contributor.
   
   # Trig Example
   
   To start, let's just look at the [unary trig ops](https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/elemwise_unary_op_trig.cc). If you look at [this line](https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/elemwise_unary_op_trig.cc#L47), you will notice the backward op. Note that it is registered as a dedicated `_backward_sin` op rather than simply referencing the cosine op.
   
   The macro expands through a series of macros starting [here](https://github.com/apache/incubator-mxnet/blob/ebe6ea8be97506dba7d00b5a25da58433e38caae/src/operator/tensor/elemwise_unary_op.h#L525), then into [NNVM here](https://github.com/dmlc/nnvm/blob/master/include/nnvm/op.h#L382). Rough usage instructions are [here](https://github.com/dmlc/nnvm/blob/master/include/nnvm/op.h#L63).
   
   So it would appear that we need to change lines like the one below,
   
   ```
   .set_attr<nnvm::FGradient>("FGradient", ElemwiseGradUseIn{ "_backward_sin" });
   ```
   
   and replace [`ElemwiseGradUseIn`](https://github.com/apache/incubator-mxnet/blob/f9f74169bb05f85d85dec5991aa5fc9050dec9f6/src/operator/elemwise_op_common.h#L186) with something that links the gradients back to back.
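   To illustrate the idea at a very high level, here is a hypothetical sketch (plain Python, not MXNet/NNVM code) of the name-based lookup that `FGradient` registration conceptually performs. The names `GRADIENT_OF` and `set_fgradient` are made up for illustration:
   
   ```python
   # Hypothetical sketch: FGradient conceptually maps an op name to the
   # name of the op that produces its gradient nodes.
   GRADIENT_OF = {}
   
   def set_fgradient(op_name, backward_op_name):
       # Loose analogue of .set_attr<nnvm::FGradient>(...)
       GRADIENT_OF[op_name] = backward_op_name
   
   set_fgradient("sin", "_backward_sin")
   
   # Differentiating twice means following the chain twice; the second
   # lookup fails because _backward_sin never registered a gradient.
   first = GRADIENT_OF.get("sin")    # "_backward_sin"
   second = GRADIENT_OF.get(first)   # None: the chain is broken here
   ```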
   
   In the [following line](https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/elemwise_unary_op_trig.cc#L49), `_backward_sin` itself is registered via macros in the same way,
   
   ```
   MXNET_OPERATOR_REGISTER_BINARY_WITH_SPARSE_CPU_DR(_backward_sin, unary_bwd<mshadow_op::sin_grad>);
   ```
   
   It looks like all of these gradient functors are registered [here](https://github.com/apache/incubator-mxnet/blob/2899715921612ef4dd147004292b5b5d0f83320b/src/operator/mshadow_op.h#L212). Notice that only the forward op registers a gradient,
   
   ```
   // sin
   MXNET_OPERATOR_REGISTER_UNARY_WITH_RSP_CSR(sin, cpu, mshadow_op::sin)
   .describe(R"code(Computes the element-wise sine of the input array.
   The input should be in radians (:math:`2\pi` rad equals 360 degrees).
   .. math::
      sin([0, \pi/4, \pi/2]) = [0, 0.707, 1]
   The storage type of ``sin`` output depends upon the input storage type:
      - sin(default) = default
      - sin(row_sparse) = row_sparse
      - sin(csr) = csr
   )code" ADD_FILELINE)
   .set_attr<nnvm::FGradient>("FGradient", ElemwiseGradUseIn{ "_backward_sin" });
   
   MXNET_OPERATOR_REGISTER_BINARY_WITH_SPARSE_CPU_DR(_backward_sin, unary_bwd<mshadow_op::sin_grad>);
   ```
   
   but `_backward_sin` doesn't register a gradient of its own. This is most likely why we only get one level of gradients.
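   A toy model (hypothetical Python, not MXNet internals) of why differentiation stops after one level. The `Op` class and `differentiate` helper are made up; `grad` stands in for the `FGradient` attribute:
   
   ```python
   import math
   
   class Op:
       """An op plus the op registered as its gradient (stand-in for
       the FGradient attribute)."""
       def __init__(self, name, fn, grad=None):
           self.name, self.fn, self.grad = name, fn, grad
   
   # _backward_sin computes cos(x) but is a dead end: it registers no
   # gradient of its own, mirroring the C++ registration above.
   backward_sin = Op("_backward_sin", math.cos)
   sin = Op("sin", math.sin, grad=backward_sin)
   
   def differentiate(op):
       """Follow the registered gradient; fail where none exists."""
       if op.grad is None:
           raise ValueError(f"{op.name} registers no gradient")
       return op.grad
   
   d1 = differentiate(sin)   # first order: works, gives _backward_sin
   # differentiate(d1)       # second order: raises, nothing registered
   ```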
   
   # Possible Solution
   
   Given this, my naive intuition is that a simple swap of,
   
   ```
   .set_attr<nnvm::FGradient>("FGradient", ElemwiseGradUseIn{ "_backward_sin" });
   ```
   
   for,
   
   ```
   .set_attr<nnvm::FGradient>("FGradient", ElemwiseGradUseIn{ "cos" });
   ```
   
   might give second-order gradients for sin.
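   In the same toy model as above, the swap amounts to expressing gradients as ordinary, themselves-differentiable ops, so the chain no longer dead-ends. (This sketch only illustrates the naming/chaining idea; a real `FGradient` also has to multiply by the incoming head gradient, which `ElemwiseGradUseIn` arranges in MXNet.)
   
   ```python
   import math
   
   class Op:
       """Same hypothetical model: an op plus its registered gradient op."""
       def __init__(self, name, fn, grad=None):
           self.name, self.fn, self.grad = name, fn, grad
   
   # Express gradients as ordinary, differentiable ops:
   # d/dx sin = cos, and d/dx cos = -sin, so the chain continues.
   neg_sin = Op("negative_sin", lambda x: -math.sin(x))
   cos = Op("cos", math.cos, grad=neg_sin)
   sin = Op("sin", math.sin, grad=cos)
   
   d1 = sin.grad              # cos: first-order gradient
   d2 = d1.grad               # negative_sin: second order now reachable
   print(d2.fn(math.pi / 2))  # prints -1.0
   ```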
   
   Sorry if that is confusing. I did similar work on TensorFlow, so I can help out as needed.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services