Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/08/27 18:14:24 UTC

[GitHub] [incubator-mxnet] KexinFeng commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

KexinFeng commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r697632922



##########
File path: tests/python/unittest/test_autograd.py
##########
@@ -519,3 +519,68 @@ def test_gradient():
     dx.backward()
     assert abs(x.grad.asscalar() - 2.71828175) < 1e-7
 
+def test_retain_grad_drop_grad():
+    x = nd.array([1,2,3,4])
+    x.attach_grad()
+    y = nd.array([5,6,7,8])
+    y.attach_grad()
+
+    with mx.autograd.record():
+        u = x * y
+        z = u * x
+
+    u.attach_grad()
+    z.attach_grad()
+    out_grad = nd.array([10, 10, 10, 10])
+    z.backward(out_grad, retain_graph=True)
+    
+    assert (u.grad == out_grad * x).asnumpy().all()
+    assert (z.grad == out_grad).asnumpy().all()
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+    assert (y.grad == out_grad * x*x).asnumpy().all()
+
+    u.drop_grad()
+    z.drop_grad()
+    y.drop_grad()
+    out_grad = nd.array([0.1, 0.1, 0.1, 0.1])
+    z.backward(out_grad)
+
+    assert u.grad is None and z.grad is None and y.grad is None
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+
+def test_retain_grad_drop_grad_gluon():
+    class CompBlock(mx.gluon.HybridBlock):
+        def __init__(self):
+            super().__init__()
+            self.marked_var = None
+        def forward(self, a, b):
+            out1 = a*b
+            out2 = out1 * a
+            self.marked_var = out1
+            return out2
+    x = mx.np.array([1,2,3,4])
+    y = mx.np.array([5,6,7,8])
+    x.attach_grad()
+    y.attach_grad()
+    block2 = CompBlock()
+    block2.initialize()
+    # block2.hybridize()
+    with mx.autograd.record():
+        z = block2(x, y)
+    u = block2.marked_var
+    u.attach_grad()
+    z.attach_grad()
+    z.backward(retain_graph=True)

Review comment:
       Hi Leo,
   Thanks for pointing out the relation between deferred compute and the autograd API. I just committed my first implementation of `attach_grad` in hybridized mode. Indeed, the bulk of the implementation is in the forward pass, i.e. the invocation of `CachedOp`.
   
   The main idea is to use the handle of the deferred ndarray (`out1`) to mark the nonleaf node while it is still a deferred-compute node. Then, when `CachedOp` is invoked, the autograd computation nodes are retained and linked to ndarrays such as `out1`. These ndarrays can subsequently call `attach_grad`, which behaves the same as in imperative mode.
   
   It is true, though, that hybridized computation is not fully compatible with the imperative style of `attach_grad`. This incompatibility may make the Python frontend for this feature awkward to design; that can be addressed next.
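   For reference, a minimal sketch of the intended frontend usage, distilled from the quoted test above (it assumes this PR's nonleaf `attach_grad` support; `CompBlock` and `marked_var` are just the illustrative names used in the test, and whether `hybridize()` can be enabled depends on the implementation):
   
```python
import mxnet as mx

class CompBlock(mx.gluon.HybridBlock):
    def __init__(self):
        super().__init__()
        self.marked_var = None      # handle to an intermediate (nonleaf) ndarray

    def forward(self, a, b):
        out1 = a * b
        out2 = out1 * a
        self.marked_var = out1      # keep the deferred-compute ndarray so it can be marked
        return out2

x = mx.np.array([1, 2, 3, 4])
y = mx.np.array([5, 6, 7, 8])
x.attach_grad()
y.attach_grad()

block = CompBlock()
block.initialize()
# block.hybridize()                # the quoted test leaves hybridize() commented out

with mx.autograd.record():
    z = block(x, y)

u = block.marked_var               # nonleaf ndarray produced inside the block
u.attach_grad()                    # the proposed feature: attach_grad on a nonleaf
z.backward(retain_graph=True)

# With z = (x * y) * x and a head gradient of ones,
# u.grad should equal x, matching the imperative-mode result.
```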




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@mxnet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org