Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2021/08/27 01:33:24 UTC

[GitHub] [incubator-mxnet] KexinFeng opened a new pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

KexinFeng opened a new pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559


   ## Description ##
   This PR adds support for fetching the gradients of intermediate variables in a hybridized Gluon HybridBlock. It applies when `block.hybridize()` is enabled; otherwise, calling the block is a simpler imperative computation to which the original `attach_grad()` can be applied directly, as implemented in [PR#20500](https://github.com/apache/incubator-mxnet/pull/20500).
   
   The motivation for this feature comes from [issue#11865](https://github.com/apache/incubator-mxnet/issues/11865).
   
   ## Checklist ##
   ### Essentials ###
   - [x] PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage
   - [ ] Code is well-documented
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - This feature is built on top of [PR#20500](https://github.com/apache/incubator-mxnet/pull/20500); the modification is mainly in the invocation of the CachedOp computation.
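   - For illustration, here is a minimal sketch of the intended frontend usage, pieced together from the unit tests discussed in this thread. The exact surface (`mark_vars`, `get_mark_vars`, integer mark ids) is still under review in this PR, so the names and signatures below are assumptions rather than the final API.

    ```python
    import mxnet as mx

    class CompBlock(mx.gluon.HybridBlock):
        def forward(self, a, b):
            out1 = a * b
            out2 = out1 * a
            # Mark the intermediate array so a gradient buffer can be
            # attached to it even after hybridization (API under review).
            self.mark_vars(out1)
            return out2

    x = mx.np.array([1, 2, 3, 4])
    y = mx.np.array([5, 6, 7, 8])
    x.attach_grad()
    y.attach_grad()

    block = CompBlock()
    block.initialize()
    block.hybridize()

    with mx.autograd.record():
        z = block(x, y)

    # Retrieve the marked intermediate by mark order (assumed signature)
    # and attach a gradient buffer to it, as in imperative mode.
    u = block.get_mark_vars(0)
    u.attach_grad()
    z.backward(retain_graph=True)
    print(u.grad)  # d(out2)/d(out1) == a
    ```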
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@mxnet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-906853813


   Hey @KexinFeng, thanks for submitting the PR.
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [sanity, windows-gpu, clang, edge, unix-cpu, windows-cpu, unix-gpu, centos-cpu, website, miscellaneous, centos-gpu]
   *** 
   _Note_: 
    Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-908439620


   None of the jobs entered are supported. 
   Jobs entered by user: [linkcheck]
   CI supported Jobs: [edge, centos-gpu, unix-cpu, sanity, clang, centos-cpu, miscellaneous, windows-cpu, windows-gpu, website, unix-gpu]
   





[GitHub] [incubator-mxnet] KexinFeng commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r697707852



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1643,46 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the nleaf nodes and with deferred computation turned off
+
+        Parameters
+        ----------
+        vars: NDArrays or List[NDArrays] with nonempty deferredcomputation_entry_
+        """
+        if not self._active:
+            raise RuntimeError('Hybridize must be active in order to use mark_vars')

Review comment:
       Now both the hybridized and unhybridized modes are unified. This is shown in the last two examples in unittest/test_autograd.py.







[GitHub] [incubator-mxnet] barry-jin commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
barry-jin commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r698622760



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1637,47 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the intermediate nodes
+
+        Parameters
+        ----------
+        vars: NDArrays or List[NDArrays] with nonempty deferredcomputation_entry_

Review comment:
       Could you follow the [example](https://github.com/apache/incubator-mxnet/blob/49c47380fac3c8c8e7ee8eb237000165e22828ef/python/mxnet/numpy/multiarray.py#L2621-L2645) to write the docstring for mark_vars? 
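   For reference, a numpydoc-style docstring in the spirit of that example might look like the sketch below; the wording is illustrative rather than taken from the linked file.

    ```python
    def mark_vars(self, var_arrays):
        """Mark intermediate arrays so that gradients can be attached to them.

        Parameters
        ----------
        var_arrays : NDArray or list of NDArrays
            Intermediate arrays produced inside ``forward`` to be marked.

        Examples
        --------
        >>> class Net(mx.gluon.HybridBlock):
        ...     def forward(self, a, b):
        ...         out1 = a * b
        ...         self.mark_vars(out1)
        ...         return out1 * a
        """
    ```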

##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1637,47 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the intermediate nodes
+
+        Parameters
+        ----------
+        vars: NDArrays or List[NDArrays] with nonempty deferredcomputation_entry_
+        """
+        if not self._active:
+            var_arrays = _as_list(var_arrays)
+            self._nleaf_vars.extend(var_arrays)
+        else:
+            prev_val = dc.set_deferred_compute(False)
+            var_arrays = _as_list(var_arrays)
+            # Prepare ctypes array types
+            import ctypes
+            var_handles_type = ctypes.c_void_p * len(var_arrays)
+            # Convert handles
+            var_handles = var_handles_type(*[arr.handle for arr in var_arrays])
+            check_call(_LIB.MXNDArrayMarkDCVariables(var_handles, len(var_arrays), len(self._nleaf_vars)))
+            self._nleaf_vars.extend(var_arrays)
+            dc.set_deferred_compute(prev_val)
+
+    def get_mark_vars(self, mark_ids):
+        """Retrieve the marked ndarrays according to the order they are marked
+
+        Parameters
+        ----------
+        mark_ids: int or List[int], the order by which the ndarray is marked

Review comment:
       Could you follow the [example](https://github.com/apache/incubator-mxnet/blob/49c47380fac3c8c8e7ee8eb237000165e22828ef/python/mxnet/numpy/multiarray.py#L2621-L2645) to write the docstring for get_mark_vars? 







[GitHub] [incubator-mxnet] KexinFeng commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-907731598


   @mxnet-bot run ci [website]





[GitHub] [incubator-mxnet] KexinFeng commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r701941863



##########
File path: src/imperative/imperative_utils.cc
##########
@@ -166,6 +167,16 @@ void RunGraph(
     if (callback) {
         mxnet::common::ExecuteMonOutputCallback(idx, arrays, i, callback);
     }
+    // set the autograd_entry_ in marked nleafs
+    if (nleafs.size()) {
+      auto it = node.source->attrs.dict.find("mark_id");
+      if (it != node.source->attrs.dict.end()) {
+        int mark_id = std::stoi(it->second);
+        CHECK_LT(mark_id, nleafs.size())
+          << "Mark_id exceeds the nonleaf list size.";
+        nleafs[mark_id]->copy_autograd_entry_(ndoutputs[0]);

Review comment:
       I have added a unit test that covers multiple outputs; it is in the commit 'multiple_outputs' and is also shown below. It works as expected. 
   ```
           def forward(self, a, b, c):
               out1 = self.intermediate(('out1_0', 'out1_1'), ((a+b)*c, a*b), grad_req='write')
               out2 = self.intermediate('out2', out1[1] * a)
               return out2
   ```
   
    In the code commented on above, at line 177, the use of `ndoutputs[0]` indeed assumes that each computation node in the forward graph has only one output. This is consistent with line 152. Based on the test above, this assumption appears to hold for the multiple-output case `((a+b)*c, a*b)`.







[GitHub] [incubator-mxnet] KexinFeng commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r697689732



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1643,46 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the nleaf nodes and with deferred computation turned off

Review comment:
       Got it. Thanks!







[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-907731604


   Jenkins CI successfully triggered: [website]





[GitHub] [incubator-mxnet] leezu commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r697662709



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1643,46 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the nleaf nodes and with deferred computation turned off
+
+        Parameters
+        ----------
+        vars: NDArrays or List[NDArrays] with nonempty deferredcomputation_entry_
+        """
+        if not self._active:
+            raise RuntimeError('Hybridize must be active in order to use mark_vars')

Review comment:
       For a good user experience, it would be great to have an API that supports both hybrid and imperative modes. It doesn't need to use the same backend implementation, but it would be good to have a consistent frontend API. 







[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-909355530


   Jenkins CI successfully triggered: [centos-gpu]





[GitHub] [incubator-mxnet] leezu commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r697558544



##########
File path: tests/python/unittest/test_autograd.py
##########
@@ -519,3 +519,68 @@ def test_gradient():
     dx.backward()
     assert abs(x.grad.asscalar() - 2.71828175) < 1e-7
 
+def test_retain_grad_drop_grad():
+    x = nd.array([1,2,3,4])
+    x.attach_grad()
+    y = nd.array([5,6,7,8])
+    y.attach_grad()
+
+    with mx.autograd.record():
+        u = x * y
+        z = u * x
+
+    u.attach_grad()
+    z.attach_grad()
+    out_grad = nd.array([10, 10, 10, 10])
+    z.backward(out_grad, retain_graph=True)
+    
+    assert (u.grad == out_grad * x).asnumpy().all()
+    assert (z.grad == out_grad).asnumpy().all()
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+    assert (y.grad == out_grad * x*x).asnumpy().all()
+
+    u.drop_grad()
+    z.drop_grad()
+    y.drop_grad()
+    out_grad = nd.array([0.1, 0.1, 0.1, 0.1])
+    z.backward(out_grad)
+
+    assert u.grad is None and z.grad is None and y.grad is None
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+
+def test_retain_grad_drop_grad_gluon():
+    class CompBlock(mx.gluon.HybridBlock):
+        def __init__(self):
+            super().__init__()
+            self.marked_var = None
+        def forward(self, a, b):
+            out1 = a*b
+            out2 = out1 * a
+            self.marked_var = out1
+            return out2
+    x = mx.np.array([1,2,3,4])
+    y = mx.np.array([5,6,7,8])
+    x.attach_grad()
+    y.attach_grad()
+    block2 = CompBlock()
+    block2.initialize()
+    # block2.hybridize()
+    with mx.autograd.record():
+        z = block2(x, y)
+    u = block2.marked_var
+    u.attach_grad()
+    z.attach_grad()
+    z.backward(retain_graph=True)

Review comment:
       Accessing the `marked_var` outside of the Block wouldn't work for hybridized Blocks. That's because the ndarray (`out1`) is just a temporary object used in the forward function for tracing the computation. Afterwards it is discarded and only the underlying symbolic graph is kept.
    
    Thus I think you'll need to support something like `attach_grad` inside of the forward function. WDYT? For that, I'd recommend you take a look at the CachedOp and how it handles gradients. You'll need a way to express that it should return a gradient buffer for the intermediate variable.
    
    Also keep in mind that `autograd` / `attach_grad` and the deferred compute tracing used during hybridization are not supported at the same time currently:
    
    https://github.com/apache/incubator-mxnet/blob/71526851d857fde97660aa79ccdb77f2b1cf963f/src/imperative/imperative.cc#L301-L305







[GitHub] [incubator-mxnet] barry-jin commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
barry-jin commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r698623547



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1637,47 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the intermediate nodes
+
+        Parameters
+        ----------
+        vars: NDArrays or List[NDArrays] with nonempty deferredcomputation_entry_
+        """
+        if not self._active:
+            var_arrays = _as_list(var_arrays)
+            self._nleaf_vars.extend(var_arrays)
+        else:
+            prev_val = dc.set_deferred_compute(False)
+            var_arrays = _as_list(var_arrays)
+            # Prepare ctypes array types
+            import ctypes
+            var_handles_type = ctypes.c_void_p * len(var_arrays)
+            # Convert handles
+            var_handles = var_handles_type(*[arr.handle for arr in var_arrays])
+            check_call(_LIB.MXNDArrayMarkDCVariables(var_handles, len(var_arrays), len(self._nleaf_vars)))
+            self._nleaf_vars.extend(var_arrays)
+            dc.set_deferred_compute(prev_val)
+
+    def get_mark_vars(self, mark_ids):
+        """Retrieve the marked ndarrays according to the order they are marked
+
+        Parameters
+        ----------
+        mark_ids: int or List[int], the order by which the ndarray is marked

Review comment:
       Could you follow the [example](https://github.com/apache/incubator-mxnet/blob/49c47380fac3c8c8e7ee8eb237000165e22828ef/python/mxnet/numpy/multiarray.py#L2621-L2645) to write the docstring for get_mark_vars? 







[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-907542773


   Jenkins CI successfully triggered: [centos-cpu]





[GitHub] [incubator-mxnet] KexinFeng commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r701944084



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1637,50 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the intermediate nodes for autograd computation.
+
+        Parameters
+        ----------
+        vars : ndarrays or List[ndarrays]
+            The marked arrays used in deferredcomputation
+        """
+        if not self._active:
+            var_arrays = _as_list(var_arrays)
+            self._nleaf_vars.extend(var_arrays)
+        else:
+            prev_val = dc.set_deferred_compute(False)
+            var_arrays = _as_list(var_arrays)
+            # Prepare ctypes array types
+            import ctypes
+            var_handles_type = ctypes.c_void_p * len(var_arrays)
+            # Convert handles
+            var_handles = var_handles_type(*[arr.handle for arr in var_arrays])
+            check_call(_LIB.MXNDArrayMarkDCVariables(var_handles, len(var_arrays), len(self._nleaf_vars)))
+            self._nleaf_vars.extend(var_arrays)
+            dc.set_deferred_compute(prev_val)
+
+    def get_mark_vars(self, mark_ids):
+        """Retrieve the marked ndarrays according to the order by which they are marked.
+
+        Parameters
+        ----------
+        mark_ids : int or List[int]
+            The order by which the ndarray is marked.

Review comment:
       Now it is implemented with an `OrderedDict`, so the intermediate variables can be retrieved by name.
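   As an illustration, name-based retrieval under that `OrderedDict` design might look like the sketch below; the `name` keyword and exact signatures are assumptions for illustration, not necessarily the final API.

    ```python
    import mxnet as mx

    class Net(mx.gluon.HybridBlock):
        def forward(self, a, b):
            out1 = a * b
            # Hypothetical name-based marking backed by an OrderedDict.
            self.mark_vars(out1, name='out1')
            return out1 * a

    net = Net()
    net.initialize()
    net.hybridize()
    x = mx.np.array([1., 2.])
    y = mx.np.array([3., 4.])
    with mx.autograd.record():
        z = net(x, y)
    u = net.get_mark_vars('out1')  # retrieve by name instead of mark order
    ```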

##########
File path: tests/python/unittest/test_autograd.py
##########
@@ -519,3 +519,110 @@ def test_gradient():
     dx.backward()
     assert abs(x.grad.asscalar() - 2.71828175) < 1e-7
 
+def test_retain_grad_drop_grad():
+    x = nd.array([1,2,3,4])
+    x.attach_grad()
+    y = nd.array([5,6,7,8])
+    y.attach_grad()
+
+    with mx.autograd.record():
+        u = x * y
+        z = u * x
+
+    u.attach_grad()
+    z.attach_grad()
+    out_grad = nd.array([10, 10, 10, 10])
+    z.backward(out_grad, retain_graph=True)
+
+    assert (u.grad == out_grad * x).asnumpy().all()
+    assert (z.grad == out_grad).asnumpy().all()
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+    assert (y.grad == out_grad * x*x).asnumpy().all()
+
+    u.drop_grad()
+    z.drop_grad()
+    y.drop_grad()
+    out_grad = nd.array([0.1, 0.1, 0.1, 0.1])
+    z.backward(out_grad)
+
+    assert u.grad is None and z.grad is None and y.grad is None
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+
+def test_retain_grad_drop_grad_gluon():
+    class CompBlock(mx.gluon.HybridBlock):
+        def __init__(self):
+            super().__init__()
+
+        def forward(self, a, b):
+            out1 = a*b
+            out2 = out1 * a
+            self.mark_vars(out1)
+            return out2
+
+    x = mx.np.array([1,2,3,4])
+    y = mx.np.array([5,6,7,8])
+    x.attach_grad()
+    y.attach_grad()
+    block2 = CompBlock()
+    block2.initialize()
+    # block2.hybridize()

Review comment:
       Done.







[GitHub] [incubator-mxnet] leezu commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r701074106



##########
File path: tests/python/unittest/test_autograd.py
##########
@@ -519,3 +519,110 @@ def test_gradient():
     dx.backward()
     assert abs(x.grad.asscalar() - 2.71828175) < 1e-7
 
+def test_retain_grad_drop_grad():
+    x = nd.array([1,2,3,4])
+    x.attach_grad()
+    y = nd.array([5,6,7,8])
+    y.attach_grad()
+
+    with mx.autograd.record():
+        u = x * y
+        z = u * x
+
+    u.attach_grad()
+    z.attach_grad()
+    out_grad = nd.array([10, 10, 10, 10])
+    z.backward(out_grad, retain_graph=True)
+
+    assert (u.grad == out_grad * x).asnumpy().all()
+    assert (z.grad == out_grad).asnumpy().all()
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+    assert (y.grad == out_grad * x*x).asnumpy().all()
+
+    u.drop_grad()
+    z.drop_grad()
+    y.drop_grad()
+    out_grad = nd.array([0.1, 0.1, 0.1, 0.1])
+    z.backward(out_grad)
+
+    assert u.grad is None and z.grad is None and y.grad is None
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+
+def test_retain_grad_drop_grad_gluon():
+    class CompBlock(mx.gluon.HybridBlock):
+        def __init__(self):
+            super().__init__()
+
+        def forward(self, a, b):
+            out1 = a*b
+            out2 = out1 * a
+            self.mark_vars(out1)
+            return out2
+
+    x = mx.np.array([1,2,3,4])
+    y = mx.np.array([5,6,7,8])
+    x.attach_grad()
+    y.attach_grad()
+    block2 = CompBlock()
+    block2.initialize()
+    # block2.hybridize()

Review comment:
       Should this test be run both with and without hybridize? You can parameterize the test function https://docs.pytest.org/en/6.2.x/parametrize.html
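   Concretely, that parameterization might look like the following sketch, assuming the rest of the test body stays as in the diff above:

    ```python
    import pytest
    import mxnet as mx

    @pytest.mark.parametrize('hybridize', [False, True])
    def test_retain_grad_drop_grad_gluon(hybridize):
        class CompBlock(mx.gluon.HybridBlock):
            def forward(self, a, b):
                out1 = a * b
                self.mark_vars(out1)
                return out1 * a

        block2 = CompBlock()
        block2.initialize()
        if hybridize:
            block2.hybridize()

        x = mx.np.array([1, 2, 3, 4])
        y = mx.np.array([5, 6, 7, 8])
        x.attach_grad()
        y.attach_grad()
        with mx.autograd.record():
            z = block2(x, y)
        # ...gradient assertions as in the test above...
    ```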

##########
File path: src/imperative/imperative_utils.cc
##########
@@ -166,6 +167,16 @@ void RunGraph(
     if (callback) {
         mxnet::common::ExecuteMonOutputCallback(idx, arrays, i, callback);
     }
+    // set the autograd_entry_ in marked nleafs
+    if (nleafs.size()) {
+      auto it = node.source->attrs.dict.find("mark_id");
+      if (it != node.source->attrs.dict.end()) {
+        int mark_id = std::stoi(it->second);
+        CHECK_LT(mark_id, nleafs.size())
+          << "Mark_id exceeds the nonleaf list size.";
+        nleafs[mark_id]->copy_autograd_entry_(ndoutputs[0]);

Review comment:
       Have you verified this is correct if a marked variable has multiple outputs? I see your test case below only considers variables with a single output:
   
   ```
           def forward(self, a, b):
               out1 = a*b
               out2 = out1 * a
               self.mark_vars(out1)
               return out2
   ```

##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1637,50 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the intermediate nodes for autograd computation.
+
+        Parameters
+        ----------
+        vars : ndarrays or List[ndarrays]
+            The marked arrays used in deferredcomputation
+        """
+        if not self._active:
+            var_arrays = _as_list(var_arrays)
+            self._nleaf_vars.extend(var_arrays)
+        else:
+            prev_val = dc.set_deferred_compute(False)
+            var_arrays = _as_list(var_arrays)
+            # Prepare ctypes array types
+            import ctypes
+            var_handles_type = ctypes.c_void_p * len(var_arrays)
+            # Convert handles
+            var_handles = var_handles_type(*[arr.handle for arr in var_arrays])
+            check_call(_LIB.MXNDArrayMarkDCVariables(var_handles, len(var_arrays), len(self._nleaf_vars)))
+            self._nleaf_vars.extend(var_arrays)
+            dc.set_deferred_compute(prev_val)
+
+    def get_mark_vars(self, mark_ids):
+        """Retrieve the marked ndarrays according to the order by which they are marked.
+
+        Parameters
+        ----------
+        mark_ids : int or List[int]
+            The order by which the ndarray is marked.

Review comment:
       For some networks, the order or total number of marked arrays could differ depending on constructor arguments. Would it be simpler to associate a name with the marked array during `mark_vars()` (e.g. `mark_vars(test_grad=var, test2_grad=var2)`) and retrieve via `get_mark_vars('test_grad', 'test2_grad')`?







[GitHub] [incubator-mxnet] KexinFeng commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r697632922



##########
File path: tests/python/unittest/test_autograd.py
##########
@@ -519,3 +519,68 @@ def test_gradient():
     dx.backward()
     assert abs(x.grad.asscalar() - 2.71828175) < 1e-7
 
+def test_retain_grad_drop_grad():
+    x = nd.array([1,2,3,4])
+    x.attach_grad()
+    y = nd.array([5,6,7,8])
+    y.attach_grad()
+
+    with mx.autograd.record():
+        u = x * y
+        z = u * x
+
+    u.attach_grad()
+    z.attach_grad()
+    out_grad = nd.array([10, 10, 10, 10])
+    z.backward(out_grad, retain_graph=True)
+    
+    assert (u.grad == out_grad * x).asnumpy().all()
+    assert (z.grad == out_grad).asnumpy().all()
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+    assert (y.grad == out_grad * x*x).asnumpy().all()
+
+    u.drop_grad()
+    z.drop_grad()
+    y.drop_grad()
+    out_grad = nd.array([0.1, 0.1, 0.1, 0.1])
+    z.backward(out_grad)
+
+    assert u.grad is None and z.grad is None and y.grad is None
+    assert (x.grad == out_grad * 2 * x * y).asnumpy().all()
+
+def test_retain_grad_drop_grad_gluon():
+    class CompBlock(mx.gluon.HybridBlock):
+        def __init__(self):
+            super().__init__()
+            self.marked_var = None
+        def forward(self, a, b):
+            out1 = a*b
+            out2 = out1 * a
+            self.marked_var = out1
+            return out2
+    x = mx.np.array([1,2,3,4])
+    y = mx.np.array([5,6,7,8])
+    x.attach_grad()
+    y.attach_grad()
+    block2 = CompBlock()
+    block2.initialize()
+    # block2.hybridize()
+    with mx.autograd.record():
+        z = block2(x, y)
+    u = block2.marked_var
+    u.attach_grad()
+    z.attach_grad()
+    z.backward(retain_graph=True)

Review comment:
       Hi Leo,
    Thanks for pointing out the relation between deferred compute and the autograd API. I just committed my first implementation of `attach_grad` in hybridize mode. Indeed, the major part of the implementation is in the forward pass, i.e. the invocation of `CachedOp`.
    
    The main idea is to utilize the handle of the deferred ndarray (`out1`) to mark the nonleaf node while it is still a deferred-compute node. Then, when `CachedOp` is invoked, the autograd computation nodes are retained and linked to ndarrays like `out1`, so these ndarrays can call `attach_grad` just as in imperative mode.
    
    It is true, though, that there is an incompatibility between hybridized computation and the imperative style of `attach_grad`. This incompatibility may cause inconvenience when designing the Python frontend for this feature; that can be addressed next.







[GitHub] [incubator-mxnet] KexinFeng commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-909355444


   @mxnet-bot run ci [centos-gpu]





[GitHub] [incubator-mxnet] KexinFeng commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r698727521



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1637,47 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the intermediate nodes
+
+        Parameters
+        ----------
+        vars: NDArrays or List[NDArrays] with nonempty deferredcomputation_entry_
+        """
+        if not self._active:
+            var_arrays = _as_list(var_arrays)
+            self._nleaf_vars.extend(var_arrays)
+        else:
+            prev_val = dc.set_deferred_compute(False)
+            var_arrays = _as_list(var_arrays)
+            # Prepare ctypes array types
+            import ctypes
+            var_handles_type = ctypes.c_void_p * len(var_arrays)
+            # Convert handles
+            var_handles = var_handles_type(*[arr.handle for arr in var_arrays])
+            check_call(_LIB.MXNDArrayMarkDCVariables(var_handles, len(var_arrays), len(self._nleaf_vars)))
+            self._nleaf_vars.extend(var_arrays)
+            dc.set_deferred_compute(prev_val)
+
+    def get_mark_vars(self, mark_ids):
+        """Retrieve the marked ndarrays according to the order they are marked
+
+        Parameters
+        ----------
+        mark_ids: int or List[int], the order by which the ndarray is marked

Review comment:
       Done.







[GitHub] [incubator-mxnet] leezu commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
leezu commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r697663091



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1643,46 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the nleaf nodes and with deferred computation turned off

Review comment:
       The "with deferred computation turned off" is an implementation detail and it may not need to be in the function summary :)







[GitHub] [incubator-mxnet] KexinFeng commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-907542760


   @mxnet-bot run ci [centos-cpu]





[GitHub] [incubator-mxnet] KexinFeng commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-907704123


   @mxnet-bot run ci [unix-gpu]





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-907704126


   Jenkins CI successfully triggered: [unix-gpu]





[GitHub] [incubator-mxnet] KexinFeng commented on a change in pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on a change in pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#discussion_r698727447



##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1637,47 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the intermediate nodes
+
+        Parameters
+        ----------
+        vars: NDArrays or List[NDArrays] with nonempty deferredcomputation_entry_

Review comment:
       Done.

##########
File path: python/mxnet/gluon/block.py
##########
@@ -1635,6 +1637,47 @@ def reset_ctx(self, ctx):
         for p in params.values():
             p.reset_ctx(ctx)
 
+    def mark_vars(self, var_arrays):
+        """Mark the intermediate nodes
+
+        Parameters
+        ----------
+        vars: NDArrays or List[NDArrays] with nonempty deferredcomputation_entry_
+        """
+        if not self._active:
+            var_arrays = _as_list(var_arrays)
+            self._nleaf_vars.extend(var_arrays)
+        else:
+            prev_val = dc.set_deferred_compute(False)
+            var_arrays = _as_list(var_arrays)
+            # Prepare ctypes array types
+            import ctypes
+            var_handles_type = ctypes.c_void_p * len(var_arrays)
+            # Convert handles
+            var_handles = var_handles_type(*[arr.handle for arr in var_arrays])
+            check_call(_LIB.MXNDArrayMarkDCVariables(var_handles, len(var_arrays), len(self._nleaf_vars)))
+            self._nleaf_vars.extend(var_arrays)
+            dc.set_deferred_compute(prev_val)
+
+    def get_mark_vars(self, mark_ids):
+        """Retrieve the marked ndarrays according to the order they are marked
+
+        Parameters
+        ----------
+        mark_ids: int or List[int], the order by which the ndarray is marked

Review comment:
       Done.







[GitHub] [incubator-mxnet] KexinFeng commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-908439578


   @mxnet-bot run ci [linkcheck]





[GitHub] [incubator-mxnet] KexinFeng commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-907811119


   @mxnet-bot run ci [website]





[GitHub] [incubator-mxnet] KexinFeng commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
KexinFeng commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-912796123


   @mxnet-bot run ci [centos-gpu]





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-912796150


   Jenkins CI successfully triggered: [centos-gpu]





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #20559: [FEATURE] Add feature of attach_grad to nonleaf variables in HybridizedBlock.

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #20559:
URL: https://github.com/apache/incubator-mxnet/pull/20559#issuecomment-907811144


   Jenkins CI successfully triggered: [website]

