You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/04/29 22:02:46 UTC

[GitHub] [incubator-mxnet] D-Roberts opened a new pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

D-Roberts opened a new pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197


   ## Description ##
   This is the 2nd part of the QR backward implementation. The 1st part (merged) covered square and deep matrix shapes (nrows >= ncols) and part 2 now covers the remaining wide matrix shapes (ncols > nrows).
   
   References: 
   [Differential Programming Tensor Networks](https://journals.aps.org/prx/pdf/10.1103/PhysRevX.9.031041)
   [Auto-diff linear algebra](https://arxiv.org/abs/1710.08717)
   
   The added test includes a numerical check (via central differences) of the analytical gradient  since this is a novel implementation. The tests were run offline 1K times to insure against flakiness.
   
   ## Checklist ##
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - [ ] Code is well-documented: 
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] QR back part 2
   - [ ] Tests
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] hzfan commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

hzfan commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-660226033


   Merged into master. Thanks @D-Roberts , @haojin2 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

mxnet-bot commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-621488907


   Hey @D-Roberts , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [miscellaneous, centos-cpu, sanity, windows-gpu, website, centos-gpu, unix-cpu, edge, clang, windows-cpu, unix-gpu]
   *** 
   _Note_: 
    Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] hzfan commented on a change in pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

hzfan commented on a change in pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#discussion_r424862190



##########
File path: src/operator/numpy/linalg/np_qr-inl.h
##########
@@ -514,15 +548,28 @@ struct qr_backward {
 
 template<typename xpu>
 size_t QrBackwardWorkspaceSize(const TBlob& a,
+                               const TBlob& q,
                                const TBlob& r,
                                const TBlob& grad_a) {
+  const mxnet::TShape& a_shape = a.shape_;
+  const int a_ndim = a_shape.ndim();
+  const int n = a.size(a_ndim - 1);
+  const int m = a.size(a_ndim - 2);
+
   if (0U == a.Size()) { return 0U; }
 
   MSHADOW_SGL_DBL_TYPE_SWITCH(grad_a.type_flag_, DType, {
     size_t work_space_size = 0;
-    // for grad a and M
     work_space_size += a.Size();
-    work_space_size += r.Size();
+    if (m >= n) {
+      work_space_size += r.Size();
+    } else {
+      const mxnet::TShape& q_shape = q.shape_;
+      mxnet::TShape v_shape(q_shape);
+      v_shape[a_ndim - 1] = n - m;
+      work_space_size += 5 * q.Size();

Review comment:
       Could you add some documents about what does `5` and `3` here stand for?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-630127514


   @mxnet-bot run ci [unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-638394361


   Any updates on this? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-628641606


   The code follows the idea in the reference [Differential Programming Tensor Networks](https://journals.aps.org/prx/pdf/10.1103/PhysRevX.9.031041). At high level, partition/split the input A into 2 matrices X and Y and R (from A=QR decomposition) into 2 matrices U and V. Then X = QU and get X_grad by applying the gradient derivation from the square input case (m=n) with adjusted Q_grad. Also get Y_grad separately. Then A_grad is the concatenation of X_grad and Y_grad.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-628768250


   @mxnet-bot run ci [centos-cpu, unix-cpu, windows-gpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] leezu commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

leezu commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-661249233


   @D-Roberts the recommendation is to comment with "at mxnet-bot run ci [all]" where at is @


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-629227877


   @mxnet-bot run ci [centos-cpu, unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] szha commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

szha commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-660296403


   @D-Roberts @hzfan could you look into the issue that @ptrendx mentioned? If it can't be fixed in a couple of hours let's revert the change first.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-629670338


   @mxnet-bot run ci [unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-660501841


   @hzfan Thank you for your prompt assistance, I appreciate it.
   
   @leezu @szha @DickJC123 I will resubmit a separate PR. For my future reference - what are your recommendations to avoid the "stale PR" situation? CI passed when first submitted about 3 months ago and I rebased and CI passed about 2 months ago when the PR was reviewed. All along I followed up on the PR every 2-3 weeks or so.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] haojin2 commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

haojin2 commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-622563528


   @D-Roberts Thanks for your contribution! From now and on please @yzhliu for reviews of NumPy-related contributions since I'm moving my gravity from this project.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] hzfan commented on a change in pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

hzfan commented on a change in pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#discussion_r424862190



##########
File path: src/operator/numpy/linalg/np_qr-inl.h
##########
@@ -514,15 +548,28 @@ struct qr_backward {
 
 template<typename xpu>
 size_t QrBackwardWorkspaceSize(const TBlob& a,
+                               const TBlob& q,
                                const TBlob& r,
                                const TBlob& grad_a) {
+  const mxnet::TShape& a_shape = a.shape_;
+  const int a_ndim = a_shape.ndim();
+  const int n = a.size(a_ndim - 1);
+  const int m = a.size(a_ndim - 2);
+
   if (0U == a.Size()) { return 0U; }
 
   MSHADOW_SGL_DBL_TYPE_SWITCH(grad_a.type_flag_, DType, {
     size_t work_space_size = 0;
-    // for grad a and M
     work_space_size += a.Size();
-    work_space_size += r.Size();
+    if (m >= n) {
+      work_space_size += r.Size();
+    } else {
+      const mxnet::TShape& q_shape = q.shape_;
+      mxnet::TShape v_shape(q_shape);
+      v_shape[a_ndim - 1] = n - m;
+      work_space_size += 5 * q.Size();

Review comment:
       Could you add some documents about what does `5` and `3` here stand for? Seem like magic numbers. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] hzfan merged pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

hzfan merged pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-630301323


   @mxnet-bot run ci [unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

mxnet-bot commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-630127568


   Jenkins CI successfully triggered : [unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

mxnet-bot commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-630301376


   Jenkins CI successfully triggered : [unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-660197307


   Hello @yzhliu - are we planning to merge this soon? This particular case of differentiable QR can be useful on batch, in place of LQ, or SVD in recent computer vision research for solving least squares.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on a change in pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on a change in pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#discussion_r425140133



##########
File path: src/operator/numpy/linalg/np_qr-inl.h
##########
@@ -514,15 +548,28 @@ struct qr_backward {
 
 template<typename xpu>
 size_t QrBackwardWorkspaceSize(const TBlob& a,
+                               const TBlob& q,
                                const TBlob& r,
                                const TBlob& grad_a) {
+  const mxnet::TShape& a_shape = a.shape_;
+  const int a_ndim = a_shape.ndim();
+  const int n = a.size(a_ndim - 1);
+  const int m = a.size(a_ndim - 2);
+
   if (0U == a.Size()) { return 0U; }
 
   MSHADOW_SGL_DBL_TYPE_SWITCH(grad_a.type_flag_, DType, {
     size_t work_space_size = 0;
-    // for grad a and M
     work_space_size += a.Size();
-    work_space_size += r.Size();
+    if (m >= n) {
+      work_space_size += r.Size();
+    } else {
+      const mxnet::TShape& q_shape = q.shape_;
+      mxnet::TShape v_shape(q_shape);
+      v_shape[a_ndim - 1] = n - m;
+      work_space_size += 5 * q.Size();

Review comment:
       @hzfan Thank you for your review. Added comments and updated reference in the code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on a change in pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on a change in pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#discussion_r425903506



##########
File path: src/operator/numpy/linalg/np_qr-inl.h
##########
@@ -542,36 +591,119 @@ void QrBackwardImpl(const TBlob& grad_a,
                     const nnvm::NodeAttrs& attrs) {
   Stream<xpu> *s = ctx.get_stream<xpu>();
   const mxnet::TShape& a_shape = a.shape_;
+  const mxnet::TShape& q_shape = q.shape_;
   const mxnet::TShape& r_shape = r.shape_;
   const int a_ndim = a_shape.ndim();
+  const int m = a.size(a_ndim - 2);
   const int n = a.size(a_ndim - 1);
 
   if (kNullOp == req[0]) { return; }
 
   if (0U == a_shape.Size()) { return; }
 
   MSHADOW_SGL_DBL_TYPE_SWITCH(grad_a.type_flag_, DType, {
-    // case m >= n; Q of same shape with A and R is (n, n)
-    DType *m_ptr = reinterpret_cast<DType*>(workspace.dptr_);
-    DType *grad_a_ptr = m_ptr + r_shape.Size();
-    TBlob temp_m(m_ptr, r_shape, xpu::kDevMask);
+    // common for all shapes (m, n)
+    DType *grad_a_ptr = reinterpret_cast<DType*>(workspace.dptr_);
     TBlob grad_a_data(grad_a_ptr, a_shape, xpu::kDevMask);
-    // dR_T
-    mxnet_op::Kernel<QrTypeTransposeHelper, xpu>::Launch(
-      s, r_shape.Size(), grad_r.dptr<DType>(), m_ptr, n, n, n * n);
-
-    qr_backward::op(grad_a_data.FlatToKD<xpu, 3, DType>(s),
-                    grad_q.FlatToKD<xpu, 3, DType>(s),
-                    grad_r.FlatToKD<xpu, 3, DType>(s),
-                    a.FlatToKD<xpu, 3, DType>(s),
-                    q.FlatToKD<xpu, 3, DType>(s),
-                    r.FlatToKD<xpu, 3, DType>(s),
-                    temp_m.FlatToKD<xpu, 3, DType>(s),
-                    ctx, attrs);
-
+    if (m >= n) {
+      // Q of same shape with A (m, n) and R is (n, n)
+      DType *m_ptr = grad_a_ptr + a_shape.Size();
+      TBlob temp_m(m_ptr, r_shape, xpu::kDevMask);
+      // dR_T
+      mxnet_op::Kernel<QrTypeTransposeHelper, xpu>::Launch(
+        s, r_shape.Size(), grad_r.dptr<DType>(), m_ptr, n, n, n * n);
+      qr_backward::op(grad_a_data.FlatToKD<xpu, 3, DType>(s),
+                      grad_q.FlatToKD<xpu, 3, DType>(s),
+                      grad_r.FlatToKD<xpu, 3, DType>(s),
+                      q.FlatToKD<xpu, 3, DType>(s),
+                      r.FlatToKD<xpu, 3, DType>(s),
+                      temp_m.FlatToKD<xpu, 3, DType>(s),
+                      ctx, attrs);
+    } else {
+      // R is same shape with A (m, n) and Q is (m, m)
+      // Partition A = (X | Y); R = (U | V)
+      // X and U are (m, m); Y and V are (m, n - m)
+      mxnet::TShape v_shape(q_shape);
+      v_shape[a_ndim - 1] = n - m;
+
+      DType *m_ptr = grad_a_ptr + a_shape.Size();
+      DType *u_ptr = m_ptr + q_shape.Size();
+      DType *dq_prime_ptr = u_ptr + q_shape.Size();
+      DType *dv_ptr = dq_prime_ptr + q_shape.Size();
+      DType *y_ptr = dv_ptr + v_shape.Size();
+      DType *du_ptr = y_ptr + v_shape.Size();
+      DType *dx_ptr = du_ptr + q_shape.Size();
+      DType *dy_ptr = dx_ptr + q_shape.Size();
+
+      TBlob temp_m(m_ptr, q_shape, xpu::kDevMask);
+      TBlob u_data(u_ptr, q_shape, xpu::kDevMask);
+      TBlob dq_prime_data(dq_prime_ptr, q_shape, xpu::kDevMask);
+      TBlob dv_data(dv_ptr, v_shape, xpu::kDevMask);
+      TBlob y_data(y_ptr, v_shape, xpu::kDevMask);
+      TBlob du_data(du_ptr, q_shape, xpu::kDevMask);
+      TBlob dx_data(dx_ptr, q_shape, xpu::kDevMask);
+      TBlob dy_data(dy_ptr, v_shape, xpu::kDevMask);
+
+      Tensor<xpu, 3, DType> R = r.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dR = grad_r.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> Q = q.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dQ = grad_q.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dQ_prime = dq_prime_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> A = a.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dA = grad_a_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> U = u_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dU = du_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dV = dv_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> Y = y_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dX = dx_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dY = dy_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> M = temp_m.FlatToKD<xpu, 3, DType>(s);
+
+      // U
+      for (index_t i = 0; i < R.size(0); ++i) {
+        const Tensor<xpu, 2, DType>& Ri = R[i];
+        const Tensor<xpu, 2, DType>& Ui = U[i];
+        Tensor<xpu, 2, DType> Um(Ri.dptr_, Shape2(m, m), Ri.stride_, s);
+        Copy(Ui, Um, s);
+      }
+      // dU
+      for (index_t i = 0; i < dR.size(0); ++i) {
+        const Tensor<xpu, 2, DType>& dRi = dR[i];
+        const Tensor<xpu, 2, DType>& dUi = dU[i];
+        Tensor<xpu, 2, DType> dUm(dRi.dptr_, Shape2(m, m), dRi.stride_, s);
+        Copy(dUi, dUm, s);
+      }
+      // Y
+      mxnet_op::Kernel<QrBackHelper_G1, xpu>::Launch(
+        s, A.size(0), m, n, A.dptr_, A.stride_, Y.dptr_, Y.stride_);
+      // dV
+      mxnet_op::Kernel<QrBackHelper_G1, xpu>::Launch(
+        s, dR.size(0), m, n, dR.dptr_, dR.stride_, dV.dptr_, dV.stride_);
+      // store dU_T in M
+      mxnet_op::Kernel<QrTypeTransposeHelper, xpu>::Launch(
+        s, q_shape.Size(), dU.dptr_, m_ptr, m, m, m * m);
+      // dq_prime = dQ
+      Copy(dQ_prime, dQ, s);
+      // dq_prime = dQ+Y@dV.T
+      gemm::op(Y, dV, dQ_prime, DType(1.0), DType(1.0), false, true, s);
+      // dX = op call
+      qr_backward::op(dX,
+                      dQ_prime,
+                      dU,
+                      Q,
+                      U,
+                      M,
+                      ctx, attrs);
+      // dY = Q@dV
+      gemm::op(Q, dV, dY, DType(1.0), DType(0.0), false, false, s);

Review comment:
       Yes, absolutely, I'd missed that one. Done!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] szha commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

szha commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-660592664


   @D-Roberts we will likely need to automate it so that stale CI checks are invalidated. In the meantime, if the PR sits for a long time, feel free to ping me or any other committer to get more attention on it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

mxnet-bot commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-629227942


   Jenkins CI successfully triggered : [centos-cpu, unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] hzfan commented on a change in pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

hzfan commented on a change in pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#discussion_r425852127



##########
File path: src/operator/numpy/linalg/np_qr-inl.h
##########
@@ -542,36 +591,119 @@ void QrBackwardImpl(const TBlob& grad_a,
                     const nnvm::NodeAttrs& attrs) {
   Stream<xpu> *s = ctx.get_stream<xpu>();
   const mxnet::TShape& a_shape = a.shape_;
+  const mxnet::TShape& q_shape = q.shape_;
   const mxnet::TShape& r_shape = r.shape_;
   const int a_ndim = a_shape.ndim();
+  const int m = a.size(a_ndim - 2);
   const int n = a.size(a_ndim - 1);
 
   if (kNullOp == req[0]) { return; }
 
   if (0U == a_shape.Size()) { return; }
 
   MSHADOW_SGL_DBL_TYPE_SWITCH(grad_a.type_flag_, DType, {
-    // case m >= n; Q of same shape with A and R is (n, n)
-    DType *m_ptr = reinterpret_cast<DType*>(workspace.dptr_);
-    DType *grad_a_ptr = m_ptr + r_shape.Size();
-    TBlob temp_m(m_ptr, r_shape, xpu::kDevMask);
+    // common for all shapes (m, n)
+    DType *grad_a_ptr = reinterpret_cast<DType*>(workspace.dptr_);
     TBlob grad_a_data(grad_a_ptr, a_shape, xpu::kDevMask);
-    // dR_T
-    mxnet_op::Kernel<QrTypeTransposeHelper, xpu>::Launch(
-      s, r_shape.Size(), grad_r.dptr<DType>(), m_ptr, n, n, n * n);
-
-    qr_backward::op(grad_a_data.FlatToKD<xpu, 3, DType>(s),
-                    grad_q.FlatToKD<xpu, 3, DType>(s),
-                    grad_r.FlatToKD<xpu, 3, DType>(s),
-                    a.FlatToKD<xpu, 3, DType>(s),
-                    q.FlatToKD<xpu, 3, DType>(s),
-                    r.FlatToKD<xpu, 3, DType>(s),
-                    temp_m.FlatToKD<xpu, 3, DType>(s),
-                    ctx, attrs);
-
+    if (m >= n) {
+      // Q of same shape with A (m, n) and R is (n, n)
+      DType *m_ptr = grad_a_ptr + a_shape.Size();
+      TBlob temp_m(m_ptr, r_shape, xpu::kDevMask);
+      // dR_T
+      mxnet_op::Kernel<QrTypeTransposeHelper, xpu>::Launch(
+        s, r_shape.Size(), grad_r.dptr<DType>(), m_ptr, n, n, n * n);
+      qr_backward::op(grad_a_data.FlatToKD<xpu, 3, DType>(s),
+                      grad_q.FlatToKD<xpu, 3, DType>(s),
+                      grad_r.FlatToKD<xpu, 3, DType>(s),
+                      q.FlatToKD<xpu, 3, DType>(s),
+                      r.FlatToKD<xpu, 3, DType>(s),
+                      temp_m.FlatToKD<xpu, 3, DType>(s),
+                      ctx, attrs);
+    } else {
+      // R is same shape with A (m, n) and Q is (m, m)
+      // Partition A = (X | Y); R = (U | V)
+      // X and U are (m, m); Y and V are (m, n - m)
+      mxnet::TShape v_shape(q_shape);
+      v_shape[a_ndim - 1] = n - m;
+
+      DType *m_ptr = grad_a_ptr + a_shape.Size();
+      DType *u_ptr = m_ptr + q_shape.Size();
+      DType *dq_prime_ptr = u_ptr + q_shape.Size();
+      DType *dv_ptr = dq_prime_ptr + q_shape.Size();
+      DType *y_ptr = dv_ptr + v_shape.Size();
+      DType *du_ptr = y_ptr + v_shape.Size();
+      DType *dx_ptr = du_ptr + q_shape.Size();
+      DType *dy_ptr = dx_ptr + q_shape.Size();
+
+      TBlob temp_m(m_ptr, q_shape, xpu::kDevMask);
+      TBlob u_data(u_ptr, q_shape, xpu::kDevMask);
+      TBlob dq_prime_data(dq_prime_ptr, q_shape, xpu::kDevMask);
+      TBlob dv_data(dv_ptr, v_shape, xpu::kDevMask);
+      TBlob y_data(y_ptr, v_shape, xpu::kDevMask);
+      TBlob du_data(du_ptr, q_shape, xpu::kDevMask);
+      TBlob dx_data(dx_ptr, q_shape, xpu::kDevMask);
+      TBlob dy_data(dy_ptr, v_shape, xpu::kDevMask);
+
+      Tensor<xpu, 3, DType> R = r.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dR = grad_r.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> Q = q.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dQ = grad_q.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dQ_prime = dq_prime_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> A = a.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dA = grad_a_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> U = u_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dU = du_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dV = dv_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> Y = y_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dX = dx_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> dY = dy_data.FlatToKD<xpu, 3, DType>(s);
+      Tensor<xpu, 3, DType> M = temp_m.FlatToKD<xpu, 3, DType>(s);
+
+      // U
+      for (index_t i = 0; i < R.size(0); ++i) {
+        const Tensor<xpu, 2, DType>& Ri = R[i];
+        const Tensor<xpu, 2, DType>& Ui = U[i];
+        Tensor<xpu, 2, DType> Um(Ri.dptr_, Shape2(m, m), Ri.stride_, s);
+        Copy(Ui, Um, s);
+      }
+      // dU
+      for (index_t i = 0; i < dR.size(0); ++i) {
+        const Tensor<xpu, 2, DType>& dRi = dR[i];
+        const Tensor<xpu, 2, DType>& dUi = dU[i];
+        Tensor<xpu, 2, DType> dUm(dRi.dptr_, Shape2(m, m), dRi.stride_, s);
+        Copy(dUi, dUm, s);
+      }
+      // Y
+      mxnet_op::Kernel<QrBackHelper_G1, xpu>::Launch(
+        s, A.size(0), m, n, A.dptr_, A.stride_, Y.dptr_, Y.stride_);
+      // dV
+      mxnet_op::Kernel<QrBackHelper_G1, xpu>::Launch(
+        s, dR.size(0), m, n, dR.dptr_, dR.stride_, dV.dptr_, dV.stride_);
+      // store dU_T in M
+      mxnet_op::Kernel<QrTypeTransposeHelper, xpu>::Launch(
+        s, q_shape.Size(), dU.dptr_, m_ptr, m, m, m * m);
+      // dq_prime = dQ
+      Copy(dQ_prime, dQ, s);
+      // dq_prime = dQ+Y@dV.T
+      gemm::op(Y, dV, dQ_prime, DType(1.0), DType(1.0), false, true, s);
+      // dX = op call
+      qr_backward::op(dX,
+                      dQ_prime,
+                      dU,
+                      Q,
+                      U,
+                      M,
+                      ctx, attrs);
+      // dY = Q@dV
+      gemm::op(Q, dV, dY, DType(1.0), DType(0.0), false, false, s);

Review comment:
       We may reuse `Y` for `dY` here, so the workspace can be reduced.
   
   Note that the use of `Y` ends after the computation of `dq_prime`, so it's safe to rewrite `Y` (it stands for `dY`, but we reuse the space of `Y`) as `Q@dV` here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

mxnet-bot commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-628768324


   Jenkins CI successfully triggered : [unix-cpu, windows-gpu, centos-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] ptrendx commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

ptrendx commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-660278451


   This PR broke master CPU pipelines and blocks PRs (test_np_linalg_qr fails), see e.g. http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fcentos-cpu/detail/master/2093/pipeline 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-632237899


   Hi @yzhliu - is there anything else you'd like me to do on this? tnx


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

mxnet-bot commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-629670357


   Jenkins CI successfully triggered : [unix-cpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-mxnet] D-Roberts commented on pull request #18197: [Numpy] Add qr backward part 2 for wide matrices with m < n

Posted by GitBox <gi...@apache.org>.

D-Roberts commented on pull request #18197:
URL: https://github.com/apache/incubator-mxnet/pull/18197#issuecomment-622562860


   @haojin2 PR ready for review, tnx.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org