Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/07/23 20:41:58 UTC

[GitHub] [incubator-mxnet] Zha0q1 opened a new pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Zha0q1 opened a new pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782


   ## Description ##
   Check if syrk_batch works correctly with large tensors. This test will fail with the current code base; https://github.com/apache/incubator-mxnet/pull/18752 should fix it.
   
   TODO:
   1. make test function naming consistent with other large tensor tests (need to rebase after any of those tests are merged)
   2. merge this only after the fix has been merged
   
   This test passes on both int32 and int64 BLAS builds.
   
   
   ```
   ubuntu@ip-172-31-43-103:~$ MXNET_TEST_COUNT=10000 nosetests --logging-level=DEBUG --verbose -s mxnet/tests/nightly/test_large_array.py:test_linalg_operators
   test_large_array.test_linalg_operators ... [23:14:57] ../src/storage/storage.cc:198: Using Pooled (Naive) StorageManager for CPU
   ok
   
   ----------------------------------------------------------------------
   Ran 1 test in 637.245s
   
   OK
   ```
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments are documented. 
   - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
   - Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#issuecomment-663259991


   > @Zha0q1 can you fix this issue with your PR "This branch is out-of-date with the base branch"
   
   Fixed.





[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #18782: Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#issuecomment-663763443


   @Zha0q1 also add your name to Contributors.md in the upcoming PR [let's not retrigger CI for that]. 
   
   Thanks a lot for your contributions! :)





[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
access2rohit commented on a change in pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#discussion_r460140244



##########
File path: tests/nightly/test_large_array.py
##########
@@ -1791,6 +1792,29 @@ def test_sparse_dot():
     assert out.shape == (2, 2)
 
 
+def test_linalg_operators():
+    def check_syrk_batch():
+        # test both forward and backward
+        # batch syrk will be applied to the last two dimensions
+        A = nd.zeros((2, LARGE_SQ_X, LARGE_SQ_X))
+        for i in range(LARGE_SQ_X):
+            A[0,i,i] = 1
+            A[1,i,i] = 0.1
+        A.attach_grad()
+        with mx.autograd.record():
+            out = nd.linalg.syrk(A, alpha=2, transpose=False)
+        for i in range(LARGE_SQ_X):
+            assert out[0,i,i] == 2
+            assert_almost_equal(out[1,i,i], nd.array([0.02]), rtol=1e-3, atol=1e-5)
+        out.backward()
+        for i in range(LARGE_SQ_X):
+            # check the first row
+            assert A.grad[0,0,i] == 4
+            assert_almost_equal(A.grad[1,0,i], nd.array([0.4]), rtol=1e-3, atol=1e-5)

Review comment:
       Question: Why did this become 0.4 and not 0.04? Or just let me know if this output is consistent with smaller inputs like 2x2 or 3x3.








[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
access2rohit commented on a change in pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#discussion_r460135990



##########
File path: tests/nightly/test_large_array.py
##########
@@ -1791,6 +1792,29 @@ def test_sparse_dot():
     assert out.shape == (2, 2)
 
 
+def test_linalg_operators():
+    def check_syrk_batch():
+        # test both forward and backward
+        # batch syrk will be applied to the last two dimensions
+        A = nd.zeros((2, LARGE_SQ_X, LARGE_SQ_X))
+        for i in range(LARGE_SQ_X):
+            A[0,i,i] = 1
+            A[1,i,i] = 0.1
+        A.attach_grad()
+        with mx.autograd.record():
+            out = nd.linalg.syrk(A, alpha=2, transpose=False)
+        for i in range(LARGE_SQ_X):
+            assert out[0,i,i] == 2
+            assert_almost_equal(out[1,i,i], nd.array([0.02]), rtol=1e-3, atol=1e-5)
+        out.backward()
+        for i in range(LARGE_SQ_X):

Review comment:
       Same







[GitHub] [incubator-mxnet] access2rohit commented on pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
access2rohit commented on pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#issuecomment-663250720


   @Zha0q1 can you fix this issue with your PR "This branch is out-of-date with the base branch"





[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#issuecomment-663221997


   Hey @Zha0q1 , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [windows-cpu, centos-cpu, windows-gpu, sanity, edge, miscellaneous, clang, centos-gpu, website, unix-gpu, unix-cpu]
   *** 
   _Note_: 
   Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   





[GitHub] [incubator-mxnet] ChaiBapchya edited a comment on pull request #18782: Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
ChaiBapchya edited a comment on pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#issuecomment-663763443


   @Zha0q1 also add your name to Contributors.md in the upcoming PR [let's not retrigger CI for that]. 
   https://github.com/apache/incubator-mxnet/blob/master/CONTRIBUTORS.md
   Thanks a lot for your contributions! :)





[GitHub] [incubator-mxnet] ChaiBapchya commented on a change in pull request #18782: Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on a change in pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#discussion_r460319390



##########
File path: tests/nightly/test_large_array.py
##########
@@ -37,7 +37,7 @@
 LARGE_X = 100000000
 SMALL_X = 100
 SMALL_Y = 50
-LARGE_SQ_X = 80000
+LARGE_SQ_X = 70000

Review comment:
       `70000*70000` is just over `2**32` (the ratio `70000*70000/2**32` is about 1.14).
   80k is a lot more.
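
For reference, the arithmetic behind this comment can be sketched as below. This is only an illustration, not part of the PR: the helper names are hypothetical, and the memory estimate assumes 4-byte float32 elements. The test tensor has shape `(2, LARGE_SQ_X, LARGE_SQ_X)`, so its total element count is twice the per-matrix count shown here.

```python
# Element counts for a square (n, n) matrix versus the 32-bit limits.
INT32_MAX = 2**31 - 1   # largest signed 32-bit value
UINT32_RANGE = 2**32    # number of distinct unsigned 32-bit values


def square_elements(n):
    """Number of elements in an n x n matrix."""
    return n * n


def float32_gib(num_elements):
    """Size in GiB, assuming 4-byte float32 elements (an assumption)."""
    return num_elements * 4 / 2**30


# 46340 is the largest n whose square still fits in a signed 32-bit integer;
# 65536 squared is exactly 2**32.
for n in (46340, 65536, 70000, 80000):
    count = square_elements(n)
    print(n, count, count > INT32_MAX, count > UINT32_RANGE,
          round(count / UINT32_RANGE, 2), round(float32_gib(count), 1))
```

Both 70000 and 80000 push a single matrix past `2**32` elements; the smaller value does so with less memory per matrix.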







[GitHub] [incubator-mxnet] access2rohit commented on pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
access2rohit commented on pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#issuecomment-663609422


   A few comments; overall the code is good.





[GitHub] [incubator-mxnet] Zha0q1 commented on a change in pull request #18782: Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on a change in pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#discussion_r460319489



##########
File path: tests/nightly/test_large_array.py
##########
@@ -37,7 +37,7 @@
 LARGE_X = 100000000
 SMALL_X = 100
 SMALL_Y = 50
-LARGE_SQ_X = 80000
+LARGE_SQ_X = 70000

Review comment:
       We figured 70000 is large enough to overflow int32. This is a new constant that was just introduced, so a few of us decided to tweak it to 70000.







[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#issuecomment-663762007


   > Can you just check with smaller input run and let me know the results. That should be good enough .... overall LGTM!
   
   This has been tested with both small and large input tensors





[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
access2rohit commented on a change in pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#discussion_r460135738



##########
File path: tests/nightly/test_large_array.py
##########
@@ -1791,6 +1792,29 @@ def test_sparse_dot():
     assert out.shape == (2, 2)
 
 
+def test_linalg_operators():
+    def check_syrk_batch():
+        # test both forward and backward
+        # batch syrk will be applied to the last two dimensions
+        A = nd.zeros((2, LARGE_SQ_X, LARGE_SQ_X))
+        for i in range(LARGE_SQ_X):
+            A[0,i,i] = 1
+            A[1,i,i] = 0.1
+        A.attach_grad()
+        with mx.autograd.record():
+            out = nd.linalg.syrk(A, alpha=2, transpose=False)
+        for i in range(LARGE_SQ_X):

Review comment:
       You can check in 2 places in (0,y,y) and (1,y,y). No need to check in 70000 locations
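
One way to read this suggestion, as a rough sketch only: check a handful of diagonal positions per batch instead of all 70000, making sure the last one is included so that large flat offsets are still exercised. The helpers `flat_offset` and `spot_check_positions` are hypothetical names, and row-major storage is assumed.

```python
def flat_offset(index, shape):
    """Row-major flat offset of a multi-dimensional index (assumed layout)."""
    offset = 0
    for i, dim in zip(index, shape):
        offset = offset * dim + i
    return offset


def spot_check_positions(n):
    """First, middle and last diagonal positions of an n x n matrix."""
    return sorted({0, n // 2, n - 1})


LARGE_SQ_X = 70000  # value from the diff in this thread
shape = (2, LARGE_SQ_X, LARGE_SQ_X)

for batch in (0, 1):
    for i in spot_check_positions(LARGE_SQ_X):
        off = flat_offset((batch, i, i), shape)
        # The third field shows whether this element lies past the int32 range.
        print((batch, i, i), off, off > 2**31 - 1)
```

In the test itself, the loops over `range(LARGE_SQ_X)` would then iterate over `spot_check_positions(LARGE_SQ_X)` instead.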







[GitHub] [incubator-mxnet] szha merged pull request #18782: Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
szha merged pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782


   





[GitHub] [incubator-mxnet] szha commented on a change in pull request #18782: Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
szha commented on a change in pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#discussion_r460319022



##########
File path: tests/nightly/test_large_array.py
##########
@@ -37,7 +37,7 @@
 LARGE_X = 100000000
 SMALL_X = 100
 SMALL_Y = 50
-LARGE_SQ_X = 80000
+LARGE_SQ_X = 70000

Review comment:
       why?







[GitHub] [incubator-mxnet] Zha0q1 commented on a change in pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on a change in pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#discussion_r460171417



##########
File path: tests/nightly/test_large_array.py
##########
@@ -1791,6 +1792,29 @@ def test_sparse_dot():
     assert out.shape == (2, 2)
 
 
+def test_linalg_operators():
+    def check_syrk_batch():
+        # test both forward and backward
+        # batch syrk will be applied to the last two dimensions
+        A = nd.zeros((2, LARGE_SQ_X, LARGE_SQ_X))
+        for i in range(LARGE_SQ_X):
+            A[0,i,i] = 1
+            A[1,i,i] = 0.1
+        A.attach_grad()
+        with mx.autograd.record():
+            out = nd.linalg.syrk(A, alpha=2, transpose=False)
+        for i in range(LARGE_SQ_X):
+            assert out[0,i,i] == 2
+            assert_almost_equal(out[1,i,i], nd.array([0.02]), rtol=1e-3, atol=1e-5)
+        out.backward()
+        for i in range(LARGE_SQ_X):
+            # check the first row
+            assert A.grad[0,0,i] == 4
+            assert_almost_equal(A.grad[1,0,i], nd.array([0.4]), rtol=1e-3, atol=1e-5)

Review comment:
       I believe this is the correct result; I verified it with a hand-written calculation. It struck me as counter-intuitive too. I am going to dive deeper into matrix gradients when I find time.
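
A small-scale check of the 0.4 value, offered as a sketch rather than the author's actual calculation. It assumes that `syrk` with `transpose=False` computes `alpha * A * A^T` and that `out.backward()` uses an all-ones head gradient, so the scalar being differentiated is the sum of all output entries. Under those assumptions the gradient with respect to `A[p][q]` is `2 * alpha * (sum of column q of A)`, which gives `2 * 2 * 0.1 = 0.4` for `A = 0.1 * I` and `4` for `A = I`. The code uses plain Python lists (no MXNet) and compares that formula with central finite differences on a 3x3 input.

```python
def syrk(A, alpha):
    """alpha * A * A^T for a matrix stored as a list of rows."""
    n, m = len(A), len(A[0])
    return [[alpha * sum(A[i][k] * A[j][k] for k in range(m))
             for j in range(n)] for i in range(n)]


def loss(A, alpha):
    """Sum of all output entries (what an all-ones head gradient implies)."""
    return sum(sum(row) for row in syrk(A, alpha))


def numeric_grad(A, alpha, eps=1e-6):
    """Central finite-difference gradient of loss with respect to A."""
    n, m = len(A), len(A[0])
    grad = [[0.0] * m for _ in range(n)]
    for p in range(n):
        for q in range(m):
            orig = A[p][q]
            A[p][q] = orig + eps
            up = loss(A, alpha)
            A[p][q] = orig - eps
            down = loss(A, alpha)
            A[p][q] = orig  # restore the entry
            grad[p][q] = (up - down) / (2 * eps)
    return grad


def analytic_grad(A, alpha):
    """2 * alpha * (column sum), identical for every row."""
    n, m = len(A), len(A[0])
    col_sums = [sum(A[i][q] for i in range(n)) for q in range(m)]
    return [[2 * alpha * col_sums[q] for q in range(m)] for _ in range(n)]


A = [[0.1 if i == j else 0.0 for j in range(3)] for i in range(3)]
print(syrk(A, 2)[0][0])        # forward value on the diagonal
print(numeric_grad(A, 2)[0])   # first row of the numeric gradient
print(analytic_grad(A, 2)[0])  # first row of the analytic gradient
```

Because every row of the gradient is the same vector of column sums, the off-diagonal entries of the first row are non-zero even though `A` itself is diagonal.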







[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #18782: [WIP] Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
access2rohit commented on a change in pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#discussion_r460189797



##########
File path: tests/nightly/test_large_array.py
##########
@@ -1791,6 +1792,29 @@ def test_sparse_dot():
     assert out.shape == (2, 2)
 
 
+def test_linalg_operators():
+    def check_syrk_batch():
+        # test both forward and backward
+        # batch syrk will be applied to the last two dimensions
+        A = nd.zeros((2, LARGE_SQ_X, LARGE_SQ_X))
+        for i in range(LARGE_SQ_X):
+            A[0,i,i] = 1
+            A[1,i,i] = 0.1
+        A.attach_grad()
+        with mx.autograd.record():
+            out = nd.linalg.syrk(A, alpha=2, transpose=False)
+        for i in range(LARGE_SQ_X):
+            assert out[0,i,i] == 2
+            assert_almost_equal(out[1,i,i], nd.array([0.02]), rtol=1e-3, atol=1e-5)
+        out.backward()
+        for i in range(LARGE_SQ_X):
+            # check the first row
+            assert A.grad[0,0,i] == 4
+            assert_almost_equal(A.grad[1,0,i], nd.array([0.4]), rtol=1e-3, atol=1e-5)

Review comment:
       Can you just check with smaller input run and let me know the results. That should be good enough







[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #18782: Add Large Tensor Test for linalg_syrk

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #18782:
URL: https://github.com/apache/incubator-mxnet/pull/18782#issuecomment-663763640


   > @Zha0q1 also add your name to Contributors.md in the upcoming PR [let's not retrigger CI for that].
   > https://github.com/apache/incubator-mxnet/blob/master/CONTRIBUTORS.md
   > Thanks a lot for your contributions! :)
   
   will do!

