You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/08/28 00:50:48 UTC

[GitHub] [incubator-mxnet] Zha0q1 opened a new pull request #19028: Add size check for numpy ctc loss

Zha0q1 opened a new pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028


   This pr adds large size check for the npx ctc loss operator.
   
   Because of the way the current implementation is written, the operator only works when the total size of the input is < 2^31.  I.e. data is of shape (sequence_length, batch_size, alphabet_size), and the product of the tree dimension must < 2^31.
   
   I tried to only add the 2^31 upper bound to each of the 3 dimensions and change int->index_t wherever those dimensions are multiplied together, but this turned out to be unpractical since there are ~ 200 ints, ~10 different functions, and several control flow branches. With that effort one might as well refactor the whole thing to support int64 dimensions to begin with.
   
   @szha what do you think? 
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] szha merged pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
szha merged pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] szha commented on a change in pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
szha commented on a change in pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#discussion_r478778183



##########
File path: src/operator/nn/ctc_loss-inl.h
##########
@@ -238,6 +238,8 @@ inline bool CTCLossOpShape(const nnvm::NodeAttrs &attrs,
     CHECK_GE(dshape[0], lshape[1]) << "The max number of labels cannot exceed "
                                       "the maximum sequence length of the "
                                       "data.";
+    CHECK_LT(dshape.Size(), INT32_MAX) << "CTC Loss does not support large"

Review comment:
       ```suggestion
       CHECK_LT(dshape.Size(), INT32_MAX) << "ValueError: CTC Loss does not support large"
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682270688


   > thanks. please add a test for the expected exception.
   
   there it is
   https://github.com/apache/incubator-mxnet/blob/6c15d2bad4e60fb8d1ddee49be1bccd61fa22129/tests/nightly/test_np_large_array.py#L913


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] szha commented on a change in pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
szha commented on a change in pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#discussion_r478855702



##########
File path: tests/nightly/test_np_large_array.py
##########
@@ -905,17 +905,27 @@ def batch_check(x, modes, params):
     assert type(out[0]).__name__ == 'ndarray'
 
 @use_np
-@pytest.mark.skip(reason='backward errors out')
 def test_ctc_loss():
-    A = np.ones((2, INT_OVERFLOW, 4))
+    def test_ctc_loss_size_check(A, label):
+        assertRaises(MXNetError, npx.ctc_loss, A, label)

Review comment:
       this should be ValueError




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on a change in pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on a change in pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#discussion_r478863339



##########
File path: tests/nightly/test_np_large_array.py
##########
@@ -905,17 +905,27 @@ def batch_check(x, modes, params):
     assert type(out[0]).__name__ == 'ndarray'
 
 @use_np
-@pytest.mark.skip(reason='backward errors out')
 def test_ctc_loss():
-    A = np.ones((2, INT_OVERFLOW, 4))
+    def test_ctc_loss_size_check(A, label):
+        assertRaises(MXNetError, npx.ctc_loss, A, label)

Review comment:
       Right! I did not know changing the error message actually controls the error type. I have changed it to ValueError.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 removed a comment on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
Zha0q1 removed a comment on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682270688


   > thanks. please add a test for the expected exception.
   
   there it is. Should i make it a separate test?
   https://github.com/apache/incubator-mxnet/blob/6c15d2bad4e60fb8d1ddee49be1bccd61fa22129/tests/nightly/test_np_large_array.py#L913


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682276997


   > thanks. please add a test for the expected exception.
   
   I had it as part of the test; I have pushed a new commit to make it a nested function so now it stands out more


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682263327


   Hey @Zha0q1 , Thanks for submitting the PR 
   All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands: 
   - To trigger all jobs: @mxnet-bot run ci [all] 
   - To trigger specific jobs: @mxnet-bot run ci [job1, job2] 
   *** 
   **CI supported jobs**: [unix-cpu, windows-gpu, sanity, clang, centos-cpu, centos-gpu, windows-cpu, edge, miscellaneous, unix-gpu, website]
   *** 
   _Note_: 
    Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin. 
   All CI tests must pass before the PR can be merged. 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] mxnet-bot commented on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
mxnet-bot commented on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682342035


   Jenkins CI successfully triggered : [unix-gpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682342009


   @mxnet-bot run ci [unix-gpu]


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] szha commented on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
szha commented on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682270241


   thanks. please add a test for the expected exception.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 commented on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
Zha0q1 commented on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682263644


   I updated the test case to use a tensor of size just a hair smaller than 2^31 
   ```
   ubuntu@ip-172-31-38-169:~/incubator-mxnet$ pytest tests/nightly/test_np_large_array.py::test_ctc_loss
   /home/ubuntu/anaconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
     return f(*args, **kwds)
    /home/ubuntu/anaconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
     return f(*args, **kwds)
    /home/ubuntu/anaconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
     return f(*args, **kwds)
   ============================================ test session starts =============================================
   platform linux -- Python 3.7.7, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
   rootdir: /home/ubuntu/incubator-mxnet, inifile: pytest.ini
   plugins: remotedata-0.3.2, openfiles-0.4.0, astropy-header-0.1.2, hypothesis-5.8.3, arraydiff-0.3, doctestplus-0.5.0
   collected 1 item                                                                                             
   
   tests/nightly/test_np_large_array.py .                                                                             [100%]
   
   ============================================== warnings summary ==============================================
   tests/nightly/test_np_large_array.py:89
     /home/ubuntu/incubator-mxnet/tests/nightly/test_np_large_array.py:89: DeprecationWarning: invalid escape sequence \ 
       '''
   
   tests/nightly/test_np_large_array.py:569
     /home/ubuntu/incubator-mxnet/tests/nightly/test_np_large_array.py:569: DeprecationWarning: invalid escape sequence \ 
       '''
   
   -- Docs: https://docs.pytest.org/en/latest/warnings.html
   ================================= 1 passed, 2 warnings in 216.51s (0:03:36) =================================
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-mxnet] Zha0q1 edited a comment on pull request #19028: Add size check for numpy ctc loss

Posted by GitBox <gi...@apache.org>.
Zha0q1 edited a comment on pull request #19028:
URL: https://github.com/apache/incubator-mxnet/pull/19028#issuecomment-682270688


   > thanks. please add a test for the expected exception.
   
   there it is. Should i make it a separate test?
   https://github.com/apache/incubator-mxnet/blob/6c15d2bad4e60fb8d1ddee49be1bccd61fa22129/tests/nightly/test_np_large_array.py#L913


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org