Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/28 22:56:15 UTC

[GitHub] [incubator-mxnet] jonatan1626 opened a new pull request #17462: Updated PartialSortSmallK for LT support

jonatan1626 opened a new pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462
 
 
   ## Description ##
   Updating the PartialSortSmallK function to use sizeof(index_t) instead of sizeof(int) when computing the kernel's shared memory size.
   
   Running BERT with Large Tensor support would fail because of this. After the change, BERT is able to run with Large Tensor support.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] access2rohit commented on issue #17462: Updated PartialSortSmallK for LT support

access2rohit commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-579536913
 
 
   This operator returns indices when the mode of execution is either "both" or "indices". In those cases a temporary workspace is created to store the indices, so I expect an increase in memory when running with the large tensor build. This is expected. @apeforest do you think it's a concern, given that this increase is unavoidable?
   
   IMO: for "both" we can expect a 50% memory increase,
   and for "indices" a 100% memory increase.
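   The estimates above can be spelled out with a little arithmetic. A minimal Python sketch, where the flat value/index buffer layout and the `workspace_bytes` helper are illustrative assumptions rather than MXNet's actual allocator behavior:

```python
# Back-of-the-envelope workspace growth when index storage widens from
# int32 (4 bytes) to int64 (8 bytes). The layout is an illustrative assumption.

def workspace_bytes(n, mode, index_bytes, value_bytes=4):
    """Rough temp-workspace size for a topk-like op over n elements."""
    if mode == "indices":
        return n * index_bytes                  # index buffer only
    if mode == "both":
        return n * (value_bytes + index_bytes)  # value buffer + index buffer
    raise ValueError(mode)

n = 1_000_000
for mode in ("both", "indices"):
    before = workspace_bytes(n, mode, index_bytes=4)  # int
    after = workspace_bytes(n, mode, index_bytes=8)   # index_t, LT build
    print(mode, f"{(after - before) / before:.0%}")   # both: 50%, indices: 100%
```

   With float32 values, doubling the index width adds 50% for "both" (4+4 to 4+8 bytes per element) and 100% for "indices" (4 to 8 bytes per element), matching the estimates above.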


[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

ptrendx commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r381431132
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   Yeah, I don't believe the increase in GPU memory usage that you see comes from this particular change. Shared memory is part of the GPU's L1 cache; it is not counted towards DRAM usage.


[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

ptrendx commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r381099111
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   It is not the GPU's global memory that is used here; it is the amount of shared memory assigned to this kernel. What is the value of nthreads here?


[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r397320540
 
 

 ##########
 File path: tests/nightly/test_large_array.py
 ##########
 @@ -1253,6 +1253,21 @@ def check_topk():
         l = nd.topk(b, k=1, axis=-1, dtype=np.int64, ret_typ="value")
         assert l.sum() == np.sum(np.arange(0, SMALL_Y))
 
+    def check_topk_small():
 
 Review comment:
   Can you add this input use case here instead: https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L4414
   
   This file is for Large Tensor test cases only.


[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

ptrendx commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r381417450
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   Ok, thanks. So the only real limit here is that we can't request more than 48 kB of shared memory for this kernel, and the change from int to index_t should not be problematic here: `256 * (8 + sizeof(DType)) * K <= 256 * (8 + 8) * K = 4096 * K`. K is small (5), so it is nowhere near the limit.
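   That bound can be checked in a few lines. A sketch assuming the default 48 kB per-block dynamic shared memory cap, mirroring the launch expression `nthreads*K*(sizeof(IType)+sizeof(DType))`; the helper name is hypothetical:

```python
SHMEM_LIMIT = 48 * 1024  # default per-block dynamic shared memory cap, bytes

def shared_mem_bytes(nthreads, k, index_bytes, dtype_bytes):
    # Mirrors the kernel launch: nthreads * K * (sizeof(IType) + sizeof(DType))
    return nthreads * k * (index_bytes + dtype_bytes)

K = 5  # small K, as in the comment above
old = shared_mem_bytes(256, K, index_bytes=4, dtype_bytes=8)  # sizeof(int)
new = shared_mem_bytes(256, K, index_bytes=8, dtype_bytes=8)  # sizeof(index_t)
print(old, new, new <= SHMEM_LIMIT)  # 15360 20480 True
```

   Even with the worst-case 8-byte DType, the request is well under the 48 kB cap.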


[GitHub] [incubator-mxnet] apeforest commented on issue #17462: Updated PartialSortSmallK for LT support

apeforest commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-586046568
 
 
   Could we decide the data type at runtime? This operator seems very general and we should try to prevent any memory regression if possible. @ptrendx please review it for the GPU performance.


[GitHub] [incubator-mxnet] JonTanS commented on issue #17462: Updated PartialSortSmallK for LT support

JonTanS commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-589254247
 
 
   > > but they still seem pretty close.
   > 
   > Not sure what you mean by "pretty close". It still looks like 10% memory increase (both average and max) to me. Could you please clarify if I misread the table?
   
   @apeforest Yes, so I think what we are trying to figure out now is whether changing int -> index_t caused any regression. We aren't comparing between LT on and off because there may have been other changes that caused the memory regression.
   
   Looking at the max between the builds with int and index_t without LT enabled, we have 1245 vs 1256.
   
   Looking at the max between LT_int and LT_index_t, we have 1378 vs 1380.
   
   These numbers are pretty similar, which suggests the index_t change does not cause a regression; the increase may instead come from other features of the LT build.
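   For reference, the relative changes implied by the numbers quoted above (plain arithmetic, no MXNet involved):

```python
def rel_change(before, after):
    """Fractional change between two max-memory readings (MB)."""
    return (after - before) / before

no_lt = rel_change(1245, 1256)  # int vs index_t, LT disabled
lt = rel_change(1378, 1380)     # LT_int vs LT_index_t
print(f"{no_lt:.2%} {lt:.2%}")  # 0.88% 0.15%
```

   Both deltas are well under 1%, consistent with the conclusion that the int to index_t change itself is not the source of the ~10% LT-on vs LT-off difference.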


[GitHub] [incubator-mxnet] apeforest commented on issue #17462: Updated PartialSortSmallK for LT support

apeforest commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-589254124
 
 
   Could you please also add a test for this operator in our nightly test?


[GitHub] [incubator-mxnet] JonTanS commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

JonTanS commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r380983585
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   Is it okay if there's a slight memory regression after changing it to index_t? I think @apeforest wanted to decide whether to use int or index_t at runtime.


[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

apeforest commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r381547859
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   @JonTanS could you also move the memory comparison results to the PR description so it's more noticeable to reviewers? Thanks


[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

ptrendx commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r380978304
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   Looking at the implementation of this kernel (which does not depend on your IType template), it should always be index_t here.


[GitHub] [incubator-mxnet] access2rohit commented on issue #17462: Updated PartialSortSmallK for LT support

access2rohit commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-586007972
 
 
   @mxnet-label-bot update [pr-awaiting-merge]


[GitHub] [incubator-mxnet] jonatan1626 commented on issue #17462: Updated PartialSortSmallK for LT support

jonatan1626 commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-580374861
 
 
   @mxnet-label-bot add [pr-awaiting-review]


[GitHub] [incubator-mxnet] apeforest commented on issue #17462: Updated PartialSortSmallK for LT support

apeforest commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-589272828
 
 
   I see. Thanks for the clarification.


[GitHub] [incubator-mxnet] JonTanS commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

JonTanS commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r381402464
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   nthreads seems to be a constant with value 256, defined in mshadow/cuda/tensor_gpu-inl.cuh.


[GitHub] [incubator-mxnet] access2rohit commented on issue #17462: Updated PartialSortSmallK for LT support

access2rohit commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-586007892
 
 
   @apeforest can you review and merge


[GitHub] [incubator-mxnet] JonTanS commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

JonTanS commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r381558595
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   @apeforest sure thing! I'm currently rebuilding mxnet and then will be running the memory usage. I'll move everything to the top once I have the complete results.


[GitHub] [incubator-mxnet] apeforest commented on issue #17462: Updated PartialSortSmallK for LT support

apeforest commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-579512436
 
 
   Could you measure the GPU memory usage before and after this change on a topk operation with the USE_INT64_TENSOR_SIZE compiler flag on?


[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r397406718
 
 

 ##########
 File path: tests/python/unittest/test_operator.py
 ##########
 @@ -4541,6 +4541,22 @@ def get_large_matrix():
                 expected=[gt_topk(dat=a_npy, axis=1, ret_typ="mask", k=3,
                     is_ascend=True)])
 
+    def check_topk_small():
+        LARGE_XX = 1200
+        SMALL_YY = 500
+        ctx = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu()
 
 Review comment:
   Not required!


[GitHub] [incubator-mxnet] jonatan1626 commented on issue #17462: Updated PartialSortSmallK for LT support

jonatan1626 commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-579961169
 
 
   My GPU memory usage results from running the topk operation with and without Large Tensor support:
   
   | Mode | With LT, Avg (MB) | With LT, Max (MB) | Without LT, Avg (MB) | Without LT, Max (MB) |
   | ---- | ----------------- | ----------------- | -------------------- | -------------------- |
   | Both | 1001.66667 | 1410 | 945.333333 | 1264 |
   | Indices | 984 | 1410 | 828.333333 | 1250 |
   | Value | 998.428571 | 1371 | 902.666667 | 1264 |
   
   


[GitHub] [incubator-mxnet] apeforest commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

apeforest commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r381492237
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   @ptrendx If we look at the [memory profiling result](https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-579961169) posted by @JonTanS, there IS a ~10% max memory increase and a ~5% average memory increase caused by this change alone.
   
   @jonatan1626 could you please re-run the profiler to make sure the memory result is accurate?


[GitHub] [incubator-mxnet] JonTanS commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

JonTanS commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r381419681
 
 

 ##########
 File path: src/operator/tensor/ordering_op-inl.h
 ##########
 @@ -362,7 +362,7 @@ MSHADOW_FORCE_INLINE void TopKSort(const Tensor<gpu, 1, DType>& dat,
     }
   } else {
     const int nthreads(mshadow::cuda::kBaseThreadNum);
-    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(int)+sizeof(DType)),
+    PartialSortSmallK<<<M, nthreads, nthreads*K*(sizeof(IType)+sizeof(DType)),
 
 Review comment:
   Ok great! So just changing the `int` to `index_t` should be okay even though we see a slight memory regression when doing so?


[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r397316403
 
 

 ##########
 File path: python/mxnet/test_utils.py
 ##########
 @@ -287,8 +287,8 @@ def assign_each2(input1, input2, function):
     return output
 
 # For testing Large Tensors having total size > 2^32 elements
-def create_2d_tensor(rows, columns, dtype=np.int64):
-    a = mx.nd.arange(0, rows, dtype=dtype).reshape(rows, 1)
+def create_2d_tensor(rows, columns, dtype=np.int64, ctx=None):
 
 Review comment:
   Keep the default context as CPU. If the user wants to use the GPU, they can pass it in.
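   A NumPy stand-in for the helper under discussion, with the reviewer's suggested CPU default; the `"cpu"` placeholder and the broadcast body are assumptions, since only the helper's first line is visible in the diff:

```python
import numpy as np

def create_2d_tensor(rows, columns, dtype=np.int64, ctx=None):
    # Default to CPU when no context is passed, per the review comment.
    ctx = ctx if ctx is not None else "cpu"  # placeholder for mx.cpu()
    a = np.arange(0, rows, dtype=dtype).reshape(rows, 1)
    return np.broadcast_to(a, (rows, columns))  # row i is filled with i

t = create_2d_tensor(4, 3)
print(t.shape, int(t[2, 0]))  # (4, 3) 2
```

   Using `ctx=None` with an in-body default avoids evaluating `mx.cpu()` at import time and lets callers opt into the GPU explicitly.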


[GitHub] [incubator-mxnet] jonatan1626 edited a comment on issue #17462: Updated PartialSortSmallK for LT support

jonatan1626 edited a comment on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-579961169
 
 
   My GPU memory usage results from running the topk operation with and without Large Tensor support:
   
   | Mode | With LT, Avg (MB) | With LT, Max (MB) | Without LT, Avg (MB) | Without LT, Max (MB) |
   | ---- | ----------------- | ----------------- | -------------------- | -------------------- |
   | Both | 1001.66667 | 1410 | 945.333333 | 1264 |
   | Indices | 984 | 1410 | 828.333333 | 1250 |
   | Value | 998.428571 | 1371 | 902.666667 | 1264 |
   
   


[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r397405221
 
 

 ##########
 File path: tests/python/unittest/test_operator.py
 ##########
 @@ -4541,6 +4541,22 @@ def get_large_matrix():
                 expected=[gt_topk(dat=a_npy, axis=1, ret_typ="mask", k=3,
                     is_ascend=True)])
 
+    def check_topk_small():
 
 Review comment:
   Please add this to the existing for loop and use check_symbolic_forward and check_symbolic_backward.


[GitHub] [incubator-mxnet] apeforest commented on issue #17462: Updated PartialSortSmallK for LT support

apeforest commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-589274629
 
 
   Please add a test to nightly/test_large_vector.py. Otherwise LGTM.


[GitHub] [incubator-mxnet] apeforest edited a comment on issue #17462: Updated PartialSortSmallK for LT support

apeforest edited a comment on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-589272828
 
 
   I see. Thanks for the clarification. When the LT compiler flag is not turned on, index_t is actually int32_t, so I don't think you need to compare that case.
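   Concretely, using numpy dtypes as stand-ins for the C++ typedefs (an illustration, not MXNet code):

```python
import numpy as np

# Without USE_INT64_TENSOR_SIZE, index_t is int32_t, so int -> index_t
# changes nothing in that build; the widths only diverge with the flag on.
non_lt_index_t = np.int32
lt_index_t = np.int64
print(np.dtype(non_lt_index_t).itemsize, np.dtype(lt_index_t).itemsize)  # 4 8
```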


[GitHub] [incubator-mxnet] jonatan1626 commented on issue #17462: Updated PartialSortSmallK for LT support

jonatan1626 commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-579510768
 
 
   @access2rohit 



[GitHub] [incubator-mxnet] apeforest commented on issue #17462: Updated PartialSortSmallK for LT support

Posted by GitBox <gi...@apache.org>.
apeforest commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-589240100
 
 
   > but they still seem pretty close.
   
   Not sure what you mean by "pretty close". It still looks like a 10% memory increase (both average and max) to me. Could you please clarify whether I misread the table?


[GitHub] [incubator-mxnet] access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support

Posted by GitBox <gi...@apache.org>.
access2rohit commented on a change in pull request #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#discussion_r397316403
 
 

 ##########
 File path: python/mxnet/test_utils.py
 ##########
 @@ -287,8 +287,8 @@ def assign_each2(input1, input2, function):
     return output
 
 # For testing Large Tensors having total size > 2^32 elements
-def create_2d_tensor(rows, columns, dtype=np.int64):
-    a = mx.nd.arange(0, rows, dtype=dtype).reshape(rows, 1)
+def create_2d_tensor(rows, columns, dtype=np.int64, ctx=None):
 
 Review comment:
   Keep the default context as CPU. If the user wants to use the GPU, they can pass it explicitly.
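
   The suggested pattern, sketched standalone (`resolve_ctx` and the string contexts are hypothetical stand-ins for MXNet's `mx.cpu()` / `mx.gpu(0)` objects, so the pattern can be shown without a GPU build):
   
   ```python
   # Sketch of the reviewer's suggestion: use ctx=None as the sentinel and fall
   # back to the CPU context inside the function, so GPU use is an explicit
   # opt-in by the caller.
   def resolve_ctx(ctx, default="cpu(0)"):
       """Return the caller's context if given, otherwise the CPU default."""
       return default if ctx is None else ctx
   
   print(resolve_ctx(None))      # cpu(0)
   print(resolve_ctx("gpu(0)"))  # gpu(0)
   ```
   
   Using a `None` sentinel rather than evaluating the default context in the signature also avoids touching device state at import time.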


[GitHub] [incubator-mxnet] access2rohit commented on issue #17462: Updated PartialSortSmallK for LT support

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17462: Updated PartialSortSmallK for LT support
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-580373779
 
 
   retriggered both failing jobs!


[GitHub] [incubator-mxnet] JonTanS edited a comment on pull request #17462: Updated PartialSortSmallK for LT support

Posted by GitBox <gi...@apache.org>.
JonTanS edited a comment on pull request #17462:
URL: https://github.com/apache/incubator-mxnet/pull/17462#issuecomment-618596980


   After syncing up with @access2rohit, we found that a test already exists in the current test suite.
   
   There is a test in tests/python/gpu/test_operator_gpu.py, but it does not run in the large tensor nightly tests because the GPU context is not exercised there. This verifies that this change indeed fixes the topk GPU call.
   
   In summary:
   - No change:   LT on - failed; LT off - pass
   - With change: LT on - pass;  LT off - pass
   
   
   Command to Execute:
   MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/python/gpu/test_operator_gpu.py:test_order
   
   **No Change, LT Off**
   ```
   Flags Used:
   [✔ CUDA, ✔ CUDNN, ✔ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✖ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✔ LAPACK, ✔ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER, ✖ DIST_KVSTORE, ✖ CXX14, ✖ INT64_TENSOR_SIZE, ✔ SIGNAL_HANDLER, ✖ DEBUG, ✖ TVM_OP]
   
   MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/python/gpu/test_operator_gpu.py:test_order
   
   (2, 5)
   [INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=548212641 to reproduce.
   test_operator_gpu.test_order ... [DEBUG] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1891020679 to reproduce.
   [18:15:41] ../src/base.cc:84: Upgrade advisory: this mxnet has been built against cuDNN lib version 7501, which is older than the oldest version tested by CI (7600).  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
   ok
   
   ----------------------------------------------------------------------
   Ran 1 test in 8.737s
   
   OK
   ```
   
   
   **No Change, LT On**
   ```
   Flags Used:
   [✔ CUDA, ✔ CUDNN, ✔ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✖ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✔ LAPACK, ✔ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER, ✖ DIST_KVSTORE, ✖ CXX14, ✔ INT64_TENSOR_SIZE, ✔ SIGNAL_HANDLER, ✖ DEBUG, ✖ TVM_OP]
   
   MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/python/gpu/test_operator_gpu.py:test_order
   (2, 5)
   [INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=548212641 to reproduce.
   test_operator_gpu.test_order ... [DEBUG] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1891020679 to reproduce.
   [17:55:51] ../src/base.cc:84: Upgrade advisory: this mxnet has been built against cuDNN lib version 7501, which is older than the oldest version tested by CI (7600).  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
   [INFO] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1891020679 to reproduce.
   ERROR
   
   ======================================================================
   ERROR: test_operator_gpu.test_order
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
       self.test(*self.arg)
     File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/nose/util.py", line 620, in newfunc
       return func(*arg, **kw)
     File "/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/common.py", line 215, in test_new
       orig_test(*args, **kwargs)
     File "/home/ubuntu/incubator-mxnet/tests/python/gpu/../unittest/test_ndarray.py", line 948, in test_order
       nd_ret_topk = mx.nd.topk(large_matrix_nd, axis=1, ret_typ="indices", k=5, is_ascend=False).asnumpy()
     File "/home/ubuntu/incubator-mxnet/python/mxnet/ndarray/ndarray.py", line 2566, in asnumpy
       ctypes.c_size_t(data.size)))
     File "/home/ubuntu/incubator-mxnet/python/mxnet/base.py", line 246, in check_call
       raise get_last_ffi_error()
   mxnet.base.MXNetError: Traceback (most recent call last):
     File "../include/mshadow/./stream_gpu-inl.h", line 81
   CUDA: Check failed: e == cudaSuccess: an illegal memory access was encountered
   -------------------- >> begin captured logging << --------------------
   root: INFO: NumPy-shape semantics has been activated in your code. This is required for creating and manipulating scalar and zero-size tensors, which were not supported in MXNet before, as in the official NumPy library. Please DO NOT manually deactivate this semantics while using `mxnet.numpy` and `mxnet.numpy_extension` modules.
   common: INFO: Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=548212641 to reproduce.
   common: DEBUG: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1891020679 to reproduce.
   common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1891020679 to reproduce.
   --------------------- >> end captured logging << ---------------------
   
   ----------------------------------------------------------------------
   Ran 1 test in 76.899s
   
   FAILED (errors=1)
   ```
   
   
   **Change, LT Off**
   ```
   Flags
   [✔ CUDA, ✔ CUDNN, ✔ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✖ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✔ LAPACK, ✔ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER, ✖ DIST_KVSTORE, ✖ CXX14, ✖ INT64_TENSOR_SIZE, ✔ SIGNAL_HANDLER, ✖ DEBUG, ✖ TVM_OP]
   
   MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/python/gpu/test_operator_gpu.py:test_order
   
   (2, 5)
   [INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=548212641 to reproduce.
   test_operator_gpu.test_order ... [DEBUG] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1891020679 to reproduce.
   [18:28:05] ../src/base.cc:84: Upgrade advisory: this mxnet has been built against cuDNN lib version 7501, which is older than the oldest version tested by CI (7600).  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
   ok
   
   ----------------------------------------------------------------------
   Ran 1 test in 8.652s
   
   OK
   ```
   
   **Change, LT On**
   ```
   [✔ CUDA, ✔ CUDNN, ✔ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✖ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✔ LAPACK, ✔ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER, ✖ DIST_KVSTORE, ✖ CXX14, ✔ INT64_TENSOR_SIZE, ✔ SIGNAL_HANDLER, ✖ DEBUG, ✖ TVM_OP]
   
   MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/python/gpu/test_operator_gpu.py:test_order
   
   (2, 5)
   [INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=548212641 to reproduce.
   test_operator_gpu.test_order ... [DEBUG] Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1891020679 to reproduce.
   [18:42:21] ../src/base.cc:84: Upgrade advisory: this mxnet has been built against cuDNN lib version 7501, which is older than the oldest version tested by CI (7600).  Set MXNET_CUDNN_LIB_CHECKING=0 to quiet this warning.
   ok
   
   ----------------------------------------------------------------------
   Ran 1 test in 9.185s
   
   OK
   ```
   
   
   
   

