Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/03/10 18:45:37 UTC

[GitHub] [incubator-mxnet] access2rohit opened a new pull request #17805: fixing batch_norm and layer_norm for large tensors

access2rohit opened a new pull request #17805: fixing batch_norm and layer_norm for large tensors
URL: https://github.com/apache/incubator-mxnet/pull/17805
 
 
   ## Description ##
    Enables large tensor support for the following ops; a short sketch of the 32-bit truncation being fixed follows the list:
   1. batch_norm
   2. layer_norm
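    
    The failure mode is the same in both ops: shape inference read the axis length into a 32-bit `int`, so an axis of 4,300,000,000 elements silently wraps modulo 2^32 (the fix reads it into `index_t` instead, as the gdb traces below show). A minimal sketch of the truncation, using numpy only to emulate the C++ `int` narrowing:
    
    ```
    import numpy as np
    
    dim = 4_300_000_000  # axis length used by the nightly large-tensor test
    
    # Emulate `const int channelCount = dshape[axis];` narrowing int64 -> int32.
    wrapped = np.int64(dim).astype(np.int32)
    
    print(int(wrapped))  # 5032704 == 4300000000 - 2**32, matching the gdb output below
    ```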
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage:
   - [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
    ### Proof of Correctness ###
   ## layer_norm() ##
   
   Before changes:
   ```
    Thread 1 "python3" hit Breakpoint 1, mxnet::op::LayerNormShape (attrs=..., in_shape=0x555556579dc8, out_shape=0x555556579de0) at src/operator/nn/layer_norm.cc:50
    50	  const int channelCount = dshape[axis];
    (gdb) n
    52	  if (!mxnet::ndim_is_known(dshape)) {
    (gdb) p channelCount
    $3 = 5032704   <--------
    (gdb) p dshape[0]
    $4 = (long &) @0x555556c21f58: 4300000000 <--------
    (gdb) info local
    param = @0x7fffffff9418: {<dmlc::Parameter<mxnet::op::LayerNormParam>> = {<No data fields>}, axis = 0, eps = 9.99999975e-06, output_mean_var = false}
    dshape = @0x555556c21f50: {<mxnet::Tuple<long>> = {static kStackCache = <optimized out>, ndim_ = 1, num_heap_allocated_ = 0, data_stack_ = {4300000000, 0, 0, 0}, data_heap_ = 0x0}, <No data fields>}
    axis = 0
    channelCount = 5032704
    moments_shape = {<mxnet::Tuple<long>> = {static kStackCache = <optimized out>, ndim_ = -29512, num_heap_allocated_ = 32767, data_stack_ = {140737488326480, 140737488325376, 93825019642720, 140737488325376},
        data_heap_ = 0x7fff936c4de7
         <std::_Rb_tree<dmlc::parameter::FieldAccessEntry*, dmlc::parameter::FieldAccessEntry*, std::_Identity<dmlc::parameter::FieldAccessEntry*>, std::less<dmlc::parameter::FieldAccessEntry*>, std::allocator<dmlc::parameter::FieldAccessEntry*> >::_Alloc_node::operator()<dmlc::parameter::FieldAccessEntry* const&>(dmlc::parameter::FieldAccessEntry* const&) const+49>}, <No data fields>}
   ```
    After changes:
   ```
   Thread 1 "python3" hit Breakpoint 2, mxnet::op::LayerNormShape (attrs=..., in_shape=0x555556578ff8, out_shape=0x555556579010) at src/operator/nn/layer_norm.cc:50
   50	  const index_t channelCount = dshape[axis];
   (gdb) n
   52	  if (!mxnet::ndim_is_known(dshape)) {
   (gdb) info local
   param = @0x7fffffff9438: {<dmlc::Parameter<mxnet::op::LayerNormParam>> = {<No data fields>}, axis = 0, eps = 9.99999975e-06, output_mean_var = false}
   dshape = @0x5555565bc420: {<mxnet::Tuple<long>> = {static kStackCache = <optimized out>, ndim_ = 1, num_heap_allocated_ = 0, data_stack_ = {4300000000, 6878235116697514089, 32088647312828786, 0},
       data_heap_ = 0x0}, <No data fields>}
   axis = 0
   channelCount = 4300000000 <--------
   moments_shape = {<mxnet::Tuple<long>> = {static kStackCache = <optimized out>, ndim_ = -29480, num_heap_allocated_ = 32767, data_stack_ = {140737488326512, 140737488325408, 93825021150800, 140737488325408},
       data_heap_ = 0x7fff936c4de7
        <std::_Rb_tree<dmlc::parameter::FieldAccessEntry*, dmlc::parameter::FieldAccessEntry*, std::_Identity<dmlc::parameter::FieldAccessEntry*>, std::less<dmlc::parameter::FieldAccessEntry*>, std::allocator<dmlc::parameter::FieldAccessEntry*> >::_Alloc_node::operator()<dmlc::parameter::FieldAccessEntry* const&>(dmlc::parameter::FieldAccessEntry* const&) const+49>}, <No data fields>}
   (gdb) p dshape[axis]
   $1 = (long &) @0x5555565bc428: 4300000000 <--------
   ```
   
   ## batch_norm() ##
   
   Before changes:
   ```
   Thread 1 "python3" hit Breakpoint 1, mxnet::op::LayerNormShape (attrs=..., in_shape=0x555556579dc8, out_shape=0x555556579de0) at src/operator/nn/layer_norm.cc:50
   50	  const int channelCount = dshape[axis];
   (gdb) n
   52	  if (!mxnet::ndim_is_known(dshape)) {
   (gdb) p channelCount
   $3 = 5032704   <--------
   (gdb) p dshape[0]
   $4 = (long &) @0x555556c21f58: 430000000 <--------
   (gdb) info local
   param = @0x7fffffff9418: {<dmlc::Parameter<mxnet::op::LayerNormParam>> = {<No data fields>}, axis = 0, eps = 9.99999975e-06, output_mean_var = false}
   dshape = @0x555556c21f50: {<mxnet::Tuple<long>> = {static kStackCache = <optimized out>, ndim_ = 1, num_heap_allocated_ = 0, data_stack_ = {4300000000, 0, 0, 0}, data_heap_ = 0x0}, <No data fields>}
   axis = 0
   channelCount = 5032704
   moments_shape = {<mxnet::Tuple<long>> = {static kStackCache = <optimized out>, ndim_ = -29512, num_heap_allocated_ = 32767, data_stack_ = {140737488326480, 140737488325376, 93825019642720, 140737488325376},
       data_heap_ = 0x7fff936c4de7
        <std::_Rb_tree<dmlc::parameter::FieldAccessEntry*, dmlc::parameter::FieldAccessEntry*, std::_Identity<dmlc::parameter::FieldAccessEntry*>, std::less<dmlc::parameter::FieldAccessEntry*>, std::allocator<dmlc::parameter::FieldAccessEntry*> >::_Alloc_node::operator()<dmlc::parameter::FieldAccessEntry* const&>(dmlc::parameter::FieldAccessEntry* const&) const+49>}, <No data fields>}
   ```
   
    After changes:
   
   ```
   Thread 1 "python3" hit Breakpoint 1, mxnet::op::BatchNormShape (attrs=..., in_shape=0x555556579d98, out_shape=0x555556579db0) at src/operator/nn/batch_norm.cc:333
   333	  const index_t channelCount = dshape[channelAxis];
   (gdb) n
   335	  if (!mxnet::ndim_is_known(dshape)) {
   (gdb) info local
   param = @0x555555cdb770: {<dmlc::Parameter<mxnet::op::BatchNormParam>> = {<No data fields>}, eps = 0.0010000000474974513, momentum = 0.899999976, fix_gamma = true, use_global_stats = false, output_mean_var = false, axis = 0,
     cudnn_off = false, min_calib_range = {is_none = true, val = {__data = "\000\000\000", __align = {<No data fields>}}}, max_calib_range = {is_none = true, val = {__data = "UU\000", __align = {<No data fields>}}}}
   dshape = @0x5555572290a0: {<mxnet::Tuple<long>> = {static kStackCache = <optimized out>, ndim_ = 1, num_heap_allocated_ = 0, data_stack_ = {4300000000, 1, 4300000000, 0}, data_heap_ = 0x0}, <No data fields>}
   channelAxis = 0
   channelCount = 4300000000 <--------
   (gdb) p dshape[channelAxis]
   $1 = (long &) @0x5555572290a8: 4300000000  <--------
   ```
   
   
   ## Testing ##
   ```
   $ MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/nightly/test_large_vector.py:test_nn
   /home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
     from ._conv import register_converters as _register_converters
   test_large_vector.test_nn ... [18:14:51] src/executor/graph_executor.cc:1981: Subgraph backend MKLDNN is activated.
   [18:21:14] src/executor/graph_executor.cc:1981: Subgraph backend MKLDNN is activated.
   ok
   
   ----------------------------------------------------------------------
   Ran 1 test in 1017.457s
   
   OK
   ```
   


[GitHub] [incubator-mxnet] access2rohit edited a comment on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test

Posted by GitBox <gi...@apache.org>.
access2rohit edited a comment on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test
URL: https://github.com/apache/incubator-mxnet/pull/17805#issuecomment-597252004
 
 
    @apeforest @ChaiBapchya I don't know enough about layer_norm() or batch_norm() to add suitable shape checks to the tests. I have provided gdb outputs after fixing the code. Can you suggest proper shape tests to add to `test_large_vector` and `test_large_array`?


[GitHub] [incubator-mxnet] apeforest edited a comment on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test

Posted by GitBox <gi...@apache.org>.
apeforest edited a comment on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test
URL: https://github.com/apache/incubator-mxnet/pull/17805#issuecomment-597319830
 
 
    It's very unlikely that the number of channels will be greater than 2^31, so this should not cause a problem in practice. @sxjscience please confirm.
   
    @access2rohit I don't fully understand the gdb output in your description. The traces seem to stop at different places; what do you want us to see?


[GitHub] [incubator-mxnet] apeforest merged pull request #17805: fixing batch_norm and layer_norm for large tensor nightly test

Posted by GitBox <gi...@apache.org>.
apeforest merged pull request #17805: fixing batch_norm and layer_norm for large tensor nightly test
URL: https://github.com/apache/incubator-mxnet/pull/17805
 
 
   


[GitHub] [incubator-mxnet] access2rohit commented on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test
URL: https://github.com/apache/incubator-mxnet/pull/17805#issuecomment-597259149
 
 
   @mxnet-label-bot add [pr-awaiting-review]


[GitHub] [incubator-mxnet] access2rohit commented on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test
URL: https://github.com/apache/incubator-mxnet/pull/17805#issuecomment-599762046
 
 
   @mxnet-label-bot update [pr-awaiting-merge]


[GitHub] [incubator-mxnet] access2rohit commented on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test

Posted by GitBox <gi...@apache.org>.
access2rohit commented on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test
URL: https://github.com/apache/incubator-mxnet/pull/17805#issuecomment-597300764
 
 
   >     1. How is addition of SHAPE_ASSIGN_CHECK to layer_norm causing this failure?
   >        Layer norm/batch norm were passing before and some change caused it to start to fail right? What's that root cause?
   it was incorrect when added check my GDB logs
   
   >     2. Also it turns out - batch norm already has shape check in test_large_array.py
   >        https://github.com/apache/incubator-mxnet/blob/afb8742e6e1e987833b39c487dc892b5537196a1/tests/nightly/test_large_array.py#L327
   > 
   > 
   > Layer norm doesn't have such a check in test_large_array.py. Maybe you could add that.
   Currently I don't have cycles to work on this. I have asked @sxjscience to see if he can add this check. Since I would be occupied for next 2 weeks.
   
   > Fundamentally, For both batch norm and layer norm, since the operation is just to perform normalization over layer/batch, input shape should be equal to output shape.
   
   


[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on issue #17805: fixing batch_norm and layer_norm for large tensor nightly test
URL: https://github.com/apache/incubator-mxnet/pull/17805#issuecomment-597270999
 
 
    1. How is the addition of SHAPE_ASSIGN_CHECK to layer_norm causing this failure?
    Layer norm/batch norm were passing before, and some change caused them to start failing, right? What's the root cause?
    
    2. Also, it turns out that batch norm already has a shape check in test_large_array.py:
    https://github.com/apache/incubator-mxnet/blob/afb8742e6e1e987833b39c487dc892b5537196a1/tests/nightly/test_large_array.py#L327
    
    Layer norm doesn't have such a check in test_large_array.py. Maybe you could add that.
    
    Fundamentally, for both batch norm and layer norm, since the operation just performs normalization over the layer/batch, the input shape should equal the output shape.
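    
    A minimal sketch of such a layer_norm check, in the style of the nightly tests (the vector length and the use of the public `mx.nd.LayerNorm` API here are illustrative assumptions, not the actual test code):
    
    ```
    import mxnet as mx
    
    LARGE_DIM = 4_300_000_000  # > 2**32, same order as the nightly large-vector test
    
    def check_layer_norm_shape():
        # Requires an MXNet build with int64 (large tensor) support.
        data = mx.nd.ones((LARGE_DIM,))
        gamma = mx.nd.ones((LARGE_DIM,))   # per-element scale
        beta = mx.nd.zeros((LARGE_DIM,))   # per-element shift
        out = mx.nd.LayerNorm(data=data, gamma=gamma, beta=beta, axis=0)
        # Normalization preserves layout, so the output shape must equal the input shape.
        assert out.shape == data.shape
    ```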


[GitHub] [incubator-mxnet] ChaiBapchya commented on pull request #17805: fixing batch_norm and layer_norm for large tensor nightly test

Posted by GitBox <gi...@apache.org>.
ChaiBapchya commented on pull request #17805:
URL: https://github.com/apache/incubator-mxnet/pull/17805#issuecomment-695630608


    This needs to be cherry-picked into v1.x.
    Doing it now.

