You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/10/29 17:57:58 UTC
[GitHub] [incubator-mxnet] DickJC123 opened a new issue #16670: cuDNN RNN dtype_with_fallback_ calc needs update

DickJC123 opened a new issue #16670: cuDNN RNN dtype_with_fallback_ calc needs update
URL: https://github.com/apache/incubator-mxnet/issues/16670
 
 
   ## Description
   In src/operator/rnn-inl.h, the calculation of the mathPrec passed to cudnnSetRNNDescriptor_v6() needs review (the variable is `dtype_with_fallback_`).  The current logic is:
   ```
   #if __CUDA_ARCH__ < 530 && CUDNN_VERSION >= 7500
         if (dtype_ == CUDNN_DATA_HALF) {
           dtype_with_fallback_ = CUDNN_DATA_FLOAT;
         } else {
           dtype_with_fallback_ = dtype_;
         }
   #else
           dtype_with_fallback_ = dtype_;
   #endif
   ```
   The use of __CUDA_ARCH__ in determining host-side code behavior is not appropriate (the variable would be undefined in this case and interpreted as 0 in all cases).  Thus the intent of the code (to ensure pseudo-fp16 handling of RNNs for Maxwell GPUs even after the API change of cuDNN 7.5) is being applied to all architectures.
   
   My recommendation would be to adopt pseudo-fp16 for all architectures anyway.  A possible fix:
   ```
       cudnnDataType_t dtype_with_fallback_ =
           (CUDNN_VERSION >= 7500 && dtype_ == CUDNN_DATA_HALF) ? CUDNN_DATA_FLOAT
                                                                                                                     : dtype_;
       
   ```
   Technically, since cuDNN prior to v7.5 ignores the use mathPrec in cudnnSetRNNDescriptor_v6(), the CUDNN_VERSION >= 7500 term could be dropped with no impact.
   
   While we're fixing this, I think we should consider turning on Tensor Cores by default, or at least giving the user a mechanism to do so. 
   
   ## To Reproduce
   To verify my claim about __CUDA_ARCH__, I inserted the following code in rnn-inl.h just before the definition of dtype_with_fallback_:
   ```
   #ifdef __CUDA_ARCH__
         LOG(INFO) << "************* __CUDA_ARCH__ = " << __CUDA_ARCH__;
   #else
         LOG(INFO) << "************* __CUDA_ARCH__ is not defined";
   #endif
   ```
   ```
   $ nosetests --verbose -s tests/python/gpu/test_operator_gpu.py:test_rnn
   ...
   [10:53:00] src/operator/./rnn-inl.h:1388: ************* __CUDA_ARCH__ is not defined
   [10:53:00] src/operator/./rnn-inl.h:1388: ************* __CUDA_ARCH__ is not defined
   ``
   @stu1130  @ptrendx 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services