
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/05/29 09:06:29 UTC

[GitHub] austingg commented on issue #10804: Use depthwise convolution(group convolution) by cuDNNv7 if available

austingg commented on issue #10804: Use depthwise convolution(group convolution) by cuDNNv7 if available
URL: https://github.com/apache/incubator-mxnet/pull/10804#issuecomment-392706348
 
 
   cuDNN has added optimized special paths for grouped convolution, mostly in `cuDNN 7.0.3 and 7.0.4`. From the release notes: `Performance improvements for grouped convolutions when input channels and output channels per group are 1, 2, or 4 for the following algorithms`. [cuDNN release notes](https://docs.nvidia.com/deeplearning/sdk/cudnn-release-notes/rel_704.html#rel_704)
   
   We may want to reference nvidia-caffe's check for when to use the cuDNN v7 grouping path:
   ```c++
   #if CUDNN_VERSION_MIN(7, 0, 2)
     #define CUDNN_GROUPING
   #endif
   #if CUDNN_VERSION_MIN(7, 0, 3)
     #define CUDNN_GROUPING2
   #endif
   
   bool use_v7grouping() const {
   #if defined(CUDNN_GROUPING2)
     // cuDNN >= 7.0.3: input and output channels per group must each be 1, 2, or 4.
     return (this->channels_ == this->group_
          || this->channels_ == this->group_ * 2
          || this->channels_ == this->group_ * 4)
         && (this->num_output_ == this->group_
          || this->num_output_ == this->group_ * 2
          || this->num_output_ == this->group_ * 4);
   #elif defined(CUDNN_GROUPING)
     // cuDNN 7.0.2: only the pure depthwise case (channels == outputs == groups).
     return this->channels_ == this->num_output_ && this->channels_ == this->group_;
   #else
     return false;
   #endif
   }
   ```
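   
   To try the condition outside the layer class, here is a minimal standalone sketch of the same check. The `ConvShape` struct, the `per_group_is_1_2_or_4` helper, and the runtime `cudnn_version` argument are hypothetical names introduced for illustration; nvidia-caffe makes this decision at compile time with the macros above. The version argument assumes the `CUDNN_VERSION` encoding of cuDNN 7 (major * 1000 + minor * 100 + patch).
   
   ```c++
   #include <initializer_list>
   
   // Hypothetical stand-in for the layer members read by use_v7grouping().
   struct ConvShape {
     int channels;    // input channels  (channels_)
     int num_output;  // output channels (num_output_)
     int group;       // number of groups (group_)
   };
   
   // True when total == group * k for some k in {1, 2, 4},
   // i.e. there are 1, 2, or 4 channels per group.
   inline bool per_group_is_1_2_or_4(int total, int group) {
     for (int k : {1, 2, 4}) {
       if (total == group * k) return true;
     }
     return false;
   }
   
   // cudnn_version follows the CUDNN_VERSION encoding, e.g. 7003 for 7.0.3.
   inline bool use_v7grouping(const ConvShape& s, int cudnn_version) {
     if (cudnn_version >= 7003) {
       // Mirrors the CUDNN_GROUPING2 branch.
       return per_group_is_1_2_or_4(s.channels, s.group) &&
              per_group_is_1_2_or_4(s.num_output, s.group);
     }
     if (cudnn_version >= 7002) {
       // Mirrors the CUDNN_GROUPING branch: pure depthwise only.
       return s.channels == s.num_output && s.channels == s.group;
     }
     return false;
   }
   ```
   
   For example, 32 input channels and 64 output channels in 16 groups (2 and 4 per group) would qualify on 7.0.3 or later, but not on 7.0.2.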
   
   On the old path (when this check returns false), it still uses a for-loop over the groups.
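   
   As a rough illustration of that for-loop pattern (not nvidia-caffe's actual code), the sketch below runs one convolution per group on pointer-offset slices of the input, weights, and output. `GroupOffsets`, `ConvFn`, and `grouped_conv_by_loop` are hypothetical names, and the per-group sizes assume an NCHW layout for a single image.
   
   ```c++
   #include <cstddef>
   #include <functional>
   
   // Hypothetical per-group element counts for one image in NCHW layout.
   struct GroupOffsets {
     std::size_t input;   // (channels / group) * in_h * in_w
     std::size_t weight;  // (num_output / group) * (channels / group) * k_h * k_w
     std::size_t output;  // (num_output / group) * out_h * out_w
   };
   
   // Callback that performs an ordinary (ungrouped) convolution on one slice.
   using ConvFn = std::function<void(const float*, const float*, float*)>;
   
   // Old-style grouped convolution: one convolution call per group.
   inline void grouped_conv_by_loop(const float* input, const float* weight,
                                    float* output, int group,
                                    const GroupOffsets& off,
                                    const ConvFn& conv_one_group) {
     for (int g = 0; g < group; ++g) {
       conv_one_group(input + off.input * g,
                      weight + off.weight * g,
                      output + off.output * g);
     }
   }
   ```
   
   With the cuDNN v7 path, this loop is replaced by a single convolution call after setting the group count on the convolution descriptor with `cudnnSetConvolutionGroupCount`.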
