Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/05/29 09:06:29 UTC
[GitHub] austingg commented on issue #10804: Use depthwise convolution(group
convolution) by cuDNNv7 if available
URL: https://github.com/apache/incubator-mxnet/pull/10804#issuecomment-392706348
cuDNN has added optimized paths for some special cases of grouped convolution, mostly in cuDNN 7.0.3 and 7.0.4. From the [cuDNN release notes](https://docs.nvidia.com/deeplearning/sdk/cudnn-release-notes/rel_704.html#rel_704): `Performance improvements for grouped convolutions when input channels and output channels per group are 1, 2, or 4 for the following algorithms`.
We may reference nvidia-caffe's verification:
```c++
// cuDNN >= 7.0.2: native grouped convolution is available.
#if CUDNN_VERSION_MIN(7, 0, 2)
#define CUDNN_GROUPING
#endif
// cuDNN >= 7.0.3: also covers 1, 2, or 4 channels per group.
#if CUDNN_VERSION_MIN(7, 0, 3)
#define CUDNN_GROUPING2
#endif

  bool use_v7grouping() const {
#if defined(CUDNN_GROUPING2)
    // Input and output channels per group must each be 1, 2, or 4.
    return (this->channels_ == this->group_
        || this->channels_ == this->group_ * 2
        || this->channels_ == this->group_ * 4)
        && (this->num_output_ == this->group_
        || this->num_output_ == this->group_ * 2
        || this->num_output_ == this->group_ * 4);
#elif defined(CUDNN_GROUPING)
    // Pure depthwise case only: one input and one output channel per group.
    return this->channels_ == this->num_output_ && this->channels_ == this->group_;
#else
    return false;
#endif
  }
```
For the old path (when this check returns false), it still uses a for-loop over the groups.
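
To make the condition easier to try out in isolation, here is a minimal standalone sketch of the same check as a free function. The names `cudnn_version_number`, `is_1_2_or_4` and `use_v7_grouping` are hypothetical and exist in neither nvidia-caffe nor MXNet. It assumes the cuDNN 7.x version encoding `major * 1000 + minor * 100 + patch` (so 7.0.3 is 7003), and it takes the version as a runtime argument instead of using preprocessor macros.

```cpp
// Hypothetical standalone version of the nvidia-caffe check quoted above.
// Assumption: cuDNN 7.x encodes its version as major*1000 + minor*100 + patch.
constexpr int cudnn_version_number(int major, int minor, int patch) {
  return major * 1000 + minor * 100 + patch;
}

// True when the per-group channel count is one of the special-cased values.
inline bool is_1_2_or_4(int per_group) {
  return per_group == 1 || per_group == 2 || per_group == 4;
}

// Returns true when the native cuDNN grouped-convolution path would be chosen
// under the rule quoted above; false means falling back to the per-group loop.
inline bool use_v7_grouping(int cudnn_version, int channels, int num_output,
                            int group) {
  // Channels must divide evenly into groups for the comparison to make sense.
  if (group <= 0 || channels % group != 0 || num_output % group != 0) {
    return false;
  }
  const int in_per_group = channels / group;
  const int out_per_group = num_output / group;

  if (cudnn_version >= cudnn_version_number(7, 0, 3)) {
    // 7.0.3 and later: 1, 2, or 4 input and output channels per group.
    return is_1_2_or_4(in_per_group) && is_1_2_or_4(out_per_group);
  }
  if (cudnn_version >= cudnn_version_number(7, 0, 2)) {
    // 7.0.2: pure depthwise only (one input and one output channel per group).
    return in_per_group == 1 && out_per_group == 1;
  }
  return false;  // Older cuDNN: no native grouping.
}
```

For example, `use_v7_grouping(7003, 64, 128, 32)` is true (2 input and 4 output channels per group), while the same shape with version 7002 is false and would take the old for-loop path.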