Posted to issues@mxnet.apache.org by "Ni Hui (JIRA)" <ji...@apache.org> on 2018/03/14 05:30:00 UTC
[jira] [Created] (MXNET-97) implement DepthwiseConv2dBackwardFilterKernel from tensorflow codebase
Ni Hui created MXNET-97:
---------------------------
Summary: implement DepthwiseConv2dBackwardFilterKernel from tensorflow codebase
Key: MXNET-97
URL: https://issues.apache.org/jira/browse/MXNET-97
Project: Apache MXNet
Issue Type: Improvement
Reporter: Ni Hui
The current MXNet implementation calls __syncthreads() too often, which makes it extremely slow.
The new code comes from TensorFlow, with the variable names adjusted for consistency.
My model uses depthwise convolution heavily, and its training time per iteration is now over 5x faster on a single P40 GPU (old: 92s, new: 18s).
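
For readers unfamiliar with the operation, here is a minimal CPU reference sketch in plain Python of what a depthwise conv2d backward-filter pass computes for one image. It is not the MXNet or TensorFlow kernel; the function name, argument layout, and loop structure are assumptions made for illustration. The loop is organised per output-gradient element. On a GPU, one plausible mapping (again an assumption, not something stated in this issue) is one thread per such element, with the `+=` below becoming an atomic add, so no block-wide barrier is needed.

```python
def depthwise_conv2d_backward_filter(inp, out_grad, kernel_h, kernel_w,
                                     stride=1, pad=0):
    """Reference filter gradient for a depthwise convolution (one image).

    inp:      nested lists shaped [channels][in_h][in_w]
    out_grad: nested lists shaped [channels][out_h][out_w]
    Returns nested lists shaped [channels][kernel_h][kernel_w].
    All names and the layout are hypothetical, chosen for this sketch.
    """
    channels = len(inp)
    in_h, in_w = len(inp[0]), len(inp[0][0])
    out_h, out_w = len(out_grad[0]), len(out_grad[0][0])

    filter_grad = [[[0.0] * kernel_w for _ in range(kernel_h)]
                   for _ in range(channels)]

    # Each (c, oy, ox) iteration is independent of the others apart from
    # the accumulation into filter_grad.
    for c in range(channels):
        for oy in range(out_h):
            for ox in range(out_w):
                g = out_grad[c][oy][ox]
                for ky in range(kernel_h):
                    iy = oy * stride - pad + ky
                    if iy < 0 or iy >= in_h:
                        continue  # tap falls in the zero padding
                    for kx in range(kernel_w):
                        ix = ox * stride - pad + kx
                        if 0 <= ix < in_w:
                            # Depthwise: channel c only touches its own filter.
                            filter_grad[c][ky][kx] += g * inp[c][iy][ix]
    return filter_grad


if __name__ == "__main__":
    image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # 1 channel, 3x3
    grad = [[[1.0, 1.0], [1.0, 1.0]]]             # 1 channel, 2x2 output
    print(depthwise_conv2d_backward_filter(image, grad, 2, 2))
```

A reference like this is mainly useful for checking a faster GPU kernel against known-good values on small inputs.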
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)