Posted to issues@mxnet.apache.org by "Ni Hui (JIRA)" <ji...@apache.org> on 2018/03/14 05:30:00 UTC

[jira] [Created] (MXNET-97) implement DepthwiseConv2dBackwardFilterKernel from tensorflow codebase

Ni Hui created MXNET-97:
---------------------------

             Summary: implement DepthwiseConv2dBackwardFilterKernel from tensorflow codebase
                 Key: MXNET-97
                 URL: https://issues.apache.org/jira/browse/MXNET-97
             Project: Apache MXNet
          Issue Type: Improvement
            Reporter: Ni Hui


The current MXNet implementation calls __syncthreads() too often, which makes it extremely slow.
The new code comes from the TensorFlow codebase, with variable names adjusted for consistency.

My model uses depthwise convolution heavily, and training is now over 5x faster per iteration on a single P40 GPU (old: 92 s, new: 18 s).
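
For reference, the quantity a depthwise backward-filter kernel has to produce can be sketched on the CPU as below. This is only an illustrative sketch, not the MXNet or TensorFlow code: the function name, the single-sample [C][H][W] layout, and the stride/pad parameters are assumptions made for the example. Every output-gradient element contributes to each filter tap of its own channel, so a GPU kernel has to combine many partial products per tap, and how it coordinates threads while doing so determines how much synchronization it needs.

```python
def depthwise_conv2d_backward_filter(inp, out_grad, kernel_h, kernel_w,
                                     stride=1, pad=0):
    """Reference filter gradient for a depthwise convolution (one sample).

    inp:      nested lists shaped [C][H][W]    (hypothetical layout)
    out_grad: nested lists shaped [C][OH][OW]
    returns:  nested lists shaped [C][kernel_h][kernel_w]
    """
    channels = len(inp)
    in_h, in_w = len(inp[0]), len(inp[0][0])
    out_h, out_w = len(out_grad[0]), len(out_grad[0][0])

    filter_grad = [[[0.0] * kernel_w for _ in range(kernel_h)]
                   for _ in range(channels)]

    for c in range(channels):          # depthwise: channels never mix
        for oh in range(out_h):
            for ow in range(out_w):
                g = out_grad[c][oh][ow]
                for kh in range(kernel_h):
                    ih = oh * stride - pad + kh
                    if ih < 0 or ih >= in_h:
                        continue       # tap falls in the zero padding
                    for kw in range(kernel_w):
                        iw = ow * stride - pad + kw
                        if 0 <= iw < in_w:
                            # accumulate input * output-gradient per tap
                            filter_grad[c][kh][kw] += inp[c][ih][iw] * g
    return filter_grad


if __name__ == "__main__":
    x = [[[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0]]]            # 1 channel, 3x3 input
    dy = [[[1.0, 1.0],
           [1.0, 1.0]]]                # 1 channel, 2x2 output gradient
    print(depthwise_conv2d_backward_filter(x, dy, 2, 2))
```

A reference like this can be used to check a GPU kernel's result on small inputs before comparing timings.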

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
