Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/06/15 07:25:32 UTC
[GitHub] ZhennanQin opened a new pull request #11301: MKLDNN Backward op cache
URL: https://github.com/apache/incubator-mxnet/pull/11301
Hi all,
For MKLDNN, creating an operator primitive and its corresponding memory takes considerable time. In many scenarios, a previously created primitive can be reused if it meets the requirements of the next computation. In this PR, we implement a caching mechanism for most backward operators, mirroring the optimization already done for forward operators. Correctness and accuracy tests all PASS, and performance tests show that most models get a training speedup. Please review, thanks.
@zheng-da, @marcoabreu, @azai91, @TaoLv, @pengzhao-intel
This PR covers below commits.
Enable primitive allocation cache for _backward_LRN
Enable primitive allocation cache for _backward_Pooling.
Enable primitive allocation cache for _backward_Activation.
Enable primitive allocation cache for _backward_Deconvolution.
Enable primitive allocation cache for _backward_BatchNorm.
Enable primitive allocation cache for _backward_FullyConnected.
Enable primitive allocation cache for _backward_Convolution.
**Correctness test result:**
unit_test - PASS
Training models with dummy data with MXNET_MKLDNN_DEBUG=1:
model | Status
-- | --
alexnet | PASS
googlenet | PASS
vgg-16 | PASS
vgg-19 | PASS
inception-bn | PASS
inception-v3 | PASS
resnet-50 | PASS
resnet-152 | PASS
**Accuracy test result:**
![1](https://user-images.githubusercontent.com/39290748/41454406-de4f2556-70ab-11e8-84c5-cfbeab0a9e60.png)
CIFAR-10 + ResNet-50 (convergence): validation accuracy over 99 epochs (top1: 67.88%, top5: 96.26%)
![2](https://user-images.githubusercontent.com/39290748/41454419-ecc3dcd0-70ab-11e8-8417-b8eee32f9ec9.png)
CIFAR-10 + VGG-16 (convergence): validation accuracy over 74 epochs (top1: 82.56%, top5: 98.52%)
**Performance test result:**
**Skylake 8180, 2 sockets, training, BS=32 (img/s)**
Model | Baseline | After Caching Backward Op | Diff%
-- | -- | -- | --
alexnet | 366.0699 | 412.3794 | 112.65%
googlenet | 104.3128 | 152.8598 | 146.54%
vgg-16 | 38.40072 | 39.65964 | 103.28%
vgg-19 | 32.35834 | 33.80966 | 104.49%
inception-bn | 85.71482 | 124.4676 | 145.21%
inception-v3 | 43.0482 | 57.82697 | 134.33%
resnet-50 | 53.15957 | 65.68513 | 123.56%
resnet-152 | 21.67982 | 28.26535 | 130.38%
inception-v4 | 23.82474 | 29.33314 | 123.12%
**Total time consumption breakdown (ms)**
Model | Baseline | After caching Backward Op | Diff%
-- | -- | -- | --
Convolution | 19440.11 | 19441.49 | 100.01%
Activation | 11253.48 | 9493.82 | 84.36%
LRN | 43.325 | 44.281 | 102.21%
Pooling | 1002.97 | 1037.459 | 103.44%
Flatten | 36.84 | 37.138 | 100.81%
FullyConnected | 1135.342 | 1063.388 | 93.66%
Dropout | 210.459 | 207.743 | 98.71%
SoftmaxOutput | 6.944 | 6.428 | 92.57%
_backward_SoftmaxOutput | 3.627 | 4.424 | 121.97%
_backward_FullyConnected | 1792.938 | 1806.433 | 100.75%
_backward_Dropout | 3.977 | 3.586 | 90.17%
_backward_Activation | 11083.87 | 9536.887 | 86.04%
_zeros | 184.517 | 175.302 | 95.01%
_backward_copy | 1.076 | 0.917 | 85.22%
_backward_Pooling | 2291.866 | 2056.797 | 89.74%
_backward_Convolution | 63134.26 | 46223.86 | 73.22%
_backward_LRN | 185.551 | 130.467 | 70.31%
Concat | 594.403 | 618.61 | 104.07%
_backward_Concat | 703.62 | 708.137 | 100.64%
add_n | 2501.635 | 2556.075 | 102.18%
BatchNorm | 5815.348 | 5797.396 | 99.69%
_backward_BatchNorm | 10238.95 | 7818.107 | 76.36%
_copy | 339.458 | 351.205 | 103.46%
elemwise_add | 909.67 | 925.058 | 101.69%
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services