Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/05/31 09:01:36 UTC

[GitHub] [incubator-mxnet] xianyujie commented on issue #15108: The test time of the model on GPU is normal, but the test time on CPU is very long.

URL: https://github.com/apache/incubator-mxnet/issues/15108#issuecomment-497634604
 
 
   Here are the computation times I measured at each output layer for the two models, my_mod and r100_mod.
   Test on CPU:
   > **output_name: my_mod, r100_mod**
   id_output: 					0.00042,   	0.0005
   _minusscalar0_output: 		0.000355,   0.000364
   _mulscalar0_output: 		0.000345,   0.000368
   conv0_output: 				0.004286,   0.005467
   bn0_output: 				0.004573,   0.004223
   relu0_output: 				0.005896,   0.011879
   stage1_unit1_bn1_output: 	0.009214,   0.012385
   stage1_unit1_conv1_output: 	0.030702,   0.033954
   stage1_unit1_bn2_output: 	0.029785,   0.030585
   stage1_unit1_relu1_output: 	0.031125,   0.027422
   stage1_unit1_conv2_output: 	0.103826,   0.030677
   stage1_unit1_bn3_output: 	0.10424,   	0.030824
   stage1_unit1_conv1sc_output: 0.007101,   0.017014
   stage1_unit1_sc_output: 	0.006566,   0.01673
   _plus0_output: 				0.107347,   0.040394
   stage1_unit2_bn1_output: 	0.107868,   0.041233
   stage1_unit2_conv1_output: 	0.118626,   0.065909
   stage1_unit2_bn2_output: 	0.111833,   0.045059
   stage1_unit2_relu1_output: 	0.137247,   0.068904
   stage1_unit2_conv2_output: 	0.191567,   0.064852
   stage1_unit2_bn3_output: 	0.173118,   0.060588
   _plus1_output: 				0.156072,   0.04732
   stage1_unit3_bn1_output: 	0.155305,   0.047213
   stage1_unit3_conv1_output: 	0.159347,   0.049804
   stage1_unit3_bn2_output: 	0.164846,   0.066085
   stage1_unit3_relu1_output: 	0.1657,   	0.071379
   stage1_unit3_conv2_output: 	0.325648,   0.052968
   stage1_unit3_bn3_output: 	0.37007,   	0.05749
   _plus2_output: 				0.33891,   	0.071809
   stage2_unit1_bn1_output: 	0.328169,   0.066442
   stage2_unit1_conv1_output: 	0.322733,   0.056022
   stage2_unit1_bn2_output: 	0.325429,   0.066875
   stage2_unit1_relu1_output: 	0.326984,   0.075964
   stage2_unit1_conv2_output: 	0.490685,   0.06886
   stage2_unit1_bn3_output: 	0.494744,   0.070616
   stage2_unit1_conv1sc_output: 0.313479,   0.050747
   stage2_unit1_sc_output: 	0.313801,   0.051451
   _plus3_output: 				0.488039,   0.057655
   stage2_unit2_bn1_output: 	0.560673,   0.078912
   stage2_unit2_conv1_output: 	0.551832,   0.080624
   stage2_unit2_bn2_output: 	0.523579,   0.069424
   stage2_unit2_relu1_output: 	0.498217,   0.07407
   stage2_unit2_conv2_output: 	0.584102,   0.074921
   stage2_unit2_bn3_output: 	0.58085,   	0.074009
   _plus4_output: 				0.581565,   0.072269
   stage2_unit3_bn1_output: 	0.587932,   0.079426
   stage2_unit3_conv1_output: 	0.584251,   0.06408
   stage2_unit3_bn2_output: 	0.592011,   0.085435
   stage2_unit3_relu1_output: 	0.58509,   	0.07392
   stage2_unit3_conv2_output: 	0.624394,   0.081102
   stage2_unit3_bn3_output: 	0.627502,   0.078248
   _plus5_output: 				0.624716,   0.082667
   stage2_unit4_bn1_output: 	0.62572,   	0.079881
   stage2_unit4_conv1_output: 	0.671533,   0.104247
   stage2_unit4_bn2_output: 	0.664362,   0.090178
   stage2_unit4_relu1_output: 	0.668143,   0.093252
   stage2_unit4_conv2_output: 	0.831109,   0.093747
   stage2_unit4_bn3_output: 	0.794402,   0.091997
   _plus6_output: 				0.796238,   0.084438
   stage2_unit5_bn1_output: 	0.803676,   0.087099
   stage2_unit5_conv1_output: 	0.798894,   0.088101
   stage2_unit5_bn2_output: 	0.801456,   0.094359
   stage2_unit5_relu1_output: 	0.799399,   0.085508
   stage2_unit5_conv2_output: 	0.972368,   0.099089
   stage2_unit5_bn3_output: 	0.973519,   0.091875
   _plus7_output: 				0.974544,   0.100297
   stage2_unit6_bn1_output: 	0.974595,   0.094085
   stage2_unit6_conv1_output: 	0.975609,   0.104353
   stage2_unit6_bn2_output: 	0.973079,   0.09231
   stage2_unit6_relu1_output: 	0.978731,   0.094409
   stage2_unit6_conv2_output: 	1.151426,   0.095977
   stage2_unit6_bn3_output: 	1.154868,   0.100843
   _plus8_output: 				1.152926,   0.106044
   stage2_unit7_bn1_output: 	1.154156,   0.10229
   stage2_unit7_conv1_output: 	1.264803,   0.104436
   stage2_unit7_bn2_output: 	1.152894,   0.09908
   stage2_unit7_relu1_output: 	1.156383,   0.102306
   stage2_unit7_conv2_output: 	1.327798,   0.102294
   stage2_unit7_bn3_output: 	1.329875,   0.099894
   _plus9_output: 				1.334517,   0.109175
   stage2_unit8_bn1_output: 	1.331067,   0.115449
   stage2_unit8_conv1_output: 	1.337357,   0.11694
   stage2_unit8_bn2_output: 	1.33041,   	0.107694
   stage2_unit8_relu1_output: 	1.367515,   0.115792
   stage2_unit8_conv2_output: 	1.514946,   0.12579
   stage2_unit8_bn3_output: 	1.513034,   0.116744
   _plus10_output: 			1.517037,   0.120017
   stage2_unit9_bn1_output: 	1.513593,   0.117845
   stage2_unit9_conv1_output: 	1.511412,   0.118779
   stage2_unit9_bn2_output: 	1.515947,   0.119333
   stage2_unit9_relu1_output: 	1.517616,   0.122603
   stage2_unit9_conv2_output: 	1.690708,   0.126754
   stage2_unit9_bn3_output: 	1.752104,   0.14106
   _plus11_output: 			1.691881,   0.124162
   stage2_unit10_bn1_output: 	1.693309,   0.121821
   stage2_unit10_conv1_output: 1.693686,   0.124328
   stage2_unit10_bn2_output: 	1.69515,   	0.127055
   stage2_unit10_relu1_output: 1.698078,   0.127429
   stage2_unit10_conv2_output: 1.870003,   0.127633
   stage2_unit10_bn3_output: 	1.870875,   0.125901
   _plus12_output: 			1.87123,   	0.129811
   stage2_unit11_bn1_output: 	1.869331,   0.133868
   stage2_unit11_conv1_output: 1.873357,   0.129109
   stage2_unit11_bn2_output: 	1.873282,   0.132226
   stage2_unit11_relu1_output: 1.920254,   0.132089
   stage2_unit11_conv2_output: 2.049825,   0.13312
   stage2_unit11_bn3_output: 	2.050059,   0.137388
   _plus13_output: 			2.048181,   0.131752
   stage2_unit12_bn1_output: 	2.051213,   0.131852
   stage2_unit12_conv1_output: 2.051463,   0.141885
   stage2_unit12_bn2_output: 	2.052243,   0.13839
   stage2_unit12_relu1_output: 2.052618,   0.135662
   stage2_unit12_conv2_output: 2.22834,   	0.140404
   stage2_unit12_bn3_output: 	2.224938,   0.135503
   _plus14_output: 			2.22612,   	0.139653
   stage2_unit13_bn1_output: 	2.227065,   0.15341
   stage2_unit13_conv1_output: 2.227888,   0.139268
   stage2_unit13_bn2_output: 	2.308939,   0.152178
   stage2_unit13_relu1_output: 2.264702,   0.13944
   stage2_unit13_conv2_output: 2.445593,   0.14272
   stage2_unit13_bn3_output: 	2.48221,   	0.14621
   _plus15_output: 			2.404699,   0.142299
   stage3_unit1_bn1_output: 	2.404635,   0.145371
   stage3_unit1_conv1_output: 	2.495546,   0.130115
   stage3_unit1_bn2_output: 	2.404267,   0.137987
   stage3_unit1_relu1_output: 	2.408431,   0.13683
   stage3_unit1_conv2_output: 	2.646087,   0.15073
   stage3_unit1_bn3_output: 	2.622898,   0.145354
   stage3_unit1_conv1sc_output: 2.404587,   0.142018
   stage3_unit1_sc_output: 	2.449002,   0.156136
   _plus16_output: 			2.623318,   0.15435
   stage3_unit2_bn1_output: 	2.660541,   0.145252
   stage3_unit2_conv1_output: 	2.625539,   0.140111
   stage3_unit2_bn2_output: 	2.62496,   	0.150848
   stage3_unit2_relu1_output: 	2.625702,   0.138165
   stage3_unit2_conv2_output: 	2.701528,   0.15487
   stage3_unit2_bn3_output: 	2.674332,   0.139043
   _plus17_output: 			2.82142,   	0.140635
   stage3_unit3_bn1_output: 	2.67498,   	0.141444
   stage3_unit3_conv1_output: 	2.720274,   0.161515
   stage3_unit3_bn2_output: 	2.676898,   0.140523
   stage3_unit3_relu1_output: 	2.744135,   0.158047
   stage3_unit3_conv2_output: 	2.805416,   0.154539
   stage3_unit3_bn3_output: 	2.76725,   	0.155655
   _plus18_output: 			2.723814,   0.149114
   stage3_unit4_bn1_output: 	2.770419,   0.160261
   stage3_unit4_conv1_output: 	2.754413,   0.142758
   stage3_unit4_bn2_output: 	2.776872,   0.144919
   stage3_unit4_relu1_output: 	2.768867,   0.162386
   stage3_unit4_conv2_output: 	2.785148,   0.149916
   stage3_unit4_bn3_output: 	2.785508,   0.150452
   _plus19_output: 			2.852743,   0.150676
   stage3_unit5_bn1_output: 	2.858445,   0.159505
   stage3_unit5_conv1_output: 	2.787531,   0.134272
   stage3_unit5_bn2_output: 	2.857317,   0.176636
   stage3_unit5_relu1_output: 	2.78761,   	0.148593
   stage3_unit5_conv2_output: 	2.973983,   0.150124
   stage3_unit5_bn3_output: 	2.878887,   0.160537
   _plus20_output: 			2.915179,   0.147848
   stage3_unit6_bn1_output: 	2.961713,   0.157653
   stage3_unit6_conv1_output: 	2.925078,   0.158604
   stage3_unit6_bn2_output: 	2.952041,   0.183091
   stage3_unit6_relu1_output: 	2.950082,   0.152073
   stage3_unit6_conv2_output: 	3.070035,   0.154965
   stage3_unit6_bn3_output: 	3.083161,   0.160839
   _plus21_output: 			3.068352,   0.165067
   stage3_unit7_bn1_output: 	3.065441,   0.152242
   stage3_unit7_conv1_output: 	3.071425,   0.158774
   stage3_unit7_bn2_output: 	3.068187,   0.156179
   stage3_unit7_relu1_output: 	3.068218,   0.163327
   stage3_unit7_conv2_output: 	3.214675,   0.165707
   stage3_unit7_bn3_output: 	3.17479,   	0.164326
   _plus22_output: 			3.205394,   0.161077
   stage3_unit8_bn1_output: 	3.206084,   0.15387
   stage3_unit8_conv1_output: 	3.208103,   0.167449
   stage3_unit8_bn2_output: 	3.199914,   0.162553
   stage3_unit8_relu1_output: 	3.204158,   0.165906
   stage3_unit8_conv2_output: 	3.527746,   0.159096
   stage3_unit8_bn3_output: 	3.364253,   0.162314
   _plus23_output: 			3.383523,   0.164172
   stage3_unit9_bn1_output: 	3.382616,   0.164813
   stage3_unit9_conv1_output: 	3.362859,   0.158199
   stage3_unit9_bn2_output: 	3.37949,   	0.158827
   stage3_unit9_relu1_output: 	3.384939,   0.166237
   stage3_unit9_conv2_output: 	3.537675,   0.159558
   stage3_unit9_bn3_output: 	3.540659,   0.160435
   _plus24_output: 			3.543729,   0.163489
   stage3_unit10_bn1_output: 	3.541645,   0.162107
   stage3_unit10_conv1_output: 3.544351,   0.170958
   stage3_unit10_bn2_output: 	3.597147,   0.168301
   stage3_unit10_relu1_output: 3.557067,   0.177325
   stage3_unit10_conv2_output: 3.710173,   0.177329
   stage3_unit10_bn3_output: 	3.701505,   0.170064
   _plus25_output: 			3.710012,   0.166437
   stage3_unit11_bn1_output: 	3.705039,   0.163946
   stage3_unit11_conv1_output: 3.710265,   0.16616
   stage3_unit11_bn2_output: 	3.717233,   0.178367
   stage3_unit11_relu1_output: 3.702583,   0.170878
   stage3_unit11_conv2_output: 3.886982,   0.183558
   stage3_unit11_bn3_output: 	3.877928,   0.165655
   _plus26_output: 			3.997818,   0.177522
   stage3_unit12_bn1_output: 	3.879009,   0.167984
   stage3_unit12_conv1_output: 3.892867,   0.18921
   stage3_unit12_bn2_output: 	3.890289,   0.164483
   stage3_unit12_relu1_output: 3.884186,   0.173198
   stage3_unit12_conv2_output: 4.021346,   0.179663
   stage3_unit12_bn3_output: 	3.984339,   0.162524
   _plus27_output: 			3.984637,   0.166945
   stage3_unit13_bn1_output: 	4.022058,   0.18014
   stage3_unit13_conv1_output: 4.050858,   0.168892
   stage3_unit13_bn2_output: 	4.094155,   0.208368
   stage3_unit13_relu1_output: 4.050477,   0.173887
   stage3_unit13_conv2_output: 4.115462,   0.161895
   stage3_unit13_bn3_output: 	4.186274,   0.192759
   _plus28_output: 			4.18548,   	0.189159
   stage3_unit14_bn1_output: 	4.159161,   0.171517
   stage3_unit14_conv1_output: 4.18956,   	0.190796
   stage3_unit14_bn2_output: 	4.178333,   0.17536
   stage3_unit14_relu1_output: 4.179994,   0.175066
   stage3_unit14_conv2_output: 4.300632,   0.174267
   stage3_unit14_bn3_output: 	4.299918,   0.177745
   _plus29_output: 			4.297867,   0.177089
   stage3_unit15_bn1_output: 	4.285073,   0.191193
   stage3_unit15_conv1_output: 4.240216,   0.171769
   stage3_unit15_bn2_output: 	4.310539,   0.199338
   stage3_unit15_relu1_output: 4.322123,   0.206417
   stage3_unit15_conv2_output: 4.383803,   0.197183
   stage3_unit15_bn3_output: 	4.350731,   0.190658
   _plus30_output: 			4.382977,   0.193918
   stage3_unit16_bn1_output: 	4.357064,   0.178021
   stage3_unit16_conv1_output: 4.386747,   0.204758
   stage3_unit16_bn2_output: 	4.318768,   0.175646
   stage3_unit16_relu1_output: 4.49518,   	0.193358
   stage3_unit16_conv2_output: 4.376717,   0.177199
   stage3_unit16_bn3_output: 	4.433285,   0.209691
   _plus31_output: 			4.476138,   0.181088
   stage3_unit17_bn1_output: 	4.413544,   0.198738
   stage3_unit17_conv1_output: 4.448523,   0.211011
   stage3_unit17_bn2_output: 	4.437947,   0.188803
   stage3_unit17_relu1_output: 4.471436,   0.182246
   stage3_unit17_conv2_output: 4.464995,   0.190605
   stage3_unit17_bn3_output: 	4.495746,   0.188904
   _plus32_output: 			4.497132,   0.186968
   stage3_unit18_bn1_output: 	4.494318,   0.189719
   stage3_unit18_conv1_output: 4.50498,   	0.212633
   stage3_unit18_bn2_output: 	4.506396,   0.213459
   stage3_unit18_relu1_output: 4.475143,   0.193315
   stage3_unit18_conv2_output: 4.653117,   0.20993
   stage3_unit18_bn3_output: 	4.507678,   0.204726
   _plus33_output: 			4.536134,   0.193892
   stage3_unit19_bn1_output: 	4.53561,   	0.187413
   stage3_unit19_conv1_output: 4.539222,   0.202827
   stage3_unit19_bn2_output: 	4.560858,   0.201051
   stage3_unit19_relu1_output: 4.500417,   0.187404
   stage3_unit19_conv2_output: 4.592115,   0.197354
   stage3_unit19_bn3_output: 	4.681466,   0.211497
   _plus34_output: 			4.592453,   0.191492
   stage3_unit20_bn1_output: 	4.517682,   0.189632
   stage3_unit20_conv1_output: 4.655152,   0.211197
   stage3_unit20_bn2_output: 	4.587503,   0.192804
   stage3_unit20_relu1_output: 4.580557,   0.204196
   stage3_unit20_conv2_output: 4.628281,   0.206086
   stage3_unit20_bn3_output: 	4.61096,   	0.210766
   _plus35_output: 			4.598163,   0.195696
   stage3_unit21_bn1_output: 	4.606171,   0.219117
   stage3_unit21_conv1_output: 4.609144,   0.218048
   stage3_unit21_bn2_output: 	4.620656,   0.239316
   stage3_unit21_relu1_output: 4.621237,   0.231165
   stage3_unit21_conv2_output: 4.586554,   0.201488
   stage3_unit21_bn3_output: 	4.622247,   0.208518
   _plus36_output: 			4.632964,   0.212868
   stage3_unit22_bn1_output: 	4.630584,   0.235297
   stage3_unit22_conv1_output: 4.640143,   0.21938
   stage3_unit22_bn2_output: 	4.622538,   0.211184
   stage3_unit22_relu1_output: 4.610427,   0.213364
   stage3_unit22_conv2_output: 4.597088,   0.21073
   stage3_unit22_bn3_output: 	4.592711,   0.203537
   _plus37_output: 			4.648936,   0.226621
   stage3_unit23_bn1_output: 	4.578229,   0.201688
   stage3_unit23_conv1_output: 4.579189,   0.208506
   stage3_unit23_bn2_output: 	4.619497,   0.216342
   stage3_unit23_relu1_output: 4.63978,   	0.210522
   stage3_unit23_conv2_output: 4.654119,   0.244915
   stage3_unit23_bn3_output: 	4.633789,   0.237371
   _plus38_output: 			4.645333,   0.217392
   stage3_unit24_bn1_output: 	4.650876,   0.216885
   stage3_unit24_conv1_output: 4.631923,   0.22258
   stage3_unit24_bn2_output: 	4.654131,   0.226166
   stage3_unit24_relu1_output: 4.662393,   0.227011
   stage3_unit24_conv2_output: 4.676686,   0.237505
   stage3_unit24_bn3_output: 	4.67491,   	0.236287
   _plus39_output: 			4.681957,   0.228982
   stage3_unit25_bn1_output: 	4.667902,   0.234539
   stage3_unit25_conv1_output: 4.608225,   0.226377
   stage3_unit25_bn2_output: 	4.660698,   0.234023
   stage3_unit25_relu1_output: 4.670319,   0.236117
   stage3_unit25_conv2_output: 4.668118,   0.235951
   stage3_unit25_bn3_output: 	4.669713,   0.240941
   _plus40_output: 			4.672722,   0.234859
   stage3_unit26_bn1_output: 	4.673621,   0.242095
   stage3_unit26_conv1_output: 4.674926,   0.241272
   stage3_unit26_bn2_output: 	4.629386,   0.236326
   stage3_unit26_relu1_output: 4.643883,   0.235013
   stage3_unit26_conv2_output: 4.672228,   0.242132
   stage3_unit26_bn3_output: 	4.675735,   0.268967
   _plus41_output: 			4.678208,   0.247758
   stage3_unit27_bn1_output: 	4.654869,   0.242446
   stage3_unit27_conv1_output: 4.663212,   0.249276
   stage3_unit27_bn2_output: 	4.684574,   0.249429
   stage3_unit27_relu1_output: 4.660159,   0.245806
   stage3_unit27_conv2_output: 4.673735,   0.247195
   stage3_unit27_bn3_output: 	4.692985,   0.254272
   _plus42_output: 			4.689639,   0.245624
   stage3_unit28_bn1_output: 	4.693664,   0.252727
   stage3_unit28_conv1_output: 4.712328,   0.252102
   stage3_unit28_bn2_output: 	4.658878,   0.256585
   stage3_unit28_relu1_output: 4.683309,   0.254246
   stage3_unit28_conv2_output: 4.665657,   0.258761
   stage3_unit28_bn3_output: 	4.635315,   0.24745
   _plus43_output: 			4.668364,   0.252678
   stage3_unit29_bn1_output: 	4.697991,   0.260005
   stage3_unit29_conv1_output: 4.703722,   0.270149
   stage3_unit29_bn2_output: 	4.75063,   	0.262647
   stage3_unit29_relu1_output: 4.695627,   0.258194
   stage3_unit29_conv2_output: 4.719219,   0.270528
   stage3_unit29_bn3_output: 	4.715352,   0.268147
   _plus44_output: 			4.72601,   	0.277236
   stage3_unit30_bn1_output: 	4.712885,   0.273237
   stage3_unit30_conv1_output: 4.726603,   0.281883
   stage3_unit30_bn2_output: 	4.718283,   0.267689
   stage3_unit30_relu1_output: 4.716642,   0.265039
   stage3_unit30_conv2_output: 4.755806,   0.254604
   stage3_unit30_bn3_output: 	4.72627,   	0.305412
   _plus45_output: 			4.757577,   0.273827
   stage4_unit1_bn1_output: 	4.757362,   0.27349
   stage4_unit1_conv1_output: 	4.755466,   0.28539
   stage4_unit1_bn2_output: 	4.728324,   0.272601
   stage4_unit1_relu1_output: 	4.765415,   0.27752
   stage4_unit1_conv2_output: 	4.794749,   0.285751
   stage4_unit1_bn3_output: 	4.791658,   0.287927
   stage4_unit1_conv1sc_output: 4.756854,   0.269675
   stage4_unit1_sc_output: 	4.753292,   0.276264
   _plus46_output: 			4.736092,   0.283252
   stage4_unit2_bn1_output: 	4.737606,   0.277921
   stage4_unit2_conv1_output: 	4.73965,   	0.285815
   stage4_unit2_bn2_output: 	4.772288,   0.295948
   stage4_unit2_relu1_output: 	4.792022,   0.287408
   stage4_unit2_conv2_output: 	4.799194,   0.295432
   stage4_unit2_bn3_output: 	4.800223,   0.301028
   _plus47_output: 			4.798577,   0.297875
   stage4_unit3_bn1_output: 	4.799634,   0.302639
   stage4_unit3_conv1_output: 	4.801734,   0.306124
   stage4_unit3_bn2_output: 	4.825884,   0.31153
   stage4_unit3_relu1_output: 	4.831719,   0.312566
   stage4_unit3_conv2_output: 	4.826539,   0.324629
   stage4_unit3_bn3_output: 	4.875738,   0.356396
   _plus48_output: 			4.827168,   0.325123
   bn1_output: 				4.827469,   0.319441
   dropout0_output: 			4.905909,   0.331991
   pre_fc1_output: 			4.831515,   0.298059
   fc1_output: 				4.831981,   0.372678
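
One way to read the table, assuming each value is the cumulative time needed to compute the network up to that output (the unit is not stated above): subtract consecutive rows to see which step adds the most time. The sketch below does this for five rows copied from the table. The names `parse_rows` and `increments` are hypothetical helpers written for this illustration, not MXNet APIs, and the subtraction only makes sense along a single branch, so the shortcut rows (`conv1sc`, `sc`) are left out.

```python
# Five consecutive rows copied from the CPU table: name: my_mod, r100_mod
ROWS = """\
stage1_unit3_conv1_output: 0.159347, 0.049804
stage1_unit3_bn2_output: 0.164846, 0.066085
stage1_unit3_relu1_output: 0.1657, 0.071379
stage1_unit3_conv2_output: 0.325648, 0.052968
stage1_unit3_bn3_output: 0.37007, 0.05749
"""


def parse_rows(text):
    """Turn 'name: a, b' lines into (name, my_mod, r100_mod) tuples."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, values = line.split(":", 1)
        my_mod, r100_mod = (float(v) for v in values.split(","))
        rows.append((name.strip(), my_mod, r100_mod))
    return rows


def increments(rows):
    """Time added between each row and the previous one, per model.

    Assumes the rows are cumulative timings along one branch of the graph.
    """
    out = []
    for (_, a_prev, b_prev), (name, a, b) in zip(rows, rows[1:]):
        out.append((name, a - a_prev, b - b_prev))
    return out


rows = parse_rows(ROWS)
incs = increments(rows)

# Step that adds the most time in my_mod within this excerpt.
slowest = max(incs, key=lambda r: r[1])
print(f"largest my_mod step: {slowest[0]} (+{slowest[1]:.6f})")

# Overall gap at the last row of the table (fc1_output): my_mod vs r100_mod.
slowdown = 4.831981 / 0.372678
print(f"my_mod / r100_mod at fc1_output: {slowdown:.1f}x")
```

In this excerpt the `conv2` step accounts for most of the increase in my_mod, while the r100_mod column barely moves. The r100_mod increment at that step even comes out slightly negative, which suggests the individual measurements are noisy, so small differences between rows should not be over-interpreted.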
   
   
