Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/02/08 22:45:54 UTC

[GitHub] eric-haibin-lin opened a new issue #14100: More user-friendly profiler result

URL: https://github.com/apache/incubator-mxnet/issues/14100
 
 
   This is an example output of `mx.profiler.dumps()`:
   
   ```
   operator
   =================
   Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
   ----                          -----------        ---------    -------------    -------------    -------------
   CopyCPU2CPU                            18         107.0130           3.9670           8.1190           5.9452
   argmax                                  6           0.2470           0.0370           0.0460           0.0412
   min                                     6           0.2580           0.0400           0.0460           0.0430
   Concat                                  6           0.2770           0.0390           0.0590           0.0462
   _ones                                   6           0.2050           0.0290           0.0440           0.0342
   sqrt                                    6           0.9000           0.0380           0.3700           0.1500
   _backward_Activation                    6           0.2470           0.0390           0.0440           0.0412
   SyncCopyCPU2GPU                         6           0.1800           0.0270           0.0340           0.0300
   _copyto_GPU2GPU                      3636         142.9960           0.0220           0.3250           0.0393
   _backward_log_softmax                   6           0.2710           0.0290           0.0700           0.0452
   slice                                   6           0.2380           0.0370           0.0410           0.0397
   _backward_pick                          6           0.4430           0.0430           0.1190           0.0738
   SyncCopyGPU2CPU                        18           0.8020           0.0290           0.0860           0.0446
   _plus_scalar                           72          21.2870           0.0790           1.7160           0.2957
   _div_scalar                            72          36.0430           0.0780           1.6120           0.5006
   expand_dims                            84          14.1550           0.0280           1.6070           0.1685
   softmax                                72          10.4490           0.0980           0.1770           0.1451
   Embedding                              18           1.6480           0.0250           0.4190           0.0916
   SetValueOp                              6           0.4930           0.0470           0.1340           0.0822
   _mul_scalar                           150          42.0470           0.0220           1.2810           0.2803
   _contrib_div_sqrt_dim                 144           6.8160           0.0320           0.0890           0.0473
   FullyConnected                        444         262.0170           0.1020           2.2970           0.5901
   _backward_SequenceMask                  6           0.9190           0.1350           0.1680           0.1532
   broadcast_lesser                        6           0.2990           0.0400           0.0650           0.0498
   _backward_Embedding                    18          10.6270           0.0560           1.0350           0.5904
   ones_like                              72          27.5950           0.0270           1.7510           0.3833
   where                                  72           5.6690           0.0340           1.0110           0.0787
   DeleteVariable                      10326          96.4270           0.0000           2.4820           0.0093
   dot                                  1206          54.2630           0.0280           0.1710           0.0450
   batch_dot                             144          15.4360           0.0590           1.0670           0.1072
   _backward_Dropout                     240          10.9490           0.0300           0.0920           0.0456
   broadcast_axis                         78          37.3540           0.0370           3.1240           0.4789
   _backward_mean                         12           2.9340           0.0390           0.6950           0.2445
   LayerNorm                             150          26.5350           0.1380           0.7940           0.1769
   mean                                   12           0.8380           0.0390           0.1220           0.0698
   _backward_LayerNorm                   150          48.3110           0.2340           0.9060           0.3221
   broadcast_mul                        1350          58.1180           0.0210           1.3150           0.0431
   _backward_erf                          72           7.1030           0.0720           0.1460           0.0987
   log_softmax                             6           0.2200           0.0350           0.0390           0.0367
   broadcast_add                         156          10.7970           0.0380           0.7950           0.0692
   _arange                                12          20.3000           0.0260           7.9170           1.6917
   _backward_reshape                     732          36.3330           0.0280           0.1510           0.0496
   erf                                    72           5.8460           0.0570           0.1590           0.0812
   WaitForVar                             24           0.2220           0.0050           0.0150           0.0093
   transpose                             576          87.2210           0.0430           1.2770           0.1514
   Dropout                               240         155.5880           0.0840           2.3730           0.6483
   SequenceMask                            6           2.0030           0.1580           0.6750           0.3338
   CopyCPU2GPU                            24           2.4000           0.0300           0.4570           0.1000
   _backward_where                        72           9.1340           0.0350           0.1860           0.1269
   _copy                                  72           7.2890           0.0680           0.1410           0.1012
   Activation                              6           0.2440           0.0370           0.0460           0.0407
   _backward_FullyConnected              444         309.9390           0.1010           1.4010           0.6981
   _backward_div_scalar                   72           5.3830           0.0560           0.0960           0.0748
   _backward_slice                         6           0.2790           0.0430           0.0500           0.0465
   _backward_broadcast_add               156           9.2620           0.0420           0.0930           0.0594
   add_n                                 222          22.4660           0.0360           1.3650           0.1012
   pick                                    6           0.2670           0.0340           0.0580           0.0445
   _contrib_adamw_update                1206         101.7640           0.0460           1.7690           0.0844
   _backward_batch_dot                   144          16.0560           0.0880           0.1540           0.1115
   _backward_softmax                      72          10.7450           0.0690           0.2450           0.1492
   _backward_broadcast_mul               144          19.9080           0.0340           0.6370           0.1382
   _backward_mul_scalar                   78           6.6860           0.0240           0.1410           0.0857
   ```
   As a user, I would like to sort the operators by a particular field (e.g. Avg Time) to find the most expensive one. It would be great to have such an enhancement.
   cc @Vikas89 
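
In the meantime, one possible workaround is to parse the text table and sort it in Python. The following is only a minimal sketch, assuming the layout shown above (an operator name followed by five numeric columns). `parse_profiler_table`, `sort_rows`, and the field keys are hypothetical helpers made up for illustration; they are not part of MXNet. The sample rows are copied from the output above.

```python
from typing import Dict, List

# Keys for the five numeric columns, in table order (names chosen for this sketch).
FIELDS = ("count", "total_ms", "min_ms", "max_ms", "avg_ms")


def parse_profiler_table(text: str) -> List[Dict[str, object]]:
    """Parse the aggregate table text into one dict per operator row."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row is a name plus exactly five numeric columns.
        if len(parts) != 6:
            continue
        try:
            values = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # e.g. the "----" separator line
        row: Dict[str, object] = {"name": parts[0]}
        row.update(zip(FIELDS, values))
        row["count"] = int(values[0])
        rows.append(row)
    return rows


def sort_rows(rows, key="avg_ms", descending=True):
    """Return the rows ordered by the chosen field."""
    return sorted(rows, key=lambda r: r[key], reverse=descending)


# A few rows copied from the output above.
SAMPLE = """\
operator
=================
Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
----                          -----------        ---------    -------------    -------------    -------------
CopyCPU2CPU                            18         107.0130           3.9670           8.1190           5.9452
FullyConnected                        444         262.0170           0.1020           2.2970           0.5901
_arange                                12          20.3000           0.0260           7.9170           1.6917
DeleteVariable                      10326          96.4270           0.0000           2.4820           0.0093
"""

if __name__ == "__main__":
    for r in sort_rows(parse_profiler_table(SAMPLE), key="avg_ms"):
        print(f'{r["name"]:<30}{r["count"]:>8}{r["avg_ms"]:>12.4f}')
```

In real use the string returned by `mx.profiler.dumps()` would be passed in place of `SAMPLE`. Parsing the text is fragile if the table layout changes, which is part of why built-in sorting would be preferable.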
