Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/01/25 13:57:47 UTC

[GitHub] pengzhao-intel commented on issue #9545: Profiling discussion

URL: https://github.com/apache/incubator-mxnet/issues/9545#issuecomment-360473609
 
 
   Thanks again, @cjolivier01.
   @TaoLv has raised very informative questions.
   
   FYI, two points I'd like to mention:
   
   - **MKL-DNN profile information**
   When the new CPU backend is enabled (from 1.0.1 onward), most of the execution time is spent inside the MKL-DNN library, in kernels such as convolution and batch normalization. We can get the execution time of these MKL-DNN kernels by setting MKLDNN_VERBOSE=1, [details](https://github.com/01org/mkl-dnn/blob/master/doc/perf_profile.md); a short sketch of capturing this from MXNet follows the field breakdown below.
   The output includes a lot of very useful information for performance debugging. One example is shown below:
    >mkldnn_verbose,exec,convolution,jit:avx512_common,forward_inference,fsrc:nChw16c fwei:OIhw16i16o fbia:undef fdst:nChw16c,alg:convolution_direct,mb32_g1ic64oc64_ih56oh56kh3sh1dh0ph1_iw56ow56kw3sw1dw0pw1,**1.14307**
   
   - jit:avx512_common: the executed instruction set architecture
   - forward_inference: the propagation kind (a forward pass for inference)
   - fsrc/fwei/fbia/fdst: the formats of the input data/weight/bias/destination
   - alg:convolution_direct: the convolution algorithm (direct convolution)
   - mb32_g1ic64oc64_ih56oh56kh3sh1dh0ph1_iw56ow56kw3sw1dw0pw1: the problem descriptor:
     - mb, mini-batch, 32
     - g, group, 1
     - ic, input channel, 64
     - oc, output channel, 64
     - ih, input height, 56
     - ...
   - 1.14307: the runtime of this kernel, in milliseconds
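
   For reference, here is a minimal sketch of capturing this output from MXNet. It assumes an MKL-DNN-enabled build; the shapes simply mirror the example line above, and the only real requirement is that MKLDNN_VERBOSE is set before MKL-DNN is loaded:

```python
import os

# MKLDNN_VERBOSE must be set before MKL-DNN is loaded,
# i.e. before importing mxnet.
os.environ["MKLDNN_VERBOSE"] = "1"

import mxnet as mx

# A toy 3x3 convolution matching the verbose line above:
# mini-batch 32, 64 input/output channels, 56x56 spatial size.
data = mx.nd.ones((32, 64, 56, 56))
weight = mx.nd.ones((64, 64, 3, 3))
out = mx.nd.Convolution(data=data, weight=weight, no_bias=True,
                        kernel=(3, 3), pad=(1, 1), num_filter=64)
out.wait_to_read()  # force execution; the verbose lines go to stdout
```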
   
   So you can see the output is really meaningful.
   I think we can simplify the MKL-DNN output on the MXNet side and make it more readable than the raw data; a rough sketch of that idea is below.
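
   As an illustration only (this is not an existing MXNet utility), a small post-processing sketch that aggregates captured verbose lines into total time per primitive kind, following the field layout shown above:

```python
import sys
from collections import defaultdict

def summarize(lines):
    """Aggregate mkldnn_verbose lines into total time per primitive kind."""
    total = defaultdict(float)
    calls = defaultdict(int)
    for line in lines:
        fields = line.strip().split(",")
        # Layout: mkldnn_verbose,exec,<kind>,<impl>,...,<time in ms>
        if len(fields) < 4 or fields[0] != "mkldnn_verbose" or fields[1] != "exec":
            continue
        kind = fields[2]                 # e.g. "convolution"
        total[kind] += float(fields[-1]) # last field is the kernel time in ms
        calls[kind] += 1
    for kind in sorted(total, key=total.get, reverse=True):
        print("%-24s %8d calls %12.3f ms" % (kind, calls[kind], total[kind]))

if __name__ == "__main__":
    summarize(sys.stdin)
```

   Usage would be something like `MKLDNN_VERBOSE=1 python train.py 2>&1 | python summarize.py`.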
   
   - **Text Output**
   The text-level summary may not be as fancy as the visualization tools, but it's really concise and efficient: the user doesn't need to bother with Chrome or post-process a JSON file.
   
   A very good example comes from Theano's profiling tools ([example](http://deeplearning.net/software/theano/tutorial/profiling.html)).
   
   As @TaoLv mentioned, this profile breaks the runtime down by Class, Op, and individual op instance (Apply); a sketch of producing a similar text summary from MXNet's profiler dump follows the example.
   I really like this kind of format :)
   
   ```
    Function profiling
   ==================
     Message: None
     Time in 1 calls to Function.__call__: 5.698204e-05s
     Time in Function.fn.__call__: 1.192093e-05s (20.921%)
     Time in thunks: 6.198883e-06s (10.879%)
     Total compile time: 3.642474e+00s
       Theano Optimizer time: 7.326508e-02s
          Theano validate time: 3.712177e-04s
       Theano Linker time (includes C, CUDA code generation/compiling): 9.584920e-01s
   
   Class
   ---
   <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
     100.0%   100.0%       0.000s       2.07e-06s     C        3        3   <class 'theano.tensor.elemwise.Elemwise'>
      ... (remaining 0 Classes account for   0.00%(0.00s) of the runtime)
   
   Ops
   ---
   <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
     65.4%    65.4%       0.000s       2.03e-06s     C        2        2   Elemwise{add,no_inplace}
     34.6%   100.0%       0.000s       2.15e-06s     C        1        1   Elemwise{mul,no_inplace}
      ... (remaining 0 Ops account for   0.00%(0.00s) of the runtime)
   
   Apply
   ------
   <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
     50.0%    50.0%       0.000s       3.10e-06s      1     0   Elemwise{add,no_inplace}(x, y)
     34.6%    84.6%       0.000s       2.15e-06s      1     2   Elemwise{mul,no_inplace}(TensorConstant{(1,) of 2.0}, Elemwise{add,no_inplace}.0)
     15.4%   100.0%       0.000s       9.54e-07s      1     1   Elemwise{add,no_inplace}(Elemwise{add,no_inplace}.0, z)
      ... (remaining 0 Apply instances account for 0.00%(0.00s) of the runtime)
   ```
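
   For comparison, here is a hypothetical sketch of how a similar text summary could be derived from the JSON dump MXNet's profiler already writes (Chrome trace event format). The begin/end ("B"/"E") pairing below is a simplifying assumption and ignores nested events of the same name:

```python
import json
from collections import defaultdict

def text_summary(trace_file):
    """Print a per-op text summary from a Chrome-trace-format JSON dump."""
    with open(trace_file) as f:
        data = json.load(f)
    # The trace may be a bare event list or wrapped in {"traceEvents": [...]}.
    events = data["traceEvents"] if isinstance(data, dict) else data

    total = defaultdict(float)    # name -> total time (us)
    calls = defaultdict(int)
    begin = {}                    # (name, tid) -> begin timestamp
    for ev in events:
        key = (ev.get("name"), ev.get("tid"))
        if ev.get("ph") == "B":
            begin[key] = ev["ts"]
        elif ev.get("ph") == "E" and key in begin:
            total[ev["name"]] += ev["ts"] - begin.pop(key)
            calls[ev["name"]] += 1

    grand = sum(total.values()) or 1.0
    print("<% time> <total us> <#calls> <op name>")
    for name in sorted(total, key=total.get, reverse=True):
        print("%7.1f%% %10.1f %8d   %s"
              % (100.0 * total[name] / grand, total[name], calls[name], name))

text_summary("profile.json")  # whatever filename the profiler was configured with
```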
