Posted to issues@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/10/16 15:43:42 UTC

[GitHub] [incubator-mxnet] grygielski commented on issue #19218: CPU inference is very slow for some model checkpoints

grygielski commented on issue #19218:
URL: https://github.com/apache/incubator-mxnet/issues/19218#issuecomment-710124169


   @buaalsy2003 Sorry for my late response, but I somehow missed your question.
   As for how I figured out this problem: I had some experience with similar behavior in other frameworks, so denormal values were my initial guess. I didn't use any sophisticated debugging tool to confirm it; I just checked for denormals inside the C++ code (more precisely, with the `fpclassify` function). I added these checks on the convolution input values and built MXNet from source.
   
   However, the first step for me is always to check the output of running the MXNet code with the `export MKLDNN_VERBOSE=1` environment variable set. It prints the oneDNN (MKL-DNN) primitives executed, in order, with the execution time at the end of each line. This way I can compare two runs, as in this case, and see whether any primitives differ significantly.
   
   I hope this sheds some light on my thought process and helps you in the future.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@mxnet.apache.org
For additional commands, e-mail: issues-help@mxnet.apache.org