Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/07/31 00:44:13 UTC

[GitHub] [incubator-mxnet] ZhennanQin edited a comment on issue #15701: Quantization int8 bit performance drop too much

URL: https://github.com/apache/incubator-mxnet/issues/15701#issuecomment-516644151
 
 
   @lxpwj Thanks for reporting this. May I know your CPU type first? Different CPU generations can give slightly different accuracy because they use different instructions.
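
   For reference, here is a minimal Linux-only sketch (not from this thread) that reads `/proc/cpuinfo` to see which int8-relevant instruction set extensions the CPU reports; the flag names checked are an illustrative subset:

   ```python
   # Minimal sketch: list the SIMD flags that matter for int8 inference.
   # Linux-only; the flag names checked here are an illustrative subset.
   def cpu_int8_flags(path="/proc/cpuinfo"):
       wanted = ("avx2", "avx512f", "avx512bw", "avx512_vnni")
       with open(path) as f:
           for line in f:
               if line.startswith("flags"):
                   flags = set(line.split(":", 1)[1].split())
                   return sorted(f for f in wanted if f in flags)
       return []

   print(cpu_int8_flags())  # e.g. ['avx2', 'avx512bw', 'avx512f']
   ```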
   
   The cwiki results were collected for the proposal targeting the MXNet v1.4 release, so they don't reflect the v1.5 situation. v1.5 added more quantized operators, which accelerate the quantized model further but may cost a small amount of accuracy. To reproduce the accuracy from the proposal, you need to exclude all non-fused `relu` and `add` operators.
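
   As a rough illustration, this exclusion can be scripted through the `excluded_sym_names` argument of `mxnet.contrib.quantization.quantize_model`; the checkpoint prefix and the op-type filter below are assumptions for the sketch, not taken from the proposal:

   ```python
   import json
   import mxnet as mx
   from mxnet.contrib.quantization import quantize_model

   # Load an fp32 checkpoint (prefix and epoch are placeholders).
   sym, arg_params, aux_params = mx.model.load_checkpoint('model', 0)

   # Collect every standalone relu / add node by op type so it stays fp32.
   # 'attrs' vs 'attr' depends on the MXNet version that wrote the JSON.
   nodes = json.loads(sym.tojson())['nodes']
   excluded = [n['name'] for n in nodes
               if n['op'] == 'elemwise_add'
               or (n['op'] == 'Activation'
                   and n.get('attrs', n.get('attr', {})).get('act_type') == 'relu')]

   # calib_mode='none' skips calibration for brevity; use 'naive' or
   # 'entropy' with a calibration dataset for better accuracy.
   qsym, qarg_params, qaux_params = quantize_model(
       sym=sym, arg_params=arg_params, aux_params=aux_params,
       excluded_sym_names=excluded, calib_mode='none', ctx=mx.cpu())
   ```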
   
   Also, `imagenet1k-resnet-152` isn't a good candidate for quantization: resnet_v2 changed the operator order in its base block, so the operators can't be fully fused, which results in frequent type conversions between fp32 and int8. Those conversions eat up all the benefit of quantization while still hurting accuracy.
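
   If you want to see how fragmented a quantized graph is, one quick check is to count the conversion nodes in the quantized symbol. A sketch, assuming `qsym` is the symbol returned by `quantize_model` above and that the conversion op names match current MXNet naming:

   ```python
   import json

   # Count fp32<->int8 conversion nodes in the quantized graph. A high
   # count relative to total nodes means poor fusion, as described above.
   convert_ops = {'_contrib_quantize', '_contrib_quantize_v2',
                  '_contrib_dequantize', '_contrib_requantize'}
   nodes = json.loads(qsym.tojson())['nodes']
   n_convert = sum(n['op'] in convert_ops for n in nodes)
   print('conversion nodes: %d of %d total' % (n_convert, len(nodes)))
   ```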
   
   Please try `resnet50_v1`, `inceptionv3`, or `mobilenetv2_1.0`; they are more quantization friendly and give good results on both accuracy and speed.
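
   For example, assuming GluonCV is installed, a pretrained `resnet50_v1` can be fetched and exported to the symbol + params checkpoint format that `quantize_model` consumes:

   ```python
   import mxnet as mx
   from gluoncv.model_zoo import get_model  # assumes gluoncv is installed

   # Fetch a quantization-friendly pretrained model and export it for the
   # quantization flow sketched above.
   net = get_model('resnet50_v1', pretrained=True)
   net.hybridize()
   net(mx.nd.zeros((1, 3, 224, 224)))  # one forward pass so export works
   net.export('resnet50_v1', epoch=0)  # writes -symbol.json / -0000.params
   ```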
