Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/07/22 20:01:24 UTC

[GitHub] [incubator-tvm] anijain2305 opened a new pull request #6115: [Topi, x86] Using MKL blas for quantized dense

anijain2305 opened a new pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115


   This PR uses MKL for quantized dense, following the existing MKL fallback for FP32 dense.
   
   On a c5.12xlarge instance (Cascade Lake, with VNNI support), the results for BERT base are as follows (latency in ms):
   
   Type | Batch size | MXNet+MKL | TVM+MKL
   -- | -- | -- | --
   FP32 | 128 | 33.56 | 16.83
   Quantized | 128 | 23.94697 | 17.59
   
   The gap between TVM FP32 and TVM quantized arises because only the Dense ops in the network are quantized, so each of them pays the cost of quantizing its input and dequantizing its output. We will investigate whether quantize and dequantize can be improved.
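
To illustrate where the quantize/dequantize cost comes from, here is a minimal pure-Python sketch of a single Dense op that takes and returns floats while doing its arithmetic on 8-bit integers. It is not the TVM or MKL implementation: the function names, the symmetric scaling scheme, and the scale values are all assumptions chosen for illustration.

```python
# Illustrative sketch only: names, scales, and the symmetric quantization
# scheme are assumptions, not the code added in this PR.

def quantize(values, scale, qmin=-128, qmax=127):
    """Map floats to int8 values: divide by scale, round, clamp to [qmin, qmax]."""
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def dequantize(acc, scale):
    """Map integer accumulators back to floats."""
    return [a * scale for a in acc]


def int8_dense(qx, qweight):
    """Integer-only dense: one dot product per output unit (row of qweight)."""
    return [sum(a * w for a, w in zip(qx, row)) for row in qweight]


def dense_with_roundtrip(x, qweight, x_scale, w_scale):
    """Float in, float out. The two conversions around int8_dense are the
    extra work paid when neighbouring ops stay in FP32."""
    qx = quantize(x, x_scale)                   # float -> int8
    acc = int8_dense(qx, qweight)               # integer accumulation
    return dequantize(acc, x_scale * w_scale)   # integer -> float


x_scale = w_scale = 1.0 / 64
weight = [[1.0, 0.5, -0.5], [0.25, 0.0, 1.0]]
qweight = [quantize(row, w_scale) for row in weight]  # weights quantized once, ahead of time
out = dense_with_roundtrip([0.5, -1.0, 0.25], qweight, x_scale, w_scale)
print(out)  # [-0.125, 0.375]
```

In this sketch the weights are quantized once, but the input and output conversions run on every call, which is the back-and-forth cost described above.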
   
   @icemelon9 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] icemelon9 merged pull request #6115: [Topi, x86] Using MKL blas for quantized dense

Posted by GitBox <gi...@apache.org>.

icemelon9 merged pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115


   



[GitHub] [incubator-tvm] tqchen commented on pull request #6115: [Topi, x86] Using MKL blas for quantized dense

Posted by GitBox <gi...@apache.org>.

tqchen commented on pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115#issuecomment-665129014


   While it is OK to make use of mkldnn in this case, we should always work hard to get good integer schedules and learn from the insights, just as we did for the CUDA softmax and other cases.



[GitHub] [incubator-tvm] icemelon9 commented on pull request #6115: [Topi, x86] Using MKL blas for quantized dense

Posted by GitBox <gi...@apache.org>.

icemelon9 commented on pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115#issuecomment-665267898


   Thanks @anijain2305 @eric-haibin-lin @TaoLv @tqchen 



[GitHub] [incubator-tvm] anijain2305 commented on pull request #6115: [Topi, x86] Using MKL blas for quantized dense

Posted by GitBox <gi...@apache.org>.

anijain2305 commented on pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115#issuecomment-663631764


   @icemelon9 Can you please manage this PR?



[GitHub] [incubator-tvm] anijain2305 commented on pull request #6115: [Topi, x86] Using MKL blas for quantized dense

Posted by GitBox <gi...@apache.org>.

anijain2305 commented on pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115#issuecomment-663128946


   @TaoLv Good point, I added the latency numbers for TVM alone. Thanks for pointing it out!



[GitHub] [incubator-tvm] TaoLv commented on pull request #6115: [Topi, x86] Using MKL blas for quantized dense

Posted by GitBox <gi...@apache.org>.

TaoLv commented on pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115#issuecomment-663028130


   It would be better to also show the performance of TVM before using the MKL s8u8s32 GEMM.
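
For readers unfamiliar with the name, "s8u8s32" denotes a matrix multiply with one signed 8-bit operand, one unsigned 8-bit operand, and a signed 32-bit result. The following pure-Python reference is a sketch of that arithmetic only. The function name is hypothetical, the assignment of the signed type to the first operand is assumed from the name, and the scaling factors and offsets that a real BLAS routine accepts are omitted.

```python
# Reference sketch of s8u8s32 arithmetic; not the MKL API.
# The function name and the operand order are assumptions for illustration.

def gemm_s8u8s32_reference(a, b):
    """Return C = A x B, where A holds signed 8-bit values, B holds
    unsigned 8-bit values, and C is accumulated in wider integers."""
    k = len(b)
    n = len(b[0])
    for row in a:
        if len(row) != k:
            raise ValueError("inner dimensions of A and B do not match")
        if not all(-128 <= v <= 127 for v in row):
            raise ValueError("A must contain signed 8-bit values")
    for row in b:
        if len(row) != n:
            raise ValueError("B rows must all have the same length")
        if not all(0 <= v <= 255 for v in row):
            raise ValueError("B must contain unsigned 8-bit values")
    # Python integers do not overflow, so this models an accumulator
    # that is wide enough for the products being summed.
    return [[sum(a_row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for a_row in a]


c = gemm_s8u8s32_reference([[1, -2], [3, 4]], [[5, 6], [7, 8]])
print(c)  # [[-9, -10], [43, 50]]
```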



[GitHub] [incubator-tvm] anijain2305 commented on pull request #6115: [Topi, x86] Using MKL blas for quantized dense

Posted by GitBox <gi...@apache.org>.

anijain2305 commented on pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115#issuecomment-664808714







[GitHub] [incubator-tvm] anijain2305 commented on pull request #6115: [Topi, x86] Using MKL blas for quantized dense

Posted by GitBox <gi...@apache.org>.

anijain2305 commented on pull request #6115:
URL: https://github.com/apache/incubator-tvm/pull/6115#issuecomment-662819127


   @eric-haibin-lin Yes, the MXNet+MKLDNN baseline is also in `int8`.

