You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/04/28 06:38:56 UTC

[GitHub] [tvm] DzAvril opened a new pull request, #11155: fix accuracy issue of misalignment between arm's q_multiply_shift and x86's

DzAvril opened a new pull request, #11155:
URL: https://github.com/apache/tvm/pull/11155

   
   **Problems introduced by the “two rounding” behavior on arm_cpu**
   op:`q_multiply_shift`, used in re-quantization.
   
   The DEFAULT path:
   
   - file: [src/target/intrin_rule.cc 2](https://github.com/apache/tvm/blob/main/src/target/intrin_rule.cc#L154)
   
   The NEON path:
   
   - file: [python/tvm/topi/arm_cpu/tensor_intrin.py](https://github.com/apache/tvm/blob/main/python/tvm/topi/arm_cpu/tensor_intrin.py#L1147)
   
   The NEON path may produce some different values (due to two rounding)[1], compared with the _DEFAULT_ path, within a single layer.
   
   The problem is, on`arm_cpu`, it will sometimes use the DEAULT path, sometimes use the NEON path:
   
   - If the innermost axis, is a multiple of four and vectorization applied, the NEON path is enabled
   - Otherwise, the DEFAULT path
   
   BTW, it looks like “two rounding” is important to make a “bit exact result” of TFLite qnnpack, see
   
   - [Supporting bit exact TFLite QNN inference](https://discuss.tvm.apache.org/t/supporting-bit-exact-tflite-qnn-inference/5528)
   - [TFLite Rounding](https://discuss.tvm.apache.org/t/tflite-rounding/6287)
   
   Discussion in TVM forum: https://discuss.tvm.apache.org/t/quantization-aligning-result-of-tvm-to-torchs/12225
   
   P.S. The patch in this PR is created by @wangxunx, I helped to send this PR.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] cee1 commented on pull request #11155: fix accuracy issue of misalignment between arm's q_multiply_shift and x86's

Posted by GitBox <gi...@apache.org>.

cee1 commented on PR #11155:
URL: https://github.com/apache/tvm/pull/11155#issuecomment-1118125802

   > Why do we think the DEFAULT path is "accurate" here ?
   
   DEFAULT path is (somewhat) consistent with QAT, specifically, the rounding behavior within fake quantization.
    
   For DEFAULT path, only one right shift is required, and the (only one) rounding happens there.
   
   For NEON path, the (possible) extra rounding behavior is introduced by NEON instruction `sqrdmulh`, which has some side-effect (see https://discuss.tvm.apache.org/t/quantization-aligning-result-of-tvm-to-torchs/12225/2?u=cee1 )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] manupa-arm commented on pull request #11155: fix accuracy issue of misalignment between arm's q_multiply_shift and x86's

Posted by GitBox <gi...@apache.org>.

manupa-arm commented on PR #11155:
URL: https://github.com/apache/tvm/pull/11155#issuecomment-1113282168

   Why do we think DEFAULT path is "accurate" here ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] cee1 commented on pull request #11155: fix accuracy issue of misalignment between arm's q_multiply_shift and x86's

Posted by GitBox <gi...@apache.org>.

cee1 commented on PR #11155:
URL: https://github.com/apache/tvm/pull/11155#issuecomment-1118130717

   > Please add a test for this - what's the impact of accuracy if you compare this with Tensorflow Lite ?
   > 
   > Are we trading accuracy in one framework for another ?
   > 
   > @Mousius , @manupa-arm , please could you take a look ?
   > 
   > Ramana
   
   If recall correctly, current TVM's NEON path is bit-exact with output of tflite?
   
   But is inconsistent with the fake quantization Op within QAT, which, IMHO, can be seem as the reference implementation of requantization / quantization (where `q_multiply_shift` is involved in)
   
   To solve the concern, this patch is submitted.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] u99127 commented on pull request #11155: fix accuracy issue of misalignment between arm's q_multiply_shift and x86's

Posted by GitBox <gi...@apache.org>.

u99127 commented on PR #11155:
URL: https://github.com/apache/tvm/pull/11155#issuecomment-1113259528

   Please add a test for this - what's the impact of accuracy if you compare this with Tensorflow Lite ? 
   
   
   Are we trading accuracy in one framework for another ? 
   
   @Mousius , @manupa-arm , please could you take a look ? 
   
   Ramana


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org