
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/09/23 19:20:47 UTC

[GitHub] [tvm] mbrookhart opened a new pull request, #12889: [Draft][FQ2I] Support non-const scales and zero points in fq2i

mbrookhart opened a new pull request, #12889:
URL: https://github.com/apache/tvm/pull/12889

   cc @AndrewZhaoLuo @honghuichao
   
   I attempted to support non-constant scales and zero points in FQ2I to fix the problem in #12707. This works: the graph is transformed as I'd expect, from this:
   ```
   def @main(%x0: Tensor[(1, 4), int8] /* ty=Tensor[(1, 4), int8] */, %x1: Tensor[(1, 4), int8] /* ty=Tensor[(1, 4), int8] */, %x2: Tensor[(1, 4), int8] /* ty=Tensor[(1, 4), int8] */, %x3: Tensor[(1, 4), int8] /* ty=Tensor[(1, 4), int8] */) -> Tensor[(1, 16), int8] {
     %0 = add(0f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     %1 = multiply(0 /* ty=int32 */, 1 /* ty=int32 */) /* ty=int32 */;
     %2 = add(1f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     %3 = add(2f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     %4 = add(3f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     %5 = qnn.dequantize(%x0, %0, %1) /* ty=Tensor[(1, 4), float32] */;
     %6 = qnn.dequantize(%x1, %2, %1) /* ty=Tensor[(1, 4), float32] */;
     %7 = qnn.dequantize(%x2, %3, %1) /* ty=Tensor[(1, 4), float32] */;
     %8 = qnn.dequantize(%x3, %4, %1) /* ty=Tensor[(1, 4), float32] */;
     %9 = (%5, %6, %7, %8) /* ty=(Tensor[(1, 4), float32], Tensor[(1, 4), float32], Tensor[(1, 4), float32], Tensor[(1, 4), float32]) */;
     %10 = concatenate(%9, axis=1) /* ty=Tensor[(1, 16), float32] */;
     %11 = add(3f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     qnn.quantize(%10, %11, %1, out_dtype="int8") /* ty=Tensor[(1, 16), int8] */
   }
   ```
   
   To this:
   
   ```
   def @main(%x0: Tensor[(1, 4), int8] /* ty=Tensor[(1, 4), int8] */, %x1: Tensor[(1, 4), int8] /* ty=Tensor[(1, 4), int8] */, %x2: Tensor[(1, 4), int8] /* ty=Tensor[(1, 4), int8] */, %x3: Tensor[(1, 4), int8] /* ty=Tensor[(1, 4), int8] */) -> Tensor[(1, 16), int8] {
     %0 = add(0f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     %1 = add(1f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     %2 = add(2f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     %3 = add(3f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     %4 = multiply(0 /* ty=int32 */, 1 /* ty=int32 */) /* ty=int32 */;
     %5 = (%x0, %x1, %x2, %x3) /* ty=(Tensor[(1, 4), int8], Tensor[(1, 4), int8], Tensor[(1, 4), int8], Tensor[(1, 4), int8]) */;
     %6 = (%0, %1, %2, %3) /* ty=(float32, float32, float32, float32) */;
     %7 = (%4, %4, %4, %4) /* ty=(int32, int32, int32, int32) */;
     %8 = add(3f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=float32 */;
     qnn.concatenate(%5, %6, %7, %8, %4, axis=1) /* ty=Tensor[(1, 16), int8] */
   }
   ```
   
   But the test I wrote fails because requantization doesn't support non-constant scales and zero points, and the concat operation calls requantize under the hood. I'm not sure how soon I will get into fixing that issue.
   
   This might not be a problem for the graph originally proposed in the issue, since everything there seems to have the same scale and zero point, but it's not a general solution with the current backend.
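
To illustrate why an integer concatenate needs a requantize step, here is a minimal NumPy sketch of the arithmetic. It is not TVM's implementation: the helper names (`dequantize`, `quantize`, `requantize`, `float_reference_concat`, `integer_concat`) are hypothetical, the math is done in float64 rather than fixed point, and rounding follows `np.round`. The scales (0.5, 1.5, 2.5, 3.5), the output scale (3.5), and the zero point (0) mirror the values in the graphs above.

```python
import numpy as np


def dequantize(q, scale, zero_point):
    # Map integer values back to real values: (q - zp) * scale.
    return (q.astype(np.float64) - zero_point) * scale


def quantize(x, scale, zero_point, dtype=np.int8):
    # Map real values to integers and saturate to the dtype range.
    info = np.iinfo(dtype)
    q = np.round(x / scale) + zero_point
    return np.clip(q, info.min, info.max).astype(dtype)


def requantize(q, in_scale, in_zp, out_scale, out_zp, dtype=np.int8):
    # Rescale integers from one (scale, zero point) pair to another
    # without materializing the float tensor as a separate graph node.
    info = np.iinfo(dtype)
    r = np.round((q.astype(np.float64) - in_zp) * (in_scale / out_scale)) + out_zp
    return np.clip(r, info.min, info.max).astype(dtype)


def float_reference_concat(inputs, scales, zps, out_scale, out_zp):
    # The original graph: dequantize each input, concatenate, quantize.
    parts = [dequantize(q, s, z) for q, s, z in zip(inputs, scales, zps)]
    return quantize(np.concatenate(parts, axis=1), out_scale, out_zp)


def integer_concat(inputs, scales, zps, out_scale, out_zp):
    # The rewritten graph: requantize each input to the output
    # parameters, then concatenate the integer tensors directly.
    parts = [requantize(q, s, z, out_scale, out_zp)
             for q, s, z in zip(inputs, scales, zps)]
    return np.concatenate(parts, axis=1)


inputs = [np.array([[-8, -3, 4, 7]], dtype=np.int8) for _ in range(4)]
scales = [0.5, 1.5, 2.5, 3.5]
zps = [0, 0, 0, 0]

reference = float_reference_concat(inputs, scales, zps, 3.5, 0)
rewritten = integer_concat(inputs, scales, zps, 3.5, 0)
```

Each input whose scale differs from the output scale has to be rescaled, which is where the requantize call comes from. When an input already shares the output's scale and zero point, the rescale is the identity, which is consistent with the same-scale graph from the issue not hitting the limitation.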


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] mikepapadim commented on pull request #12889: [Draft][FQ2I] Support non-const scales and zero points in fq2i

Posted by GitBox <gi...@apache.org>.

mikepapadim commented on PR #12889:
URL: https://github.com/apache/tvm/pull/12889#issuecomment-1283689339

   @AndrewZhaoLuo I will take a look at this one.

