Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/11/22 10:30:34 UTC

[GitHub] [tvm] vvchernov opened a new issue, #13467: [ONNX][Relay][QNN] tvm doesn't support variable zero point for qnn batch_matmul

vvchernov opened a new issue, #13467:
URL: https://github.com/apache/tvm/issues/13467

   ### Expected behavior
   Support for a variable zero point in qnn batch_matmul
   
   ### Actual behavior
   A quantized [model](https://huggingface.co/philschmid/quantized-distilbert-banking77?text=I+like+you.+I+love+you) from Hugging Face fails during compilation because it uses a variable zero point. To observe this problem, the input-type-matching line from [issue](https://github.com/apache/tvm/issues/13466) should be commented out.
   
   Part of the error log (note that the zero point here is computed at runtime from the tensor's min/max values, so it is not a constant expression):
   
   >   1: tvm::relay::qnn::QnnBatchMatmulCanonicalize(tvm::Attrs const&, tvm::runtime::Array<tvm::RelayExpr, void> const&, tvm::runtime::Array<tvm::Type, void> const&)
   >         at /home/user/Workshop/tvm/src/relay/qnn/op/batch_matmul.cc:179
   >   0: int tvm::relay::GetScalarFromConstant<int>(tvm::RelayExpr)
   >         at /home/user/Workshop/tvm/src/relay/qnn/op/../../op/nn/../../transforms/pattern_utils.h:641
   >   File "/home/user/Workshop/tvm/src/relay/transforms/./pattern_utils.h", line 641
   > TVMError:
   > ---------------------------------------------------------------
   > An error occurred during the execution of TVM.
   > For more information, please see: https://tvm.apache.org/docs/errors.html
   > ---------------------------------------------------------------
   >   Check failed: (n) is false: Expr must be a constant expr - #[version = "0.0.5"]
   > free_var %input_ids: Tensor[(1, 54), int64] /* ty=Tensor[(1, 54), int64] */;
   > %0 = less(%input_ids, 0i64 /* ty=int64 */) /* ty=Tensor[(1, 54), bool] */;
   > %1 = add(%input_ids, 30522i64 /* ty=int64 */) /* ty=Tensor[(1, 54), int64] */;
   > %2 = where(%0, %1, %input_ids) /* ty=Tensor[(1, 54), int64] */;
   > %3 = take(meta[relay.Constant][1] /* ty=Tensor[(30522, 768), float32] */, %2, axis=0) /* ty=Tensor[(1, 54, 768), float32] */;
   > %4 = add(%3, meta[relay.Constant][2] /* ty=Tensor[(1, 54, 768), float32] */) /* ty=Tensor[(1, 54, 768), float32] */;
   > %5 = mean(%4, axis=[-1], keepdims=True) /* ty=Tensor[(1, 54, 1), float32] */;
   > %6 = subtract(%4, %5) /* ty=Tensor[(1, 54, 768), float32] */;
   > %7 = power(%6, 2f /* ty=float32 */) /* ty=Tensor[(1, 54, 768), float32] */;
   > %8 = mean(%7, axis=[-1], keepdims=True) /* ty=Tensor[(1, 54, 1), float32] */;
   > %9 = add(%8, 1e-12f /* ty=float32 */) /* ty=Tensor[(1, 54, 1), float32] */;
   > %10 = sqrt(%9) /* ty=Tensor[(1, 54, 1), float32] */;
   > %11 = divide(%6, %10) /* ty=Tensor[(1, 54, 768), float32] */;
   > %12 = multiply(%11, meta[relay.Constant][3] /* ty=Tensor[(768), float32] */) /* ty=Tensor[(1, 54, 768), float32] */;
   > %13 = add(%12, meta[relay.Constant][4] /* ty=Tensor[(768), float32] */) /* ty=Tensor[(1, 54, 768), float32] */;
   > %14 = max(%13) /* ty=float32 */;
   > %15 = min(%13) /* ty=float32 */;
   > %16 = maximum(0f /* ty=float32 */, %14) /* ty=float32 */;
   > %17 = minimum(0f /* ty=float32 */, %15) /* ty=float32 */;
   > %18 = subtract(%16, %17) /* ty=float32 */;
   > %19 = divide(%18, 255f /* ty=float32 */) /* ty=float32 */;
   > %20 = divide(%13, %19);
   > %21 = min(%13) /* ty=float32 */;
   > %22 = divide(%21, %19) /* ty=float32 */;
   > %23 = subtract(0f /* ty=float32 */, %22) /* ty=float32 */;
   > %24 = clip(%23, a_min=0f, a_max=255f) /* ty=float32 */;
   > %25 = round(%24) /* ty=float32 */;
   > %26 = cast(%25, dtype="uint8") /* ty=uint8 */;
   > %27 = cast(%26, dtype="int32") /* ty=int32 */;
   > %28 = round(%20);
   > %29 = cast(%27, dtype="float32");
   > %30 = add(%28, %29);
   > %31 = clip(%30, a_min=0f, a_max=255f);
   > %32 = cast(%31, dtype="uint8");
   > %33 = reshape(%32, newshape=[-1, 768]) /* ty=Tensor[(54, 768), uint8] */;
   > %34 = cast(%33, dtype="int32");
   > %35 = sum(%34, axis=[1], keepdims=True);
   > %36 = nn.dense(%33, meta[relay.Constant][5] /* ty=Tensor[(768, 768), int8] */, units=768, out_dtype="int32");
   > %37 = multiply(0 /* ty=int32 */, %35);
   > %38 = cast(%26, dtype="int32") /* ty=int32 */;
   > %39 = multiply(%38, 0 /* ty=int32 */);
   > %40 = cast(meta[relay.Constant][5] /* ty=Tensor[(768, 768), int8] */, dtype="int32");
   > %41 = sum(%40, axis=[1]);
   > %42 = multiply(%39, 768);
   > %43 = multiply(%38, %41);
   > %44 = subtract(%36, %37);
   > %45 = subtract(%42, %43);
   > %46 = add(%44, %45);
   > %47 = reshape(%46, newshape=[1, 54, 768]) /* ty=Tensor[(1, 54, 768), int32] */;
   > %48 = cast(%47, dtype="float32") /* ty=Tensor[(1, 54, 768), float32] */;
   > %49 = multiply(%19, 0.00736962f /* ty=float32 */) /* ty=float32 */;
   > %50 = multiply(%48, %49) /* ty=Tensor[(1, 54, 768), float32] */;
   > %51 = add(meta[relay.Constant][0] /* ty=Tensor[(768), float32] */, %50) /* ty=Tensor[(1, 54, 768), float32] */;
   > %52 = reshape(%51, newshape=[1, -1, 12, 64]) /* ty=Tensor[(1, 54, 12, 64), float32] */;
   > %53 = transpose(%52, axes=[0, 2, 3, 1]) /* ty=Tensor[(1, 12, 64, 54), float32] */;
   > %54 = max(%53) /* ty=float32 */;
   > %55 = min(%53) /* ty=float32 */;
   > %56 = maximum(0f /* ty=float32 */, %54) /* ty=float32 */;
   > %57 = minimum(0f /* ty=float32 */, %55) /* ty=float32 */;
   > %58 = subtract(%56, %57) /* ty=float32 */;
   > %59 = min(%53) /* ty=float32 */;
   > %60 = divide(%58, 255f /* ty=float32 */) /* ty=float32 */;
   > %61 = divide(%59, %60) /* ty=float32 */;
   > %62 = subtract(0f /* ty=float32 */, %61) /* ty=float32 */;
   > %63 = clip(%62, a_min=0f, a_max=255f) /* ty=float32 */;
   > %64 = round(%63) /* ty=float32 */;
   > %65 = cast(%64, dtype="uint8") /* ty=uint8 */;
   > cast(%65, dtype="int32") /* ty=int32 */
   
   ### Environment
   Ubuntu 20.04 LTS
   
   ### Steps to reproduce
   The usual steps for compiling and running the ONNX model with the VirtualMachine through the Python front end; a sketch follows below.
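   
   A minimal reproduction sketch. The file name `model.onnx` is an assumption; the `input_ids` input of shape (1, 54) is taken from the free_var in the error log above, and the real checkpoint may take additional inputs such as `attention_mask`:
   
   ```python
   import numpy as np
   import onnx
   import tvm
   from tvm import relay
   from tvm.runtime.vm import VirtualMachine
   
   # Load the quantized ONNX export (local path is an assumption).
   onnx_model = onnx.load("model.onnx")
   
   # Input name and shape taken from the free_var in the error log.
   shape_dict = {"input_ids": (1, 54)}
   mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)
   
   # Compilation fails here: QnnBatchMatmulCanonicalize expects a
   # constant zero point, but this model computes it at runtime.
   with tvm.transform.PassContext(opt_level=3):
       vm_exec = relay.vm.compile(mod, target="llvm", params=params)
   
   # With a fix in place, the model can then be run through the VM:
   vm = VirtualMachine(vm_exec, tvm.cpu())
   input_ids = np.random.randint(0, 30522, size=(1, 54)).astype("int64")
   out = vm.invoke("main", tvm.nd.array(input_ids))
   ```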
   
   ### Triage
   * frontend:onnx
   * relay:qnn
   
   ### Notes
   There are several options for the format of scales and zero points: (a) scalar or tensor, (b) constant or variable. Currently, only a constant scalar is supported for qnn batch_matmul. There appear to be no reasonable constraints on the zero point format for any qnn operation. The sketch below contrasts the two scalar cases.
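   
   A minimal sketch of the two scalar cases, assuming the Python signature `relay.qnn.op.batch_matmul(x, y, x_zero_point, y_zero_point, x_scale, y_scale)`; shapes and scale values are illustrative:
   
   ```python
   import tvm
   from tvm import relay
   
   x = relay.var("x", shape=(1, 54, 768), dtype="uint8")
   y = relay.var("y", shape=(1, 54, 768), dtype="uint8")
   scale = relay.const(0.1, "float32")  # illustrative scale value
   
   # (a) Constant scalar zero point: the only form the canonicalization
   # pass currently handles via GetScalarFromConstant.
   zp_const = relay.const(0, "int32")
   ok = relay.qnn.op.batch_matmul(x, y, zp_const, zp_const, scale, scale)
   
   # (b) Variable scalar zero point, as produced by dynamic quantization
   # (min/max computed at runtime). Building the expression succeeds, but
   # compiling it trips the "Expr must be a constant expr" check above.
   zp_var = relay.var("zp", shape=(), dtype="int32")
   bad = relay.qnn.op.batch_matmul(x, y, zp_var, zp_var, scale, scale)
   ```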


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] vvchernov commented on issue #13467: [ONNX][Relay][QNN] tvm doesn't support variable zero point for qnn batch_matmul

Posted by GitBox <gi...@apache.org>.
vvchernov commented on issue #13467:
URL: https://github.com/apache/tvm/issues/13467#issuecomment-1324141720

   cc @masahi




[GitHub] [tvm] vvchernov commented on issue #13467: [ONNX][Relay][QNN] tvm doesn't support variable zero point for qnn batch_matmul

Posted by GitBox <gi...@apache.org>.
vvchernov commented on issue #13467:
URL: https://github.com/apache/tvm/issues/13467#issuecomment-1334791652

   #13469 resolves the variable scalar zero point case. The tensor format is still not supported.




[GitHub] [tvm] vvchernov commented on issue #13467: [ONNX][Relay][QNN] tvm doesn't support variable zero point for qnn batch_matmul

Posted by GitBox <gi...@apache.org>.
vvchernov commented on issue #13467:
URL: https://github.com/apache/tvm/issues/13467#issuecomment-1337850585

   The fix was merged; nevertheless, please do not close this issue. I'm trying to implement a CI test covering this case through the QLinearMatMul op (a sketch of the idea follows below), but there is some problem. Work is in progress.
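   
   One way such a test might be set up (a hypothetical sketch; the shapes, names, and the choice to wire the zero points as graph inputs rather than initializers are assumptions): build a QLinearMatMul model whose zero points arrive as runtime inputs, then import it through the ONNX front end:
   
   ```python
   import onnx
   from onnx import TensorProto, helper
   from tvm import relay
   
   # QLinearMatMul takes eight inputs; wiring the zero points as graph
   # inputs (not initializers) makes them variables on the TVM side.
   node = helper.make_node(
       "QLinearMatMul",
       inputs=["a", "a_scale", "a_zp", "b", "b_scale", "b_zp", "y_scale", "y_zp"],
       outputs=["y"],
   )
   graph = helper.make_graph(
       [node],
       "qlinear_matmul_var_zp",
       inputs=[
           helper.make_tensor_value_info("a", TensorProto.UINT8, [1, 4, 8]),
           helper.make_tensor_value_info("a_scale", TensorProto.FLOAT, []),
           helper.make_tensor_value_info("a_zp", TensorProto.UINT8, []),
           helper.make_tensor_value_info("b", TensorProto.UINT8, [1, 8, 4]),
           helper.make_tensor_value_info("b_scale", TensorProto.FLOAT, []),
           helper.make_tensor_value_info("b_zp", TensorProto.UINT8, []),
           helper.make_tensor_value_info("y_scale", TensorProto.FLOAT, []),
           helper.make_tensor_value_info("y_zp", TensorProto.UINT8, []),
       ],
       outputs=[helper.make_tensor_value_info("y", TensorProto.UINT8, [1, 4, 4])],
   )
   model = helper.make_model(graph)
   onnx.checker.check_model(model)
   
   # Importing and compiling this model should exercise the variable
   # zero point path in the qnn batch_matmul converter.
   mod, params = relay.frontend.from_onnx(model, {"a": (1, 4, 8), "b": (1, 8, 4)})
   ```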

