Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/12/07 21:26:12 UTC

[GitHub] [tvm] masahi opened a new pull request, #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

masahi opened a new pull request, #13578:
URL: https://github.com/apache/tvm/pull/13578

   Closes https://github.com/apache/tvm/issues/13545




[GitHub] [tvm] masahi commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

masahi commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043847053


##########
python/tvm/relay/frontend/onnx.py:
##########
@@ -1391,7 +1391,7 @@ def _impl_v1(cls, inputs, attr, params):
         dtype = input0_state.checked_type.dtype
         # Y = alpha * A * B + beta * C
         alpha = float(attr.get("alpha", 1.0))
-        beta = float(attr.get("beta", 1.0))
+        beta = float(attr.get("beta"))

Review Comment:
   Constant folding doesn't work when beta multiplies an output of qnn ops, since we cannot fold over them. The model in https://github.com/apache/tvm/issues/13545 has `multiply(1f, dequantize(bias))` after `dense`, which was also causing some issues.
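   
   For illustration, here is a minimal Relay sketch of that pattern (the bias shape and quantization params are hypothetical, not taken from the model in the issue):
   
   ```
   import numpy as np
   from tvm import relay
   
   # Hypothetical quantized bias constant.
   bias = relay.const(np.zeros((256,), dtype="int8"))
   deq = relay.qnn.op.dequantize(
       bias, relay.const(0.1, "float32"), relay.const(0, "int32")
   )
   # beta == 1.0 survives as an explicit multiply; constant folding
   # cannot fold through the qnn.dequantize feeding it.
   out = relay.const(1.0, "float32") * deq
   ```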





[GitHub] [tvm] tvm-bot commented on pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

tvm-bot commented on PR #13578:
URL: https://github.com/apache/tvm/pull/13578#issuecomment-1341618008

   
   Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from [Reviewers](https://github.com/apache/incubator-tvm/blob/master/CONTRIBUTORS.md#reviewers) by @-ing them in a comment.
   
    * No users to tag found in teams: `fq2i`. See [#10317](https://github.com/apache/tvm/issues/10317) for details.
   
   Generated by [tvm-bot](https://github.com/apache/tvm/blob/main/ci/README.md#github-actions)




[GitHub] [tvm] masahi commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

masahi commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043849513


##########
python/tvm/relay/frontend/onnx.py:
##########
@@ -1391,7 +1391,7 @@ def _impl_v1(cls, inputs, attr, params):
         dtype = input0_state.checked_type.dtype
         # Y = alpha * A * B + beta * C
         alpha = float(attr.get("alpha", 1.0))
-        beta = float(attr.get("beta", 1.0))
+        beta = float(attr.get("beta"))

Review Comment:
   Moved `float(beta)` to the `else` block of `if beta is None`.
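   
   A minimal sketch of the shape of that change (the standalone helper below is hypothetical; in the PR the logic is inline in `onnx.py`):
   
   ```
   from tvm.relay import expr as _expr
   
   def _add_bias(out, bias_input, beta, dtype):
       # float(beta) is only evaluated in the else branch, so a missing
       # "beta" attribute (beta is None) never reaches float().
       if beta is None:
           return out + bias_input
       return out + _expr.const(float(beta), dtype=dtype) * bias_input
   ```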





[GitHub] [tvm] masahi commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

masahi commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043841295


##########
python/tvm/relay/transform/fake_quantization_to_integer.py:
##########
@@ -502,6 +502,33 @@ def register_binary_qnn(op_name, op):
 
     def binary(expr, type_map):
         left, right, left_t, right_t, out_t = get_binary_types(expr, type_map)
+
+        if (
+            approx_equal(left_t.scale, right_t.scale)

Review Comment:
   But right, this code path is only meant for catching a 32-bit `add` used as bias addition. I'll update the condition to catch only this case.
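   
   A sketch of what the narrowed condition might look like (`left`, `right`, `left_t`, `right_t`, `op`, and `approx_equal` come from the surrounding diff; the exact condition in the PR may differ):
   
   ```
   # Only take this path for the bias-add pattern: matching quantization
   # params on both sides, and a 32-bit dtype so the plain add cannot
   # overflow the way an 8-bit add would.
   if (
       approx_equal(left_t.scale, right_t.scale)
       and approx_equal(left_t.zero_point, right_t.zero_point)
       and left_t.dtype == right_t.dtype
       and left_t.dtype == "int32"
   ):
       out = op(left, right)
   ```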





[GitHub] [tvm] masahi commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

masahi commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043860122


##########
python/tvm/relay/frontend/onnx.py:
##########
@@ -1391,7 +1391,7 @@ def _impl_v1(cls, inputs, attr, params):
         dtype = input0_state.checked_type.dtype
         # Y = alpha * A * B + beta * C
         alpha = float(attr.get("alpha", 1.0))
-        beta = float(attr.get("beta", 1.0))
+        beta = float(attr.get("beta"))

Review Comment:
   Actually, the whole purpose of this change was to avoid multiplying by 1.0, since `multiply(1f, dequantize(bias))` would be converted to `qnn.mul(quantize(1), bias)` by FQ2I. So I restored the original code. cc @Icemist
   
   An alternative would be to add algebraic simplification to the `SimplifyExpr` pass.





[GitHub] [tvm] AndrewZhaoLuo commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

AndrewZhaoLuo commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043890901


##########
python/tvm/relay/transform/fake_quantization_to_integer.py:
##########
@@ -502,6 +502,33 @@ def register_binary_qnn(op_name, op):
 
     def binary(expr, type_map):
         left, right, left_t, right_t, out_t = get_binary_types(expr, type_map)
+
+        if (
+            approx_equal(left_t.scale, right_t.scale)

Review Comment:
   Yes, that is fine by me. Can you update the comment to warn that a 32-bit dtype is really important for avoiding overflow?
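   
   To make the overflow point concrete, a standalone NumPy illustration (not code from the PR):
   
   ```
   import numpy as np
   
   # 8-bit accumulation wraps around silently; 32 bits is safe here.
   acc8 = np.array([100], dtype="int8") + np.array([100], dtype="int8")
   acc32 = np.array([100], dtype="int32") + np.array([100], dtype="int32")
   print(acc8)   # [-56]  (200 wraps modulo 256 in int8)
   print(acc32)  # [200]
   ```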





[GitHub] [tvm] masahi commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

masahi commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043798545


##########
python/tvm/relay/frontend/onnx.py:
##########
@@ -1406,7 +1406,10 @@ def _impl_v1(cls, inputs, attr, params):
             inputs[0] *= _expr.const(alpha, dtype=dtype)
         out = _op.nn.dense(inputs[0], inputs[1], units=channels)
         if len(inputs) == 3:
-            out = out + _expr.const(beta, dtype=dtype) * inputs[2]
+            if beta != 1.0:

Review Comment:
   Makes sense, there is no need to use 1.0 as the default and compare against it. Updated.





[GitHub] [tvm] AndrewZhaoLuo commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

AndrewZhaoLuo commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043831311


##########
python/tvm/relay/transform/fake_quantization_to_integer.py:
##########
@@ -502,6 +502,33 @@ def register_binary_qnn(op_name, op):
 
     def binary(expr, type_map):
         left, right, left_t, right_t, out_t = get_binary_types(expr, type_map)
+
+        if (
+            approx_equal(left_t.scale, right_t.scale)

Review Comment:
   Hmm, I don't believe this workaround is a good idea, because it does not handle overflow well, whereas the qnn ops do.
   
   E.g., if we had 128 * 128, we would have overflow issues.
   
   Hmm, the main problem we have, if I understand correctly, is dealing with scales and zero points that are of shape [1] rather than scalars.
   
   Can we just unpack scales and zero points into scalars then?
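   
   For what it's worth, a hypothetical sketch of that unpacking, assuming the scale and zero point are `relay.Constant`s (names are illustrative):
   
   ```
   from tvm import relay
   
   def _as_scalar(qparam):
       # Collapse a shape-[1] constant into a 0-d scalar constant;
       # leave anything else untouched.
       if isinstance(qparam, relay.Constant):
           value = qparam.data.numpy()
           if value.ndim == 1 and value.size == 1:
               return relay.const(value.item(), dtype=str(value.dtype))
       return qparam
   ```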



##########
python/tvm/relay/frontend/onnx.py:
##########
@@ -1391,7 +1391,7 @@ def _impl_v1(cls, inputs, attr, params):
         dtype = input0_state.checked_type.dtype
         # Y = alpha * A * B + beta * C
         alpha = float(attr.get("alpha", 1.0))
-        beta = float(attr.get("beta", 1.0))
+        beta = float(attr.get("beta"))

Review Comment:
   Though the change on L1409 might not be needed, since if beta == 1, the multiply can be removed with constant folding.



##########
python/tvm/relay/frontend/onnx.py:
##########
@@ -1391,7 +1391,7 @@ def _impl_v1(cls, inputs, attr, params):
         dtype = input0_state.checked_type.dtype
         # Y = alpha * A * B + beta * C
         alpha = float(attr.get("alpha", 1.0))
-        beta = float(attr.get("beta", 1.0))
+        beta = float(attr.get("beta"))

Review Comment:
   I would keep the original line of `.get('beta', 1.0)`, since you cannot call `float()` on `None`, which `attr.get` can return.
   
   Then below on L1409, you can just change `if beta is None` to `if 'beta' not in attr.keys()` or something.





[GitHub] [tvm] Icemist commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

Icemist commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043642842


##########
python/tvm/relay/frontend/onnx.py:
##########
@@ -1406,7 +1406,10 @@ def _impl_v1(cls, inputs, attr, params):
             inputs[0] *= _expr.const(alpha, dtype=dtype)
         out = _op.nn.dense(inputs[0], inputs[1], units=channels)
         if len(inputs) == 3:
-            out = out + _expr.const(beta, dtype=dtype) * inputs[2]
+            if beta != 1.0:

Review Comment:
   Since we do not use 1.0 in the calculations, I would suggest not using it at all. Something like:
   ```
   beta = attr.get("beta")
   ```
   ...
   ```
   if beta is None:
       out = out + inputs[2]
   else:
       out = out + _expr.const(float(beta), dtype=dtype) * inputs[2]
   ```
   
   
   or 
   ```
   input_2 = inputs[2]
   if beta is not None:  # can be written as a one-line ternary
       input_2 = input_2 * _expr.const(float(beta), dtype=dtype)
   out = out + input_2
   ```
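   
   For reference, the one-line ternary form mentioned in the comment could be (same hypothetical names):
   
   ```
   input_2 = input_2 * _expr.const(float(beta), dtype=dtype) if beta is not None else input_2
   ```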
   







[GitHub] [tvm] masahi merged pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

masahi merged PR #13578:
URL: https://github.com/apache/tvm/pull/13578




[GitHub] [tvm] masahi commented on a diff in pull request #13578: [FQ2I] Support converting `dense` -> `add` to `qnn.dense` -> `add` -> `requantize`

masahi commented on code in PR #13578:
URL: https://github.com/apache/tvm/pull/13578#discussion_r1043839934


##########
python/tvm/relay/transform/fake_quantization_to_integer.py:
##########
@@ -502,6 +502,33 @@ def register_binary_qnn(op_name, op):
 
     def binary(expr, type_map):
         left, right, left_t, right_t, out_t = get_binary_types(expr, type_map)
+
+        if (
+            approx_equal(left_t.scale, right_t.scale)

Review Comment:
   `out_t.scale` and `out_t.zero_point` can be of shape `(channels,)` if the addition is used as bias add. If I comment out this if statement and run the test attached in this PR, I get
   
   `AssertionError: The output scale needs to be a scalar, but got a tensor of shape (256,)` 
   
   from the assert I added.
   
   I think there is something weird in the implementation of FQ2I, since it doesn't make sense to have an output scale of shape `(channels,)`.  But that's how it works now, so this is the workaround I came up with.
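   
   A minimal sketch of the kind of guard described, assuming the output scale is a `relay.Constant` (the exact assert in the PR may differ):
   
   ```
   def _assert_scalar_scale(scale_const):
       # The requantize path expects a scalar output scale; a per-channel
       # (channels,) tensor should fail loudly here.
       arr = scale_const.data.numpy()
       assert arr.ndim == 0 or arr.size == 1, (
           "The output scale needs to be a scalar, "
           f"but got a tensor of shape {arr.shape}"
       )
   ```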



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org