Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/09/01 11:46:27 UTC

[GitHub] [tvm] guberti opened a new pull request, #12671: Allow int8 operations for Cortex-M cores

guberti opened a new pull request, #12671:
URL: https://github.com/apache/tvm/pull/12671

   Currently, Relay QNN uses its `helper_no_fast_int8_hw_legalization` to change all `int8` operations in Cortex-M-compiled models into `int16` operations. I believe we do this because Cortex-M chips do not have a `4xint8` multiply-accumulate instruction, so in some sense we do not have fast `int8` operations.
   
   However, changing our operators to `int16` is substantially slower: while it saves a few sign-extension operations, it doubles the number of memory loads we need to perform. This PR changes this behavior and ensures that `int8` operations on Cortex-M microcontrollers are not turned into `int16` ones.
   
   I have also verified that this does, in fact, improve performance for some common models. For example, MobileNet_v1_0.25 on the Cortex-M4 saw a 10% performance improvement compared to before this change. Accuracy does not appear to be affected.
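   The memory trade-off described above can be illustrated with a short, hypothetical sketch (plain NumPy, not TVM code; the tensor shape is chosen for illustration only): upcasting `int8` weights to `int16` doubles the bytes that must be streamed from flash or RAM on every inference.
   
   ```python
   import numpy as np
   
   # Hypothetical conv2d weight tensor; shape is illustrative only.
   weights_int8 = np.zeros((128, 3, 3, 8), dtype=np.int8)
   
   # What int16 legalization effectively produces.
   weights_int16 = weights_int8.astype(np.int16)
   
   print(weights_int8.nbytes)   # 9216 bytes
   print(weights_int16.nbytes)  # 18432 bytes -- twice the memory traffic
   ```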


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] guberti commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961315831


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +439,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():

Review Comment:
   I expanded upon the docstring for `is_cortexm_arm` - let me know if you still have questions.





[GitHub] [tvm] Mousius commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
Mousius commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961512539


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -417,6 +417,16 @@ def is_aarch64_arm():
     return "aarch64" in target.attrs.get("mtriple", "")
 
 
+def is_cortexm_arm():
+    """Checks whether we are compiling for a Cortex-M target. We want to preserve
+    int8 operations for these boards, because it halves the number of memory loads
+    needed for dense layers and convolutions. It comes at a cost of needing more
+    micro kernels and sign extension instructions to load these values, but this
+    trade-off seems to be worth it on Cortex-M."""
+    target = tvm.target.Target.current(allow_none=False)
+    return "cortex-m" in target.attrs.get("mcpu", "")

Review Comment:
   This may be unnecessary with my other comment 😸 





[GitHub] [tvm] guberti commented on pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
guberti commented on PR #12671:
URL: https://github.com/apache/tvm/pull/12671#issuecomment-1235426485

   @Mousius I've changed when `helper_no_fast_int8_hw_legalization` is invoked to match what you suggested - please take another look!




[GitHub] [tvm] guberti commented on a diff in pull request #12671: [Relay] Change when int8 operations are converted to int16 on Arm

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r964826851


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +443,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():
         return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
     return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)

Review Comment:
   Yep, that sounds good, and should make us pass the existing unit tests. I think we should be good to merge once the CI is green.





[GitHub] [tvm] guberti commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961315831


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +439,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():

Review Comment:
   I put some additional justification in the `is_cortexm_arm` docstring - let me know if you still have questions.





[GitHub] [tvm] Mousius commented on pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
Mousius commented on PR #12671:
URL: https://github.com/apache/tvm/pull/12671#issuecomment-1235495832

   @guberti can you update the PR title/description? 😸 




[GitHub] [tvm] guberti commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961611822


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +443,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():
         return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
     return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)

Review Comment:
   Sounds good to me.





[GitHub] [tvm] AndrewZhaoLuo commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
AndrewZhaoLuo commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961055706


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +439,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():

Review Comment:
   Since it is a bit of a special case.





[GitHub] [tvm] areusch commented on pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
areusch commented on PR #12671:
URL: https://github.com/apache/tvm/pull/12671#issuecomment-1234893034

   cc @ekalda @u99127 @leandron @Mousius could you have a look for architectural oversight? Keeping everything as int8 likely makes the most sense for all vN-M chipsets, but I'm curious whether you agree this is the right way to identify this subset. Some schedules might apply only to DSP (for instance, see #12448), but I think those schedules can be selected separately from disabling this conversion.




[GitHub] [tvm] Mousius commented on a diff in pull request #12671: [Relay] Change when int8 operations are converted to int16 on Arm

Posted by GitBox <gi...@apache.org>.
Mousius commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r964618472


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +443,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():
         return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
     return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)

Review Comment:
   I think you're right - my boolean logic was off, methinks. As I remember it, the logic should cast for ASIMD and opt out if there's another option which might work better:
   
   ```python
       use_int8_on_arm = (not is_depthwise) and attrs["data_layout"] == "NHWC"
       has_dotprod = is_fast_int8_on_arm()
       other_options = use_int8_on_arm or has_dotprod
       if has_asimd() and not other_options:
           return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)
       return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
   ```
   
   Does that sound right to you? 😸
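   The proposed decision logic can be restated as a standalone pure function (hypothetical name, extracted here for illustration only; the real code reads these flags from the TVM target):
   
   ```python
   # Hypothetical pure-function restatement of the logic above.
   def should_keep_int8(is_depthwise, data_layout, has_dotprod, has_asimd):
       use_int8_on_arm = (not is_depthwise) and data_layout == "NHWC"
       other_options = use_int8_on_arm or has_dotprod
       # Upcast to int16 only when ASIMD is present but nothing faster is.
       return not (has_asimd and not other_options)
   
   # Depthwise NCHW conv on a plain-ASIMD core: fall back to int16.
   print(should_keep_int8(True, "NCHW", False, True))   # False
   # Same conv on a core without ASIMD (e.g. Cortex-M): keep int8.
   print(should_keep_int8(True, "NCHW", False, False))  # True
   ```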





[GitHub] [tvm] guberti commented on a diff in pull request #12671: [Relay] Change when int8 operations are converted to int16 on Arm

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961611822


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +443,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():
         return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
     return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)

Review Comment:
   ~~Sounds good to me.~~ Actually, I'm not sure I understand. Are you proposing we do the checks like this?
   ```python
       if (
           (not is_depthwise) and has_asimd and attrs["data_layout"] == "NHWC"
           and (not is_fast_int8_on_arm())
       ):
           return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)
       return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
   ```
   I suspect I'm misunderstanding, as it seems `attrs["data_layout"] == "NHWC"` would make `int8` a *better* choice (plus previously, having `attrs["data_layout"] == "NHWC"` made us more likely to use `int8` operators). Would you mind clarifying @Mousius?





[GitHub] [tvm] Mousius merged pull request #12671: [Relay] Change when int8 operations are converted to int16 on Arm

Posted by GitBox <gi...@apache.org>.
Mousius merged PR #12671:
URL: https://github.com/apache/tvm/pull/12671




[GitHub] [tvm] Mousius commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
Mousius commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961506187


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -417,6 +417,16 @@ def is_aarch64_arm():
     return "aarch64" in target.attrs.get("mtriple", "")
 
 
+def is_cortexm_arm():
+    """Checks whether we are compiling for a Cortex-M target. We want to preserve
+    int8 operations for these boards, because it halves the number of memory loads
+    needed for dense layers and convolutions. It comes at a cost of needing more
+    micro kernels and sign extension instructions to load these values, but this
+    trade-off seems to be worth it on Cortex-M."""
+    target = tvm.target.Target.current(allow_none=False)
+    return "cortex-m" in target.attrs.get("mcpu", "")

Review Comment:
   Can we add this to https://github.com/apache/tvm/blob/main/src/target/parsers/mprofile.c as `is_mprofile` ?  We're about to remove all of these helpers in https://github.com/apache/tvm/pull/12454



##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +443,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():
         return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
     return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)

Review Comment:
   I think this is actually the same as an issue from a while back (https://github.com/apache/tvm/issues/8717#issuecomment-976400930); instead of special-casing with `is_cortexm`, you should be able to use something like:
   
   ```suggestion
       has_asimd = (is_aarch64_arm() or "+neon" in target.mattr)
       if has_asimd and not is_fast_int8_on_arm():
           return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)
       return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
   ```
   
   Which means it will only do the casting when the specific architecture features are available (`has_asimd` will become `target.features.has_asimd` after https://github.com/apache/tvm/pull/12454, so this is fine temporarily)
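   As a minimal sketch (hypothetical helper name, not the TVM API), this feature-gated rule reduces to:
   
   ```python
   # Hypothetical sketch: upcast to int16 only when ASIMD is available
   # but no fast int8 (dotprod) instructions are.
   def upcast_to_int16(has_asimd: bool, has_fast_int8: bool) -> bool:
       return has_asimd and not has_fast_int8
   
   print(upcast_to_int16(True, False))   # True  (plain ASIMD core)
   print(upcast_to_int16(True, True))    # False (dotprod available)
   print(upcast_to_int16(False, False))  # False (e.g. Cortex-M keeps int8)
   ```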





[GitHub] [tvm] AndrewZhaoLuo commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
AndrewZhaoLuo commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961055465


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +439,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():

Review Comment:
   Can you add a comment?





[GitHub] [tvm] guberti commented on pull request #12671: [Relay] Change when int8 operations are converted to int16 on Arm

Posted by GitBox <gi...@apache.org>.
guberti commented on PR #12671:
URL: https://github.com/apache/tvm/pull/12671#issuecomment-1235712132

   > @guberti can you update the PR title/description? 😸
   
   @Mousius fixed!




[GitHub] [tvm] guberti commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961524342


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +443,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():
         return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
     return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)

Review Comment:
   That makes a lot of sense to me, and is definitely cleaner than special-casing Cortex-M. To clarify though - are you suggesting removing the checks for `is_depthwise` and whether `attrs["data_layout"] == "NHWC"`?





[GitHub] [tvm] Mousius commented on a diff in pull request #12671: [microTVM] Allow int8 operations for Cortex-M cores

Posted by GitBox <gi...@apache.org>.
Mousius commented on code in PR #12671:
URL: https://github.com/apache/tvm/pull/12671#discussion_r961528120


##########
python/tvm/relay/qnn/op/legalizations.py:
##########
@@ -433,15 +443,15 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["groups"],
     )
     use_int8_on_arm = (not is_depthwise) and is_aarch64_arm() and attrs["data_layout"] == "NHWC"
-    if use_int8_on_arm or is_fast_int8_on_arm():
+    if use_int8_on_arm or is_fast_int8_on_arm() or is_cortexm_arm():
         return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
     return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.nn.conv2d)

Review Comment:
   Good spot, I think we should keep the check as `(not is_depthwise) and has_asimd and attrs["data_layout"] == "NHWC"` for invoking the helper with the casting?


