Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/04/12 20:19:58 UTC

[GitHub] [tvm] masahi commented on a diff in pull request #10450: [ONNX] Add MatMulInteger importer

masahi commented on code in PR #10450:
URL: https://github.com/apache/tvm/pull/10450#discussion_r848835556


##########
python/tvm/topi/cuda/tensorcore_alter_op.py:
##########
@@ -148,16 +148,18 @@ def _dense_legalize(attrs, inputs, arg_types):
 
     # Pad input and output channels to use tensorcore schedule.
     if dtype in ["float16", "int8", "uint8"]:
-        # The shape of (M, K, N) must be multiple of (16, 16, 16) or (32, 16, 8) or (8, 16, 32)
+        # The shape of (M, K, N) must be multiple of
+        # (16, 16, 16) or (32, 16, 8) or (8, 16, 32) or (4, 4, 4)
         if (
             (M % 8 == 0 and K % 16 == 0 and N % 32 == 0)
             or (M % 16 == 0 and K % 16 == 0 and N % 16 == 0)
             or (M % 32 == 0 and K % 16 == 0 and N % 8 == 0)
+            or (M % 4 == 0 and K % 4 == 0 and N % 4 == 0)
         ):
             # no need to pad
             return None
 
-        candidates = [(16, 16, 16), (32, 16, 8), (8, 16, 32)]
+        candidates = [(16, 16, 16), (32, 16, 8), (8, 16, 32), (4, 4, 4)]

Review Comment:
   Can you try decoupling the `(4, 4, 4)` padding from the tensorcore logic in this file? `(4, 4, 4)` is not a valid shape for tensorcore.
   
   For int8, rather than hard-coding `(4, 4, 4)` (which doesn't work for other shapes, e.g. a dim of 13), I think the right solution is to pad each dim up to the nearest multiple of 4 that is greater than or equal to the given dim.
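
A minimal sketch of the "pad to the nearest multiple of 4" idea suggested above, in plain Python. The helper names `round_up` and `pad_amounts` are hypothetical and are not part of TVM; the sketch only computes how much each of `M`, `K`, `N` would need to be padded, and leaves out how the padding would be wired into a legalize function.

```python
def round_up(dim, multiple=4):
    """Return the smallest multiple of `multiple` that is >= dim."""
    return ((dim + multiple - 1) // multiple) * multiple


def pad_amounts(M, K, N, multiple=4):
    """Return (dm, dk, dn): how much to add to each dim to reach a multiple.

    Hypothetical helper for illustration only; a dim that is already a
    multiple gets a padding of 0.
    """
    return tuple(round_up(d, multiple) - d for d in (M, K, N))


# Example: a dim of 13 is padded up to 16, so it needs 3 extra elements.
dm, dk, dn = pad_amounts(13, 4, 7)
```

Unlike a fixed candidate list, this computes the padding per dimension, so no padding is applied to dims that are already multiples of 4.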



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org