Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/04/12 17:17:33 UTC

[GitHub] [tvm] hypercubestart opened a new pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart opened a new pull request #7831:
URL: https://github.com/apache/tvm/pull/7831


   adds support for int4 in AutoTVM and fixes bugs, done with @ZihengJiang 
   
   cc: @Laurawly @Hzfengsy @anijain2305 @tqchen @masahi 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] hypercubestart commented on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

Posted by GitBox <gi...@apache.org>.
hypercubestart commented on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-929679387


   > Hi @hypercubestart, great work! I'm working with 4-bit in TVM now, but I found two points that can be improved in asnumpy:
   > 
   > 1. asnumpy doesn't support the conversion of negative numbers.
   > 2. asnumpy loses 4 bits of data when the shape is odd.
   >    Am I right? I have modified these parts locally. Could you review them?
   
   @liubowen520 Good points! This makes sense to me. Feel free to create a PR and cc me and a few other reviewers.





[GitHub] [tvm] hypercubestart edited a comment on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart edited a comment on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-819249259


   > The code LGTM. But would you like to show some performance results for int4?
   
   yes, I'm testing some combinations of the removed knobs and will show perf results once performance reaches parity with the results from https://github.com/apache/tvm/pull/6121





[GitHub] [tvm] jlimmm edited a comment on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

jlimmm edited a comment on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-915113674


   @hypercubestart Hello, I'd like to reproduce your table on T4+int4. It would be appreciated if you could share some sample code. Thank you.





[GitHub] [tvm] ZihengJiang commented on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

ZihengJiang commented on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-829742947


   LGTM. Thanks @hypercubestart !





[GitHub] [tvm] jlimmm commented on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

jlimmm commented on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-915113674


   @hypercubestart Hello, I'd like to reproduce your table on T4+int4. It would be appreciated if you could share some test code. Thank you.





[GitHub] [tvm] ZihengJiang merged pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

ZihengJiang merged pull request #7831:
URL: https://github.com/apache/tvm/pull/7831


   





[GitHub] [tvm] hypercubestart commented on a change in pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart commented on a change in pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#discussion_r621330715



##########
File path: src/runtime/contrib/random/mt_random_engine.cc
##########
@@ -134,12 +134,16 @@ class RandomEngine {
 
  private:
   void FillData(DLTensor* tensor, int64_t size) {
-    // Make the value be 1.0 - 10.0, not (0.0 - 1.0) so that we could satisfy
+    // Make the value be 17.0 - 30.0, not (0.0 - 1.0) so that we could satisfy
     // quantized dtype (uint8 / int8) data non-empty requirement
-    std::uniform_real_distribution<> dist(1.0, 10.0);
+    // We start from 17.0 because two int4 are packed in a single uint8
+    std::uniform_real_distribution<> dist(17.0, 30.0);

Review comment:
       edited so that only integer dtypes use the different distribution (17.0-30.0)







[GitHub] [tvm] liubowen520 commented on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

liubowen520 commented on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-926574313


   Hi @hypercubestart, great work! I'm working with 4-bit in TVM now, but I found two points that can be improved in asnumpy:
   1. asnumpy doesn't support the conversion of negative numbers.
   2. asnumpy loses 4 bits of data when the shape is odd.
   Am I right? I have modified these parts locally. Could you review them?
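   As an illustration of both points (a minimal pure-Python sketch with a hypothetical helper name, not TVM's actual asnumpy code), assuming two int4 values are packed into each uint8 with the low nibble first:

```python
def unpack_int4(packed, n):
    """Unpack n int4 values from packed bytes (low nibble first)."""
    out = []
    for byte in packed:
        for nib in (byte & 0x0F, byte >> 4):
            # Point 1: two's-complement sign extension, so nibbles
            # 8..15 decode to the negative values -8..-1.
            out.append(nib - 16 if nib >= 8 else nib)
    # Point 2: an odd n leaves an unused high nibble in the last byte;
    # trim to n instead of iterating only over whole values and
    # silently dropping the final 4 bits.
    return out[:n]

# 0xF1 packs 1 (low) and -1 (high); 0x05 packs 5 plus an unused nibble.
print(unpack_int4(bytes([0xF1, 0x05]), 3))  # [1, -1, 5]
```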





[GitHub] [tvm] hypercubestart commented on a change in pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart commented on a change in pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#discussion_r612066543



##########
File path: src/runtime/contrib/random/mt_random_engine.cc
##########
@@ -134,12 +134,16 @@ class RandomEngine {
 
  private:
   void FillData(DLTensor* tensor, int64_t size) {
-    // Make the value be 1.0 - 10.0, not (0.0 - 1.0) so that we could satisfy
+    // Make the value be 17.0 - 30.0, not (0.0 - 1.0) so that we could satisfy
     // quantized dtype (uint8 / int8) data non-empty requirement
-    std::uniform_real_distribution<> dist(1.0, 10.0);
+    // We start from 17.0 because two int4 are packed in a single uint8
+    std::uniform_real_distribution<> dist(17.0, 30.0);

Review comment:
       17.0 is the lower bound for both packed int4 values (int4x2) to be > 0; 30.0 is an arbitrary upper bound
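       To make the bound concrete (an illustrative check, not code from the PR): 17 is 0b0001_0001, the smallest uint8 whose low and high nibbles are both non-zero, so every value drawn from [17.0, 30.0) keeps both packed int4 halves non-empty:

```python
def nibbles(byte):
    # Split a uint8 into its packed low and high 4-bit values.
    return byte & 0x0F, byte >> 4

print(nibbles(16))  # (0, 1) -- low int4 would be zero
print(nibbles(17))  # (1, 1) -- smallest value with both halves non-zero
print(all(lo > 0 and hi > 0 for lo, hi in map(nibbles, range(17, 31))))  # True
```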







[GitHub] [tvm] Hzfengsy commented on a change in pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

Hzfengsy commented on a change in pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#discussion_r612146748



##########
File path: src/runtime/contrib/random/mt_random_engine.cc
##########
@@ -134,12 +134,16 @@ class RandomEngine {
 
  private:
   void FillData(DLTensor* tensor, int64_t size) {
-    // Make the value be 1.0 - 10.0, not (0.0 - 1.0) so that we could satisfy
+    // Make the value be 17.0 - 30.0, not (0.0 - 1.0) so that we could satisfy
     // quantized dtype (uint8 / int8) data non-empty requirement
-    std::uniform_real_distribution<> dist(1.0, 10.0);
+    // We start from 17.0 because two int4 are packed in a single uint8
+    std::uniform_real_distribution<> dist(17.0, 30.0);

Review comment:
       It makes sense for int4, but I'm not sure whether it will influence other workloads (e.g. int8).







[GitHub] [tvm] hypercubestart commented on a change in pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart commented on a change in pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#discussion_r612748696



##########
File path: python/tvm/topi/cuda/conv2d_hwnc_tensorcore.py
##########
@@ -254,13 +253,8 @@ def schedule_hwnc_tensorcore_cuda(cfg, s, Conv):
     vector_as = cfg["vector_as"].val
     vector_ws = cfg["vector_ws"].val
     split_block_k_nums = cfg["split_block_k_nums"].val
-    fuse_pack = cfg["fuse_pack"].val
 
-    if not fuse_pack:
-        s[packed_data].compute_inline()
-    else:
-        with Target("cuda"):
-            schedule_injective_from_existing(s, packed_data)
+    s[packed_data].compute_inline()

Review comment:
       this is a problem for both `int4` and `int8`







[GitHub] [tvm] hypercubestart commented on a change in pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart commented on a change in pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#discussion_r612066255



##########
File path: python/tvm/topi/cuda/conv2d_hwnc_tensorcore.py
##########
@@ -254,13 +253,8 @@ def schedule_hwnc_tensorcore_cuda(cfg, s, Conv):
     vector_as = cfg["vector_as"].val
     vector_ws = cfg["vector_ws"].val
     split_block_k_nums = cfg["split_block_k_nums"].val
-    fuse_pack = cfg["fuse_pack"].val
 
-    if not fuse_pack:
-        s[packed_data].compute_inline()
-    else:
-        with Target("cuda"):
-            schedule_injective_from_existing(s, packed_data)
+    s[packed_data].compute_inline()

Review comment:
       having `fuse` and `compute_at` in the search space causes loop itervars to touch buffers a different number of times across configurations, which breaks AutoTVM's feature extraction, i.e. feature extraction returns features of different sizes
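       A toy illustration of the failure mode (purely hypothetical, not AutoTVM's real extractor): if one knob setting inlines the packing stage while another schedules it separately, the per-(itervar, buffer) features differ in count, so configs from the same template cannot be stacked into the fixed-width matrix a cost model expects:

```python
def touch_features(schedule):
    # Toy stand-in for loop-based feature extraction:
    # one feature per (loop itervar, touched buffer) pair.
    return [1.0 for bufs_in_loop in schedule for _ in bufs_in_loop]

# fuse_pack=False: packed_data is inlined, only the conv buffer is touched.
inlined = touch_features([["conv"], ["conv"]])
# fuse_pack=True: packed_data is scheduled separately, adding a touch.
fused = touch_features([["conv", "packed_data"], ["conv"]])

print(len(inlined), len(fused))  # 2 3 -- different feature sizes in one space
```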







[GitHub] [tvm] jlimmm edited a comment on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

jlimmm edited a comment on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-915113674


   @hypercubestart Hello, I'd like to reproduce your table on T4+int4, but it seems that the current AutoTVM tutorial code [(link)](https://tvm.apache.org/docs/tutorials/autotvm/tune_relay_cuda.html#sphx-glr-tutorials-autotvm-tune-relay-cuda-py) cannot utilize the int4+tensorcore template. It would be appreciated if you could share some sample code. Thank you.





[GitHub] [tvm] Hzfengsy commented on a change in pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

Hzfengsy commented on a change in pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#discussion_r612150174



##########
File path: python/tvm/topi/cuda/conv2d_hwnc_tensorcore.py
##########
@@ -254,13 +253,8 @@ def schedule_hwnc_tensorcore_cuda(cfg, s, Conv):
     vector_as = cfg["vector_as"].val
     vector_ws = cfg["vector_ws"].val
     split_block_k_nums = cfg["split_block_k_nums"].val
-    fuse_pack = cfg["fuse_pack"].val
 
-    if not fuse_pack:
-        s[packed_data].compute_inline()
-    else:
-        with Target("cuda"):
-            schedule_injective_from_existing(s, packed_data)
+    s[packed_data].compute_inline()

Review comment:
       Does this only happen for `int4`, or is it a common problem for both `int8` and `int4`?
   If it is just for `int4`, we should not change it, because it may make `int8`/`float16` slower. Alternatively, we can add a separate file only for the `int4` schedule.







[GitHub] [tvm] Hzfengsy commented on a change in pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

Hzfengsy commented on a change in pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#discussion_r612058941



##########
File path: python/tvm/topi/cuda/conv2d_hwnc_tensorcore.py
##########
@@ -254,13 +253,8 @@ def schedule_hwnc_tensorcore_cuda(cfg, s, Conv):
     vector_as = cfg["vector_as"].val
     vector_ws = cfg["vector_ws"].val
     split_block_k_nums = cfg["split_block_k_nums"].val
-    fuse_pack = cfg["fuse_pack"].val
 
-    if not fuse_pack:
-        s[packed_data].compute_inline()
-    else:
-        with Target("cuda"):
-            schedule_injective_from_existing(s, packed_data)
+    s[packed_data].compute_inline()

Review comment:
       Please explain why the search space is being narrowed.

##########
File path: src/runtime/contrib/random/mt_random_engine.cc
##########
@@ -134,12 +134,16 @@ class RandomEngine {
 
  private:
   void FillData(DLTensor* tensor, int64_t size) {
-    // Make the value be 1.0 - 10.0, not (0.0 - 1.0) so that we could satisfy
+    // Make the value be 17.0 - 30.0, not (0.0 - 1.0) so that we could satisfy
     // quantized dtype (uint8 / int8) data non-empty requirement
-    std::uniform_real_distribution<> dist(1.0, 10.0);
+    // We start from 17.0 because two int4 are packed in a single uint8
+    std::uniform_real_distribution<> dist(17.0, 30.0);

Review comment:
       Why 17.0-30.0?







[GitHub] [tvm] hypercubestart commented on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart commented on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-915627466


   > @hypercubestart Hello, I'd like to reproduce your table on T4+int4, but it seems that the current AutoTVM tutorial code [(link)](https://tvm.apache.org/docs/tutorials/autotvm/tune_relay_cuda.html#sphx-glr-tutorials-autotvm-tune-relay-cuda-py) cannot utilize the int4+tensorcore template. It would be appreciated if you could share some sample code. Thank you.
   
   hi! Unfortunately I don't have the code anymore, but the PR has an example of creating a network consisting of a single int4 conv2d (https://github.com/apache/tvm/blob/f8b1df4d297e19a20914700e7519543f6f3ac233/tests/python/topi/python/test_topi_conv2d_hwnc_tensorcore.py#L149-L167) using the utilities from https://github.com/apache/tvm/pull/6748. You could reuse most of the AutoTVM tutorial code and simply replace the network with the one shown above.
   
   AutoTVM will then be able to automatically infer the int4+tensorcore template.





[GitHub] [tvm] hypercubestart commented on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart commented on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-819249259


   > The code LGTM. But would you like to show some performance results for int4?
   
   yes, I'm testing some combinations of the orthogonal knobs I deleted and will show perf results once performance reaches parity with the results from https://github.com/apache/tvm/pull/6121





[GitHub] [tvm] hypercubestart edited a comment on pull request #7831: [AutoTVM] [TOPI] Support AutoTVM for int4 tensorcore

hypercubestart edited a comment on pull request #7831:
URL: https://github.com/apache/tvm/pull/7831#issuecomment-915627466


   > @hypercubestart Hello, I'd like to reproduce your table on T4+int4, but it seems that the current AutoTVM tutorial code [(link)](https://tvm.apache.org/docs/tutorials/autotvm/tune_relay_cuda.html#sphx-glr-tutorials-autotvm-tune-relay-cuda-py) cannot utilize the int4+tensorcore template. It would be appreciated if you could share some sample code. Thank you.
   
   hi @jlimmm! Unfortunately I don't have the code anymore, but the PR has an example of creating a network consisting of a single int4 conv2d (https://github.com/apache/tvm/blob/f8b1df4d297e19a20914700e7519543f6f3ac233/tests/python/topi/python/test_topi_conv2d_hwnc_tensorcore.py#L149-L167) using the utilities from https://github.com/apache/tvm/pull/6748. You could reuse most of the AutoTVM tutorial code and simply replace the network with the one shown above.
   
   AutoTVM will then be able to automatically infer the int4+tensorcore template.

