Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/11/09 20:32:25 UTC

[GitHub] [tvm] masahi opened a new pull request, #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

masahi opened a new pull request, #13334:
URL: https://github.com/apache/tvm/pull/13334

   1. Workloads from quantized models often have a trivial block that only produces a constant scalar:
   ```
   with T.block("compile_engine_const"):
       vi = T.axis.spatial(1, 0)
       T.reads()
       T.writes(compile_engine_const[()])
       compile_engine_const[()] = 59
   ```
   This can be inlined by the existing `AutoInline` rule. However, depending on the order in which `AutoInline` processes spatial blocks, these "compile_engine_const" blocks can get in the way of `ReverseComputeInline` on other blocks, since the constant blocks are also counted as producer blocks. `PostOrderApply` currently processes the constant blocks at the very end, so `ReverseComputeInline` on blocks that consume such constants always fails to inline. In practice, this means we are not generating a fused kernel for quantized conv2d today.
   
   I added a simple inlining rule that inlines only such constant blocks. This rule is meant to run before `AutoInline`, to unblock `ReverseComputeInline`. This lets us generate a fused kernel. On the int8 resnet50 model from PyTorch, the e2e perf improved from 6.8 to 5.2 msec, using batch size 16 and the same number of trials.
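The ordering problem described above can be illustrated with a toy model in plain Python. This is not TVM code: `Block`, `can_reverse_inline`, `inline_constants`, and the block name `requantize` are hypothetical, and the condition that a consumer needs exactly one remaining producer is a simplifying assumption based on the description above, not a statement of what `ReverseComputeInline` actually checks.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy stand-in for a TIR block (hypothetical, not a TVM class)."""
    name: str
    producers: list = field(default_factory=list)  # names of blocks this block reads from
    is_constant: bool = False  # True for blocks that only write a scalar constant


def can_reverse_inline(block, live):
    """Assumed rule: reverse inlining works only if exactly one producer is still live."""
    return len([p for p in block.producers if p in live]) == 1


def inline_constants(blocks):
    """Return the names of blocks that remain once constant blocks are inlined away."""
    return {b.name for b in blocks if not b.is_constant}


blocks = [
    Block("conv2d"),
    Block("compile_engine_const", is_constant=True),
    Block("requantize", producers=["conv2d", "compile_engine_const"]),
]
requantize = blocks[2]

# Before the constant block is inlined, it still counts as a second producer.
all_live = {b.name for b in blocks}
blocked = can_reverse_inline(requantize, all_live)

# After inlining constants first, only conv2d remains as a producer.
after_const_inline = inline_constants(blocks)
unblocked = can_reverse_inline(requantize, after_const_inline)

print(blocked, unblocked)
```

In this model, running the constant-inlining step first is what turns the consumer into a candidate for reverse inlining, which mirrors the motivation for placing the new rule before `AutoInline`.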


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] masahi commented on pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

Posted by GitBox <gi...@apache.org>.
masahi commented on PR #13334:
URL: https://github.com/apache/tvm/pull/13334#issuecomment-1309846957

   Removed the identification of constant blocks by name, and replaced it with a more robust method based on the block structure.
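As a rough sketch of what a structure-based check could look like, the toy predicate below ignores the block name entirely and looks only at what the block reads, writes, and stores. `ToyBlock` and `is_constant_block` are hypothetical names in plain Python, and the three conditions are assumptions drawn from the example block in the PR description (no reads, one write, a literal stored value), not the actual checks in the merged code.

```python
from dataclasses import dataclass
from numbers import Number


@dataclass
class ToyBlock:
    """Toy stand-in for a TIR block (hypothetical, not a TVM class)."""
    name: str
    reads: tuple         # buffers the block reads
    writes: tuple        # buffers the block writes
    stored_value: object # value written by the block's single store


def is_constant_block(block):
    """Assumed structural test: no reads, one write, and a literal number stored."""
    return (
        len(block.reads) == 0
        and len(block.writes) == 1
        and isinstance(block.stored_value, Number)
    )


# Same structure as the "compile_engine_const" example, under a different name.
renamed_const = ToyBlock("some_other_name", reads=(), writes=("c",), stored_value=59)

# A block that reads another buffer is not a constant block, whatever it is called.
misnamed = ToyBlock("compile_engine_const", reads=("A",), writes=("B",), stored_value="A[i] + 1")

print(is_constant_block(renamed_const), is_constant_block(misnamed))
```

The point of the design is that the check keeps working if the frontend changes the name it gives to constant blocks, or if a different frontend produces them.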




[GitHub] [tvm] masahi commented on pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

Posted by GitBox <gi...@apache.org>.
masahi commented on PR #13334:
URL: https://github.com/apache/tvm/pull/13334#issuecomment-1309542522

   @junrushao I realized that an easier way would be to check the content of the block to determine if it is a constant block, rather than relying on the block name. 




[GitHub] [tvm] tvm-bot commented on pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

Posted by GitBox <gi...@apache.org>.
tvm-bot commented on PR #13334:
URL: https://github.com/apache/tvm/pull/13334#issuecomment-1309342351

   <!---bot-comment-->
   
   Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from [Reviewers](https://github.com/apache/incubator-tvm/blob/master/CONTRIBUTORS.md#reviewers) by @-ing them in a comment.
   
   <!--bot-comment-ccs-start-->
    * cc @Hzfengsy, @elvin-n, @junrushao <sub>See [#10317](https://github.com/apache/tvm/issues/10317) for details</sub><!--bot-comment-ccs-end-->
   
   <sub>Generated by [tvm-bot](https://github.com/apache/tvm/blob/main/ci/README.md#github-actions)</sub>




[GitHub] [tvm] masahi merged pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

Posted by GitBox <gi...@apache.org>.
masahi merged PR #13334:
URL: https://github.com/apache/tvm/pull/13334




[GitHub] [tvm] masahi closed pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

Posted by GitBox <gi...@apache.org>.
masahi closed pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload
URL: https://github.com/apache/tvm/pull/13334




[GitHub] [tvm] masahi commented on pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

Posted by GitBox <gi...@apache.org>.
masahi commented on PR #13334:
URL: https://github.com/apache/tvm/pull/13334#issuecomment-1310908985

   cc @vinx13 @junrushao please take a look.




[GitHub] [tvm] junrushao commented on pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

Posted by GitBox <gi...@apache.org>.
junrushao commented on PR #13334:
URL: https://github.com/apache/tvm/pull/13334#issuecomment-1309373300

   Hey thanks for the contribution!
   
   I was a bit uncertain if we really want to do name checking to determine constants from the compile engine, because it relies on the assumption that Relay exists and that Relay always uses `compile_engine_const` as the name of the constant it introduces, which could be fragile in certain cases.
   
   There is an alternative I could come up with, and please let me know if it makes sense:
   
   Add a `schedule_rule` attribute here (https://github.com/apache/tvm/blob/fbe174bd6c3054ec480c9551610030bdf2d8b64d/src/relay/backend/te_compiler_cache.cc#L275), which will guide TIR to generate the annotation below:
   
   ```python
   T.block_attr({"schedule_rule": "compute_inline"})
   ```
   
   Then register a PackedFunc `meta_schedule.generic.compute_inline` to apply `compute_inline` as part of the custom schedule rule.
   
   Let me know if it makes sense!
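The dispatch idea in this suggestion can be sketched with a plain Python registry. Nothing here calls TVM: `RULES`, `register_rule`, and `apply_custom_rule` are hypothetical, blocks are modelled as dictionaries, and the `meta_schedule.<target key>.<rule name>` key format is an assumption inferred from the function name given in the comment.

```python
# Hypothetical registry standing in for a global function table.
RULES = {}


def register_rule(name):
    """Decorator that stores a rule function under the given global name."""
    def decorator(fn):
        RULES[name] = fn
        return fn
    return decorator


@register_rule("meta_schedule.generic.compute_inline")
def compute_inline_rule(block):
    """Toy rule: report the transformation instead of performing it."""
    return f"compute_inline({block['name']})"


def apply_custom_rule(block, target_key="generic"):
    """Look up the rule named by the block's schedule_rule attribute, if any."""
    rule_name = block.get("attrs", {}).get("schedule_rule")
    if rule_name is None:
        return None  # no annotation: leave the block to the default rules
    return RULES[f"meta_schedule.{target_key}.{rule_name}"](block)


annotated = {"name": "compile_engine_const", "attrs": {"schedule_rule": "compute_inline"}}
plain = {"name": "conv2d", "attrs": {}}

print(apply_custom_rule(annotated))
print(apply_custom_rule(plain))
```

Compared with name matching, the annotation makes the intent explicit at the point where the constant is created, so the scheduler does not need to guess which blocks came from the compile engine.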




[GitHub] [tvm] masahi commented on pull request #13334: [MetaSchedule] Improve inlining and `VerifyGPUCode` for quantized model workload

Posted by GitBox <gi...@apache.org>.
masahi commented on PR #13334:
URL: https://github.com/apache/tvm/pull/13334#issuecomment-1309381919

   @junrushao I like your idea, I'll rework this.

