Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/10/13 08:34:59 UTC

[GitHub] [tvm] comaniac commented on pull request #9261: [BYOC] CUTLASS integration

comaniac commented on pull request #9261:
URL: https://github.com/apache/tvm/pull/9261#issuecomment-942064270


   IMHO, CUTLASS doesn't naturally benefit dynamic workloads, for exactly the reason you mentioned. We use CUTLASS internally for training, and it works well because we JIT kernels at runtime, once the shapes are known.
   
   For CUTLASS with BYOC in TVM for inference, my impression is that we can leverage high-performance kernel templates while 1) keeping the binary self-contained, 2) fusing ops, and 3) keeping tuning lightweight (e.g., ~10 trials, similar to cuDNN). Dynamic workloads are still challenging, though; hopefully our ongoing work on dynamic kernel tuning and generation will land soon and make that possible.
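
   To make the "lightweight tuning" point concrete, here is a rough sketch of what I mean: try a small, fixed set of kernel configurations for a known shape, keep the fastest, and cache the choice per shape. This is not TVM or CUTLASS API. `LightweightTuner`, `toy_cost`, and the tile candidates are hypothetical names for illustration, and `toy_cost` is a made-up stand-in for real on-device timing.

```python
from math import ceil
from typing import Callable, Dict, Sequence, Tuple

Shape = Tuple[int, int, int]  # GEMM problem size (M, N, K)
Tile = Tuple[int, int]        # hypothetical threadblock tile (tile_m, tile_n)


def toy_cost(tile: Tile, shape: Shape) -> float:
    """Toy stand-in for a measured runtime (assumption, not a real profiler).

    Charges for padded work plus a fixed overhead per tile.
    """
    tile_m, tile_n = tile
    m, n, k = shape
    tiles_m, tiles_n = ceil(m / tile_m), ceil(n / tile_n)
    padded_work = (tiles_m * tile_m) * (tiles_n * tile_n) * k
    return padded_work + 1000 * tiles_m * tiles_n


class LightweightTuner:
    """Pick the cheapest of a few candidates per shape and cache the result."""

    def __init__(
        self,
        candidates: Sequence[Tile],
        measure: Callable[[Tile, Shape], float],
        max_trials: int = 10,
    ) -> None:
        # Cap the search at a handful of trials to keep tuning cheap.
        self.candidates = list(candidates)[:max_trials]
        self.measure = measure
        self.cache: Dict[Shape, Tile] = {}
        self.trials_run = 0

    def best_for(self, shape: Shape) -> Tile:
        # A shape seen before costs nothing: reuse the cached choice.
        if shape in self.cache:
            return self.cache[shape]
        best, best_cost = None, float("inf")
        for cand in self.candidates:
            cost = self.measure(cand, shape)
            self.trials_run += 1
            if cost < best_cost:
                best, best_cost = cand, cost
        self.cache[shape] = best
        return best
```

   Note that the cache is keyed on the exact shape, which is why this only helps when shapes are known. A dynamic workload would trigger a fresh round of trials for every new shape.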

