Posted to commits@tvm.apache.org by "LeiWang1999 (via GitHub)" <gi...@apache.org> on 2023/08/03 05:59:15 UTC

[GitHub] [tvm] LeiWang1999 commented on pull request #15462: [Relax] CuDNN Fallback Support through BYOC.

LeiWang1999 commented on PR #15462:
URL: https://github.com/apache/tvm/pull/15462#issuecomment-1663336808

   > Thank you, @LeiWang1999 for bringing cudnn backend :) Regarding find_cudnn_best_algo, there are existing API for both Python and C++ sides that you can reuse. https://github.com/apache/tvm/blob/unity/python/tvm/contrib/cudnn.py#L367 https://github.com/apache/tvm/blob/unity/src/runtime/contrib/cudnn/conv_forward.cc#L212
   > 
   > By the way, do you also plan to work on attention operators by chance? https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnMultiHeadAttnForward
   
   Thanks, I understand that there are existing APIs for finding the best algorithm, but my confusion lies in determining when to invoke the find_best_algo function. Certain inference frameworks use a static flag to enable algorithm discovery during the warmup phase, but I don't know whether TVM can enable this at runtime.
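
   The warmup-flag pattern mentioned above could be sketched roughly as follows. This is a hypothetical illustration only — the names (`find_best_algo`, `conv2d_cudnn`, the cache) are stand-ins and not real TVM or cuDNN APIs:

   ```python
   # Hypothetical sketch of warmup-phase algorithm discovery.
   # find_best_algo here is a stand-in for an expensive search such as
   # cudnnFindConvolutionForwardAlgorithm; nothing below is a real TVM API.

   _algo_cache = {}  # workload key -> algorithm selected during warmup


   def find_best_algo(workload_key):
       # Placeholder for the expensive per-workload algorithm search.
       return "IMPLICIT_PRECOMP_GEMM"


   def conv2d_cudnn(workload_key, warmup=False):
       # During warmup (or on a cache miss), run the search once and cache
       # the result; subsequent calls reuse the cached algorithm.
       if warmup or workload_key not in _algo_cache:
           _algo_cache[workload_key] = find_best_algo(workload_key)
       return _algo_cache[workload_key]
   ```

   The open question in the comment is whether TVM's runtime exposes a hook (analogous to the `warmup=True` flag above) at which this search could be triggered.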
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org