Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/09/02 17:22:06 UTC

[GitHub] [tvm] echuraev commented on pull request #8636: [OpenCL] Add vectorization to cuda conv2d_nhwc schedule

echuraev commented on pull request #8636:
URL: https://github.com/apache/tvm/pull/8636#issuecomment-911900705


   @mbrookhart I wasn't able to reproduce the regression on an NVIDIA 3070. One possible solution is to create a separate `conv2d_nhwc` schedule for OpenCL. However, my fix for the OpenCL issue with `global_work_size` also works with the default schedule on CUDA: https://github.com/apache/tvm/pull/8636/files#diff-05fdfdcbc0bdf86e1df35950ae34877c2f9dbddab6a99ca630582547d4e7e0faL88-L89
   
   |                            | Results on NVIDIA 3090 (1024 trials per kernel) | Results on NVIDIA 3070 Mobile (512 trials per kernel) |
   |----------------------------|-------------------------------------------------|-------------------------------------------------------|
   | Main (AutoTVM)             | 0.17 ms                                         | 0.4504 ms                                             |
   | Main (Ansor)               | 0.18 ms                                         | 0.4284 ms                                             |
   | Code from the PR (AutoTVM) | 0.26 ms                                         | 0.4200 ms                                             |
   | Code from the PR (Ansor)   | 0.19 ms                                         | 0.4200 ms                                             |
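
To make the table easier to compare, here is a minimal sketch that computes the relative change of the PR against main for each GPU and tuner. The latencies are copied from the table above; the names `results_ms` and `relative_change` are hypothetical helpers for illustration and are not part of TVM.

```python
# Latencies in milliseconds, copied from the table above as (main, PR) pairs.
results_ms = {
    "3090": {"AutoTVM": (0.17, 0.26), "Ansor": (0.18, 0.19)},
    "3070 Mobile": {"AutoTVM": (0.4504, 0.4200), "Ansor": (0.4284, 0.4200)},
}


def relative_change(main_ms, pr_ms):
    """Percent change of the PR latency relative to main (positive means slower)."""
    return (pr_ms - main_ms) / main_ms * 100.0


# Collect the percent change for every (GPU, tuner) pair.
changes = {
    (gpu, tuner): relative_change(main_ms, pr_ms)
    for gpu, tuners in results_ms.items()
    for tuner, (main_ms, pr_ms) in tuners.items()
}

for (gpu, tuner), pct in changes.items():
    print(f"{gpu:12s} {tuner:8s} {pct:+.1f}%")
```

A positive value means the PR is slower than main for that configuration, a negative value means it is faster. Note that the two GPUs were tuned with different trial counts, so the columns are not directly comparable with each other.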


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org