Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/02/11 08:40:03 UTC

[GitHub] [tvm] MasterJH5574 commented on pull request #10207: Support sub warp reduction for CUDA target.

MasterJH5574 commented on pull request #10207:
URL: https://github.com/apache/tvm/pull/10207#issuecomment-1035982910


   Interesting. It looks like the perf improvement isn't very large? The shuffle-down implementation beats the shared memory implementation only when `n = 4` 🤔
   
   > Another thing worth noting is, we can only allow cross warp reduction by shuffle-down, thus warp size must be a multiple of blockDim.x when blockDim.y * blockDim.z != 1.
   
   BTW do we have this requirement in the codebase now?
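
To make the quoted constraint concrete, here is a minimal host-side sketch in Python. It is not TVM's code generator and not the PR's implementation; `rows_fit_in_warps` and `sub_warp_reduce` are hypothetical names chosen for illustration. It assumes the usual CUDA thread linearization (`threadIdx.x` varies fastest), a warp size of 32, and that each reduction group is one row of `blockDim.x` consecutive threads.

```python
WARP_SIZE = 32  # assumption: warp size on current NVIDIA GPUs


def rows_fit_in_warps(block_dim_x, num_rows, warp_size=WARP_SIZE):
    """Return True if every row of `block_dim_x` consecutive threads
    lies entirely inside one warp (num_rows = blockDim.y * blockDim.z)."""
    for row in range(num_rows):
        first = row * block_dim_x          # linear id of the row's first thread
        last = first + block_dim_x - 1     # linear id of the row's last thread
        if first // warp_size != last // warp_size:
            return False                   # row straddles a warp boundary
    return True


def sub_warp_reduce(values, width):
    """Simulate a shuffle-down sum over groups of `width` lanes.

    `width` must be a power of two and len(values) a multiple of `width`.
    Mirrors the behavior of a width-limited shuffle-down: a lane whose
    source would fall outside its group keeps its own value.
    """
    lanes = list(values)
    offset = width // 2
    while offset > 0:
        shifted = [
            lanes[i + offset] if (i % width) + offset < width else lanes[i]
            for i in range(len(lanes))
        ]
        lanes = [a + b for a, b in zip(lanes, shifted)]
        offset //= 2
    # The first lane of each group holds that group's sum.
    return [lanes[i] for i in range(0, len(lanes), width)]


if __name__ == "__main__":
    print(rows_fit_in_warps(4, 16))   # blockDim.x = 4 divides 32
    print(rows_fit_in_warps(3, 16))   # blockDim.x = 3 does not
    print(sub_warp_reduce(list(range(8)), 4))
```

With `blockDim.x = 3` and 16 rows, the row covering linear thread ids 30 to 32 spans two warps, so a single-warp shuffle cannot reduce it. That is the situation the quoted requirement rules out.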

