Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/06/04 20:52:52 UTC

[GitHub] [incubator-tvm] t-vi edited a comment on pull request #5727: ROCm warp shuffles and reductions

t-vi edited a comment on pull request #5727:
URL: https://github.com/apache/incubator-tvm/pull/5727#issuecomment-639109441


   That's the idea, yes. In my microbenchmark of the ImageNet softmax on the Radeon VII, the runtime drops from ~140µs to ~14µs. The baseline from PyTorch (a handcrafted but somewhat generic kernel) is ~18µs, so this is going well. :slightly_smiling_face:
   Of course, the topi work is entirely @wpan11nv's. I'm quite happy I managed to enable warp reductions on ROCm, though.
   Also, the speedup does not come from the warp reductions alone; the previous softmax in topi was quite unoptimized.
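
For readers unfamiliar with the term, a warp reduction combines the values held by the lanes of one warp through a sequence of shuffle steps, without going through shared memory. The sketch below only simulates that pattern in plain Python; it is not the code from this pull request or from topi. The names `shuffle_down`, `warp_reduce` and `warp_softmax` are made up for illustration, and the width of 64 lanes is an assumption matching the wavefront size of GCN GPUs such as the Radeon VII (CUDA warps have 32 lanes).

```python
import math

WARP_SIZE = 64  # assumption: GCN wavefront width; a CUDA warp would be 32


def shuffle_down(values, offset):
    """Simulated shuffle: lane i reads the value of lane i + offset.

    Lanes whose source would be out of range keep their own value.
    """
    n = len(values)
    return [values[i + offset] if i + offset < n else values[i] for i in range(n)]


def warp_reduce(values, op):
    """Tree reduction over one warp; the lane count must be a power of two.

    Only lane 0 is guaranteed to hold the full result, so that is returned.
    """
    lanes = list(values)
    offset = len(lanes) // 2
    while offset > 0:
        shifted = shuffle_down(lanes, offset)
        lanes = [op(a, b) for a, b in zip(lanes, shifted)]
        offset //= 2
    return lanes[0]


def warp_softmax(row):
    """Softmax of one row, with each simulated lane handling a strided slice."""
    # Each lane computes a partial max, then the warp reduces them.
    partial_max = [max(row[lane::WARP_SIZE], default=-math.inf)
                   for lane in range(WARP_SIZE)]
    row_max = warp_reduce(partial_max, max)

    # Same pattern for the sum of the shifted exponentials.
    exps = [math.exp(x - row_max) for x in row]
    partial_sum = [sum(exps[lane::WARP_SIZE]) for lane in range(WARP_SIZE)]
    total = warp_reduce(partial_sum, lambda a, b: a + b)

    return [e / total for e in exps]
```

With 64 lanes the reduction takes six shuffle steps (offsets 32, 16, 8, 4, 2, 1), and a softmax row needs two such reductions: one for the maximum and one for the sum.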

