Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/02/08 20:38:56 UTC

[GitHub] [tvm] masahi commented on pull request #10185: [CUTLASS] Add parallel split-k support to wgrad

masahi commented on pull request #10185:
URL: https://github.com/apache/tvm/pull/10185#issuecomment-1033040112


   Hi Manish,
   
   > Can you share the size that has accuracy issues. Can you repro the accuracy issue in profiler? 
   
   The benchmark result I linked above shows the accuracy differences in the last two columns. Most workloads show some difference, except for a few deeper layers at batch = 8, which match exactly. Deeper layers, those with small spatial sizes and many channels, generally have fewer accuracy problems, and the differences become much bigger at batch = 256. So it kind of works, but not quite, and it is very hard to debug. The CUTLASS profiler doesn't report any accuracy problem, which is another mystery. It could be that TVM's use of cuDNN wgrad has some issues.
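
A rough way to see whether the reduction length alone could produce such a trend, independent of TVM, cuDNN, or CUTLASS, is a small NumPy model of parallel split-k. In the implicit GEMM view of wgrad, the reduction dimension is N * P * Q (batch times output height times output width), so it grows with batch size and shrinks for deeper layers with small spatial size. The sketch below is only an illustration under assumptions: `split_k_matmul` and `max_abs_diff` are hypothetical helper names, and the fp16 inputs, fp32 accumulation, and fp16 storage of the partial results are assumed settings, not necessarily those used in the benchmark.

```python
import numpy as np


def split_k_matmul(a, b, num_slices, partial_dtype=np.float16):
    """Compute a @ b by splitting the reduction axis into num_slices chunks.

    Each chunk is accumulated in float32, stored in partial_dtype (standing in
    for the split-k workspace), and the partial results are then reduced in
    float32.
    """
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    # Slice boundaries; k does not have to be divisible by num_slices.
    bounds = np.linspace(0, k, num_slices + 1).astype(int)
    partials = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        part = a[:, lo:hi].astype(np.float32) @ b[lo:hi, :].astype(np.float32)
        partials.append(part.astype(partial_dtype))
    # Final reduction over the slices (the "parallel reduction" step).
    return np.sum(np.stack(partials), axis=0, dtype=np.float32)


def max_abs_diff(x, ref):
    """Largest elementwise absolute difference, measured in float64."""
    return float(np.max(np.abs(x.astype(np.float64) - ref.astype(np.float64))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    out_channels, crs = 32, 32  # GEMM M and N, kept small (assumed sizes)
    for batch in (8, 256):
        k = batch * 14 * 14  # reduction length N * P * Q for a 14x14 output
        a = rng.standard_normal((out_channels, k)).astype(np.float16)
        b = rng.standard_normal((k, crs)).astype(np.float16)
        ref = a.astype(np.float64) @ b.astype(np.float64)
        for slices in (1, 4, 16):
            diff = max_abs_diff(split_k_matmul(a, b, slices), ref)
            print(f"batch={batch} slices={slices} max_abs_diff={diff:.4g}")
```

If the error in this model grows with the reduction length, that would be consistent with the larger differences at batch = 256, but it would not by itself explain why the CUTLASS profiler reports no problem.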
   
   > Both cuDNN and CUTLASS offers similar get_workspace_size(...) API. Thus, I believe this part should be similar.
   
   The issue is memory reuse across multiple calls. The ways we integrate cuDNN and CUTLASS are significantly different. I tried to apply a memory management strategy similar to the one we use for cuDNN to the JIT-generated CUTLASS code, but as I said above, I'm running into strange issues.
   
   > As we discussed in an another thread, we run sweeps to find the best split. You can cut down the sweep on k by using a simple analytic model.
   
   Yes, I haven't grokked your note in that thread. I just tried a dumb strategy in my benchmark and it already shows good performance. I didn't pursue perf improvement further, since the accuracy problem was more concerning.
   
   

