Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/03/24 18:19:55 UTC

[GitHub] [tvm] csullivan edited a comment on issue #7730: [Bug] Missing broadcast_to before batch_matmul for CuBLAS

csullivan edited a comment on issue #7730:
URL: https://github.com/apache/tvm/issues/7730#issuecomment-806052775


   Thanks @comaniac @masahi. Yes, the problem is that different targets, and target-specific TOPI implementations, can support different optimizations. When the BLAS libraries supported for a target are used, implicit broadcast is not supported.
   
   One option that comes to mind is to add a shape legalization pass that adds the broadcast if a target has specific attributes (e.g. libs=cublas or libs=rocblas). However, this isn't sufficient: depending on the op strategy priorities or the applied tuning configs, it's possible that the BLAS library implementation won't be used. A better option could be to make use of #7518 and do the shape legalization after the primitive functions have been lowered to TIR and can be inspected.
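
As a rough illustration of the first option, the decision such a legalization pass would have to make can be sketched in plain Python. This is not TVM code: `needs_explicit_broadcast` and `legalize_batch_matmul_shapes` are hypothetical helpers, the target is modelled as a bare set of library names standing in for the `libs` attribute, and only the leading batch dimension is considered.

```python
# Hypothetical sketch, not TVM API: decide whether batch_matmul operands need
# an explicit broadcast when an external BLAS library may be used.

# Libraries assumed (per the discussion) not to support implicit broadcast.
BLAS_LIBS = {"cublas", "rocblas"}


def needs_explicit_broadcast(x_shape, y_shape, target_libs):
    """Return True if the batch dims differ and the target lists a BLAS library."""
    uses_blas = bool(BLAS_LIBS & set(target_libs))
    return uses_blas and x_shape[0] != y_shape[0]


def legalize_batch_matmul_shapes(x_shape, y_shape, target_libs):
    """Return the operand shapes after broadcasting the batch dimension."""
    if not needs_explicit_broadcast(x_shape, y_shape, target_libs):
        return tuple(x_shape), tuple(y_shape)
    if x_shape[0] != 1 and y_shape[0] != 1:
        raise ValueError("batch dimensions are not broadcastable")
    batch = max(x_shape[0], y_shape[0])
    return (batch,) + tuple(x_shape[1:]), (batch,) + tuple(y_shape[1:])


# Example: a batch-1 weight is expanded only when a BLAS library is listed.
with_blas = legalize_batch_matmul_shapes((8, 128, 64), (1, 256, 64), {"cublas"})
without_blas = legalize_batch_matmul_shapes((8, 128, 64), (1, 256, 64), set())
```

Note that the check relies only on the target attributes, which is exactly the weakness described above: it cannot tell whether the BLAS implementation is the one actually selected.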
   
   We could also disable implicit broadcast, but that can increase memory use (the explicit broadcasts of constants get constant-folded into full-size tensors), which we've seen overflow device memory for larger batch sizes.
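
To give a sense of scale for the memory concern, here is a back-of-the-envelope calculation. The shapes and dtype are made up for illustration and are not taken from the model in this issue; `tensor_bytes` is a hypothetical helper.

```python
# Illustrative arithmetic only: size of a constant weight before and after its
# batch dimension is materialized by a folded broadcast.

def tensor_bytes(shape, dtype_bytes=4):
    """Number of bytes for a dense tensor of the given shape (float32 by default)."""
    total = dtype_bytes
    for dim in shape:
        total *= dim
    return total


MIB = 1024 ** 2

# Assumed example: a (1, 4096, 4096) float32 weight and a batch size of 64.
implicit = tensor_bytes((1, 4096, 4096))    # weight kept with batch dim 1
folded = tensor_bytes((64, 4096, 4096))     # weight after a folded broadcast

implicit_mib = implicit // MIB
folded_mib = folded // MIB
```

Under these assumed shapes the weight grows from 64 MiB to 4096 MiB, and the cost scales linearly with the batch size.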

