Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/12/24 07:29:40 UTC

[GitHub] [tvm] Meteorix commented on pull request #7146: [CUDA]batch_matmul tensorcore schedule

Meteorix commented on pull request #7146:
URL: https://github.com/apache/tvm/pull/7146#issuecomment-750787545


   > @Meteorix out of curiosity can you share some of your benchmarking results? I'd love to know how much faster this performs than cublas.
   
   @jwfromm here are some of the benchmark results (with 1000 tuning trials). This schedule beats cuBLAS on some shapes, which is also why I made `batch_matmul_cublas` autotunable in this PR.
   
   ```
   Shape: [1, 64, 1024] [1, 4096, 1024]
   batch_matmul_tensorcore.cuda   2.9238894640234948e-05
   batch_matmul_cublas.cuda       2.7487557097865394e-05 
   batch_matmul.cuda              0.00014189747117647058
   
   Shape: [1, 64, 1024] [1, 1024, 1024]
   batch_matmul_tensorcore.cuda   1.5578384301061096e-05 
   batch_matmul_cublas.cuda       2.041829239101948e-05
   batch_matmul.cuda              6.108717968157696e-05
   
   Shape: [1, 128, 1024] [1, 4096, 1024]
   batch_matmul_tensorcore.cuda   0.00011345079327976625
   batch_matmul_cublas.cuda       0.00011074180193236715 
   batch_matmul.cuda              0.00024510443407707913
   
   Shape: [1, 128, 4096] [1, 1024, 4096]
   batch_matmul_tensorcore.cuda   0.00017083510384959715
   batch_matmul_cublas.cuda       0.00010608833085714285 
   batch_matmul.cuda              0.00035638234315169367
   
   Shape: [16, 128, 64] [16, 128, 64]
   batch_matmul_tensorcore.cuda   4.134768131265665e-06
   batch_matmul_cublas.cuda       6.046038943091678e-06
   batch_matmul.cuda              1.2430305571941866e-05
   
   Shape: [16, 128, 128] [16, 64, 128]
   batch_matmul_tensorcore.cuda   4.74178964860194e-06 
   batch_matmul_cublas.cuda       9.463372359711623e-06
   batch_matmul.cuda              1.4179731404708587e-05
   
   Shape: [1, 128, 1024] [1, 1024, 1024]
   batch_matmul_tensorcore.cuda   3.857668104222821e-05
   batch_matmul_cublas.cuda       2.3704257450575394e-05 
   batch_matmul.cuda              0.0002515613367983368
   ```
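
To make the comparison with cuBLAS easier to read, the timings above can be turned into a ratio per shape pair. The following is a minimal sketch, not part of the PR: the `RESULTS` dictionary and the `speedup_over_cublas` helper are hypothetical names introduced here, and the numbers are copied verbatim from the results above. The post does not state the time unit, but the ratio does not depend on it.

```python
# Timings copied from the benchmark listing above, as
# (batch_matmul_tensorcore.cuda, batch_matmul_cublas.cuda) per shape pair.
# The unit is not stated in the post; the ratio is unit-free either way.
RESULTS = {
    "[1, 64, 1024] x [1, 4096, 1024]": (2.9238894640234948e-05, 2.7487557097865394e-05),
    "[1, 64, 1024] x [1, 1024, 1024]": (1.5578384301061096e-05, 2.041829239101948e-05),
    "[1, 128, 1024] x [1, 4096, 1024]": (0.00011345079327976625, 0.00011074180193236715),
    "[1, 128, 4096] x [1, 1024, 4096]": (0.00017083510384959715, 0.00010608833085714285),
    "[16, 128, 64] x [16, 128, 64]": (4.134768131265665e-06, 6.046038943091678e-06),
    "[16, 128, 128] x [16, 64, 128]": (4.74178964860194e-06, 9.463372359711623e-06),
    "[1, 128, 1024] x [1, 1024, 1024]": (3.857668104222821e-05, 2.3704257450575394e-05),
}


def speedup_over_cublas(results):
    """Return {shape: cublas_time / tensorcore_time}.

    A value above 1 means the tensorcore schedule was faster than cuBLAS.
    """
    return {shape: cublas / tc for shape, (tc, cublas) in results.items()}


if __name__ == "__main__":
    for shape, ratio in speedup_over_cublas(RESULTS).items():
        winner = "tensorcore" if ratio > 1 else "cublas"
        print(f"{shape:36s} {ratio:5.2f}x  ({winner} faster)")
```

By this ratio the tensorcore schedule comes out ahead on three of the seven shape pairs listed, which matches the "some shapes" wording above.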

