
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/05/21 16:31:59 UTC

[GitHub] [tvm] areusch opened a new pull request #8108: Fix test_cublas and relax tolerance

areusch opened a new pull request #8108:
URL: https://github.com/apache/tvm/pull/8108


    * Allows test to pass on ci-gpu with new 18.04 image.
   
   @tkonolige @tqchen 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] Hzfengsy commented on pull request #8108: Fix test_cublas and relax tolerance

Posted by GitBox <gi...@apache.org>.

Hzfengsy commented on pull request #8108:
URL: https://github.com/apache/tvm/pull/8108#issuecomment-847452388


   I also prefer to keep the accuracy. As @comaniac said, an accuracy of 1e-2 is too low for larger end-to-end workloads.



[GitHub] [tvm] comaniac commented on pull request #8108: Fix test_cublas and relax tolerance

Posted by GitBox <gi...@apache.org>.

comaniac commented on pull request #8108:
URL: https://github.com/apache/tvm/pull/8108#issuecomment-846302151


   FYI: It's possible that this error is caused by the CUBLAS math flag: https://github.com/apache/tvm/blob/813136401a11a49d6c15e6013c34dd822a5c4ff6/src/runtime/contrib/cublas/cublas.cc#L38
   
   According to the CUBLAS documentation, this flag is being deprecated. We also ran some tests around this flag previously and found that it takes effect even for float32, meaning that the CUBLAS kernel internally casts float32 to float16, does the computation, and casts the results back. As a result, this flag may introduce accuracy issues.
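A minimal sketch of the effect described above, assuming NumPy as a stand-in: it does not call CUBLAS, it only emulates rounding float32 inputs to float16 before a batched matmul and compares the error against a float64 reference. The helper names (`max_abs_error`, `emulate_half_input_matmul`) and the shapes are hypothetical, chosen for illustration.

```python
import numpy as np


def max_abs_error(result, reference):
    """Largest absolute difference between a result and a float64 reference."""
    return float(np.max(np.abs(result.astype(np.float64) - reference)))


def emulate_half_input_matmul(a, b):
    """Round float32 inputs to float16, then multiply in float32.

    This only imitates the loss of input precision; it is not the CUBLAS kernel.
    """
    a16 = a.astype(np.float16).astype(np.float32)
    b16 = b.astype(np.float16).astype(np.float32)
    return np.matmul(a16, b16)


rng = np.random.default_rng(0)
# Hypothetical batch_matmul shapes: (batch, M, K) x (batch, K, N).
a = rng.random((2, 64, 64), dtype=np.float32)
b = rng.random((2, 64, 64), dtype=np.float32)

reference = np.matmul(a.astype(np.float64), b.astype(np.float64))
fp32_err = max_abs_error(np.matmul(a, b), reference)
half_err = max_abs_error(emulate_half_input_matmul(a, b), reference)

print(f"float32 matmul max abs error:      {fp32_err:.3e}")
print(f"float16-rounded inputs max abs error: {half_err:.3e}")
```

The exact numbers depend on the inputs and on the hardware path, so the gap printed here should be read as an indication of the mechanism rather than a measurement of the CI failure.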
   



[GitHub] [tvm] areusch closed pull request #8108: Fix test_cublas and relax tolerance

Posted by GitBox <gi...@apache.org>.

areusch closed pull request #8108:
URL: https://github.com/apache/tvm/pull/8108


   



[GitHub] [tvm] comaniac commented on pull request #8108: Fix test_cublas and relax tolerance

Posted by GitBox <gi...@apache.org>.

comaniac commented on pull request #8108:
URL: https://github.com/apache/tvm/pull/8108#issuecomment-847271637


   > @comaniac Disabling the flag makes the tests pass. What should we do here? Accept lower accuracy for performance?
   
   I personally prefer to keep the accuracy, because it does not seem right to tolerate 1e-2 for a single batch_matmul op. It means the end-to-end error of all models using cublas.batch_matmul may be larger than 1e-2. cc @Hzfengsy @Laurawly, as they added this flag before it was deprecated.
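To make the tolerance concern concrete, here is a small sketch using NumPy's `np.testing.assert_allclose` directly (not the TVM test itself). The `passes` helper and the 0.5% error are hypothetical values picked for illustration.

```python
import numpy as np


def passes(actual, desired, rtol):
    """Return True if assert_allclose accepts the result at the given rtol."""
    try:
        np.testing.assert_allclose(actual, desired, rtol=rtol)
        return True
    except AssertionError:
        return False


desired = np.full((4, 4), 16.0, dtype=np.float32)
# Hypothetical result that is off by 0.5% everywhere.
actual = desired * np.float32(1.005)

loose = passes(actual, desired, rtol=1e-2)
strict = passes(actual, desired, rtol=1e-5)
print(f"rtol=1e-2 passes: {loose}, rtol=1e-5 passes: {strict}")
```

A check at `rtol=1e-2` accepts a per-op error that a stricter tolerance would flag, which is why relaxing it for one operator can hide errors that then compound across a full model.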



[GitHub] [tvm] areusch commented on pull request #8108: Fix test_cublas and relax tolerance

Posted by GitBox <gi...@apache.org>.

areusch commented on pull request #8108:
URL: https://github.com/apache/tvm/pull/8108#issuecomment-849983894


   Superseded by #8130.



[GitHub] [tvm] areusch commented on pull request #8108: Fix test_cublas and relax tolerance

Posted by GitBox <gi...@apache.org>.

areusch commented on pull request #8108:
URL: https://github.com/apache/tvm/pull/8108#issuecomment-847250468


   @comaniac I was sending this through CI for @tkonolige as he was out on Friday. I'll let him reply to your comment.



[GitHub] [tvm] tkonolige commented on pull request #8108: Fix test_cublas and relax tolerance

Posted by GitBox <gi...@apache.org>.

tkonolige commented on pull request #8108:
URL: https://github.com/apache/tvm/pull/8108#issuecomment-847267516


   @comaniac Disabling the flag makes the tests pass. What should we do here? Accept lower accuracy for performance?

