Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/05/10 22:00:01 UTC

[GitHub] [tvm] tkonolige commented on pull request #7983: [PROFILING] Use PAPI to collect hardware performance counters on CPU and CUDA

tkonolige commented on pull request #7983:
URL: https://github.com/apache/tvm/pull/7983#issuecomment-837409214


   @areusch and I chatted a little offline about how to handle mallocs in the profiling code. Right now there are a couple of places that allocate, and they would be pretty difficult to remove. Also, as long as there are no nested `Start` calls (besides the top-level nesting), the overhead of malloc is counted in the overhead section of the full model execution and has no effect on the measured performance of each op invocation. We should move forward with this PR as it stands, and I will think about ways of reducing the amount of allocation that we do.
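
   To illustrate the accounting argument, here is a minimal toy sketch. It is not the profiler in this PR: `ToyProfiler`, its simulated tick clock, and the `alloc_cost` parameter are all hypothetical, and it assumes the allocation in `start` happens before that call's own timer begins. Under that assumption, allocation between flat op calls lands only in the enclosing top-level total, while a nested `start` charges its allocation to the enclosing op.

```python
class ToyProfiler:
    """Hypothetical stand-in for a profiler; uses a simulated clock in ticks."""

    def __init__(self, alloc_cost=5):
        self.now = 0                  # simulated clock
        self.alloc_cost = alloc_cost  # assumed cost of one allocation, in ticks
        self.durations = {}           # name -> measured ticks
        self._open = []               # stack of (name, begin_tick)

    def start(self, name):
        # Assumption: the allocation happens before this call's timer starts.
        self.now += self.alloc_cost
        self._open.append((name, self.now))

    def work(self, ticks):
        # Simulate the op's real work by advancing the clock.
        self.now += ticks

    def stop(self):
        name, begin = self._open.pop()
        self.durations[name] = self.now - begin


def run_flat():
    # Only top-level nesting: "model" wraps the ops, and ops do not nest.
    p = ToyProfiler()
    p.start("model")
    for op in ("conv", "relu"):
        p.start(op)
        p.work(100)
        p.stop()
    p.stop()
    return p.durations


def run_nested():
    # A nested start: the inner allocation is charged to the outer op.
    p = ToyProfiler()
    p.start("model")
    p.start("outer")
    p.start("inner")
    p.work(100)
    p.stop()
    p.stop()
    p.stop()
    return p.durations


flat = run_flat()
nested = run_nested()
# Overhead attributed to the full run rather than to any single op.
flat_overhead = flat["model"] - flat["conv"] - flat["relu"]
```

   In the flat run each op measures exactly its own 100 ticks of work, and the allocation cost shows up only as the difference between the `model` total and the sum of the ops. In the nested run, `outer` absorbs the allocation made by the `inner` start, which is the case the comment above excludes.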


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org