You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/08/11 13:56:46 UTC

[GitHub] [tvm] echuraev opened a new pull request, #12382: [Profiler] Fix graph_executor_debug hang

echuraev opened a new pull request, #12382:
URL: https://github.com/apache/tvm/pull/12382

   For some operations such as `__nop` or `__copy` the measured inference
   time is equal to 0. In this case we are in infinite loop and we won't
   exit from it. Added new parameter `max_repeat_num` which specify the
   maximum number of repeats then the inference time is equal to 0. When
   we exceed this value then we will exit from a loop.
   
   cc: @valmat07, @Icemist, @masahi   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] echuraev commented on a diff in pull request #12382: [Profiler] Fix graph_executor_debug hang

Posted by GitBox <gi...@apache.org>.

echuraev commented on code in PR #12382:
URL: https://github.com/apache/tvm/pull/12382#discussion_r944160561


##########
src/runtime/crt/common/crt_runtime_api.c:
##########
@@ -588,6 +591,8 @@ tvm_crt_error_t RunTimeEvaluator(tvm_function_index_t function_index, TVMValue*
       if (err != kTvmErrorNoError) {
         goto release_and_return;
       }
+      if (std::fpclassify(curr_res_seconds) == FP_ZERO) absolute_zero_times++;
+      if (absolute_zero_times >= max_repeat_num) break;
     } while (curr_res_seconds < min_repeat_seconds);

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] echuraev commented on a diff in pull request #12382: [Profiler] Fix graph_executor_debug hang

Posted by GitBox <gi...@apache.org>.

echuraev commented on code in PR #12382:
URL: https://github.com/apache/tvm/pull/12382#discussion_r944154716


##########
src/runtime/profiling.cc:
##########
@@ -887,6 +887,8 @@ PackedFunc WrapTimeEvaluator(PackedFunc pf, Device dev, int number, int repeat,
         }
         t->Stop();
         int64_t t_nanos = t->SyncAndGetElapsedNanos();
+        if (t_nanos == 0) absolute_zero_times++;
+        if (absolute_zero_times >= max_repeat_num) break;

Review Comment:
   It doesn't make sense. Because for operations such as `__nop` or `__copy` the `t_nanos` will be always zero. And if we will increase the `numbers` then we won't exit from the infinite loop. What about default value of 100. I used 100 because it is not too much and not too little :) If the user needs to change it, he can do it any time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] echuraev commented on a diff in pull request #12382: [Profiler] Fix graph_executor_debug hang

Posted by GitBox <gi...@apache.org>.

echuraev commented on code in PR #12382:
URL: https://github.com/apache/tvm/pull/12382#discussion_r944172990


##########
include/tvm/runtime/profiling.h:
##########
@@ -573,6 +573,8 @@ PackedFunc ProfileFunction(Module mod, std::string func_name, int device_type, i
  *        minimum duration requirement of one `repeat`.
  *        i.e., When the run time of one `repeat` falls below this time,
  *        the `number` parameter will be automatically increased.
+ * \param max_repeat_ms The maximum number of repeats when measured time is equal to 0.

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tkonolige merged pull request #12382: [Profiler] Fix graph_executor_debug hang

Posted by GitBox <gi...@apache.org>.

tkonolige merged PR #12382:
URL: https://github.com/apache/tvm/pull/12382


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] Icemist commented on a diff in pull request #12382: [Profiler] Fix graph_executor_debug hang

Posted by GitBox <gi...@apache.org>.

Icemist commented on code in PR #12382:
URL: https://github.com/apache/tvm/pull/12382#discussion_r943548530


##########
include/tvm/runtime/profiling.h:
##########
@@ -573,6 +573,8 @@ PackedFunc ProfileFunction(Module mod, std::string func_name, int device_type, i
  *        minimum duration requirement of one `repeat`.
  *        i.e., When the run time of one `repeat` falls below this time,
  *        the `number` parameter will be automatically increased.
+ * \param max_repeat_ms The maximum number of repeats when measured time is equal to 0.

Review Comment:
   Let's change the word "repeat" in "max_repeat_num" or change the name completely. It looks similar but does not apply to the other arguments(repeat, min_repeat_ms, repeats_to_cooldown). It may confuse users.



##########
src/runtime/crt/common/crt_runtime_api.c:
##########
@@ -588,6 +591,8 @@ tvm_crt_error_t RunTimeEvaluator(tvm_function_index_t function_index, TVMValue*
       if (err != kTvmErrorNoError) {
         goto release_and_return;
       }
+      if (std::fpclassify(curr_res_seconds) == FP_ZERO) absolute_zero_times++;
+      if (absolute_zero_times >= max_repeat_num) break;
     } while (curr_res_seconds < min_repeat_seconds);

Review Comment:
   It seems logical to me to keep the conditions interrupting the cycle together, unless they are doing something specific.
   mb change 
   ```
   if (absolute_zero_times >= max_repeat_num) break;
   } while (curr_res_seconds < min_repeat_seconds);
   ```
   to
   `} while (curr_res_seconds < min_repeat_seconds && absolute_zero_times < max_repeat_num);`
   
   And do it in all 3 places.



##########
src/runtime/profiling.cc:
##########
@@ -887,6 +887,8 @@ PackedFunc WrapTimeEvaluator(PackedFunc pf, Device dev, int number, int repeat,
         }
         t->Stop();
         int64_t t_nanos = t->SyncAndGetElapsedNanos();
+        if (t_nanos == 0) absolute_zero_times++;
+        if (absolute_zero_times >= max_repeat_num) break;

Review Comment:
   It seems to me that in the case of repeating several zero measurements, we could try to change the "number" using a different formula: for special case when duration_ms is 0 and absolute_zero_times > 1. Otherwise I don't see the sense of the default value of 100 for "max_repeat_num" argument.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [tvm] tkonolige commented on pull request #12382: [Profiler] Fix graph_executor_debug hang

Posted by GitBox <gi...@apache.org>.

tkonolige commented on PR #12382:
URL: https://github.com/apache/tvm/pull/12382#issuecomment-1212522975

   @echuraev thanks for this PR. Its definitely an edge case we need fixed.
   
   @Icemist Thanks for reviewing!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org