Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/04/02 06:39:48 UTC

[GitHub] [tvm] junrushao1994 commented on a change in pull request #10366: [MetaSchedule] Add Gradient Based Task Scheduler

junrushao1994 commented on a change in pull request #10366:
URL: https://github.com/apache/tvm/pull/10366#discussion_r841025653



##########
File path: tests/python/unittest/test_meta_schedule_tune_tir.py
##########
@@ -98,14 +97,14 @@ def test_tune_matmul_cuda_tensor_core():
     target = Target("nvidia/geforce-rtx-3070")
     config = ReplayTraceConfig(
         num_trials_per_iter=32,
-        num_trials_total=320,
+        max_trials_per_task=320,

Review comment:
      Hey @masahi, thanks for detecting this issue so swiftly! I apologize for the inconvenience; the end-user API is still in flux.
   
   Yes, this PR breaks the existing unstable MetaSchedule API, and we want to make such changes early on, so that the APIs won't need to be broken again after we formally announce MetaSchedule as ready for production use.
   
   The intention of the `config` parameter is to provide an all-in-one place for end users to configure the search. Even though the builder/runner/database/cost_model/task_scheduler all need to be configured separately, `config` is intended to be the parameter users tweak most frequently.
   
   Now let's discuss `max_trials_per_task` and `max_trials_global`. Imagine a service that lets users set both a global upper limit on the total number of trials for an entire model and a per-task limit for each individual task extracted; these are exactly the two parameters to tweak. This PR adds the feature essentially for completeness, anticipating such a potential SaaS use case.
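   To make the intended semantics of the two limits concrete, here is a minimal standalone sketch (not the actual TVM implementation; `allocate_trials` is a hypothetical helper) of how a task scheduler might interleave tasks while respecting both budgets: `max_trials_per_task` caps each task individually, while `max_trials_global` caps the total across the whole model.
   
   ```python
   def allocate_trials(num_tasks, num_trials_per_iter,
                       max_trials_per_task, max_trials_global):
       """Round-robin trial allocation honoring both caps.

       Returns the number of trials each task ends up with. This is an
       illustrative sketch, not TVM's gradient-based scheduling logic.
       """
       trials = [0] * num_tasks
       total = 0
       progressed = True
       while progressed and total < max_trials_global:
           progressed = False
           for i in range(num_tasks):
               if trials[i] >= max_trials_per_task:
                   continue  # this task already hit its per-task cap
               # A batch is bounded by the per-iteration size and both budgets.
               batch = min(num_trials_per_iter,
                           max_trials_per_task - trials[i],
                           max_trials_global - total)
               if batch == 0:
                   break  # global budget exhausted
               trials[i] += batch
               total += batch
               progressed = True
       return trials

   # With a generous global budget, every task reaches its per-task cap:
   print(allocate_trials(3, 32, 32, 200))   # -> [32, 32, 32]
   # With a tight global budget, tuning stops after the first task's batch,
   # which is the "only one task is tuned" symptom discussed below:
   print(allocate_trials(3, 32, 320, 32))   # -> [32, 0, 0]
   ```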
   
   I have to admit that breaking changes to unannounced APIs can be damaging, particularly as sudden surprises to early users like you. On the other hand, we do want to work together and do the right thing to make those APIs feel correct to end users. In the near future, we will likely introduce further breaking changes, including:
   - Based on your work, refactoring `integration` into `extract_task` and `apply_history_best`
   - More structured and readable logging
   - A less error-prone interface in `tune.py` (we need to collect more feedback)
   
   We will do our best to ping early users like you so people are aware of breaking changes before we officially announce MetaSchedule as production-ready. After that point, we will be much less aggressive about refactoring. Thank you so much for your understanding!
   
   > First, max_trials_global is not used at https://github.com/zxybazh/tvm/blob/e3fbb797a88308e4ce3d671939a83084ae1826b8/python/tvm/meta_schedule/search_strategy/replay_trace.py#L60
   
   Yes, and this parameter is only used for configuring the task scheduler.
   
   > And if I use
   > ```
   >             config=ReplayTraceConfig(
   >                 num_trials_per_iter=32,
   >                 max_trials_per_task=32,
   >                 ...
   > ```
   > , `max_trials_per_task` acts like a global max trials, so only one task is tuned, as shown below. This contradicts the name `max_trials_per_task` and does something very different from the previous `num_trials_total`.
   
   Indeed, this behavior doesn't make sense. May I ask what your `max_trials_global` is in this case? It could lead to early stopping if it's less than `32 * num_tasks`.
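   To spell out that early-stopping condition with hypothetical numbers (the task count below is an assumption for illustration, not taken from your model):
   
   ```python
   # With num_trials_per_iter = 32, a single full round over all tasks
   # consumes 32 trials per task. Assuming, say, 10 extracted tasks:
   num_trials_per_iter = 32
   num_tasks = 10  # hypothetical task count
   full_round = num_trials_per_iter * num_tasks
   print(full_round)  # -> 320; any max_trials_global below this stops early

   # In particular, max_trials_global = 32 only covers the first task's
   # first batch, so only one task gets tuned at all:
   max_trials_global = 32
   tasks_tuned = max_trials_global // num_trials_per_iter
   print(tasks_tuned)  # -> 1
   ```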
   
   
   > All uses of ReplayTraceConfig in tests are broken since now it has three attributes.
   
   We updated the usage of this API in the following files:
   
   - python/tvm/meta_schedule/testing/tune_te_meta_schedule.py
   - python/tvm/meta_schedule/testing/tune_relay_meta_schedule.py
   - tests/python/unittest/test_meta_schedule_measure_callback.py
   - tests/python/unittest/test_meta_schedule_search_strategy.py
   - tests/python/unittest/test_meta_schedule_task_scheduler.py
   - tests/python/unittest/test_meta_schedule_tune_relay.py
   - tests/python/unittest/test_meta_schedule_tune_te.py
   - tests/python/unittest/test_meta_schedule_tune_tir.py
   
   And indeed we missed the following files (essentially the skipped ones):
   - tests/python/unittest/test_meta_schedule_tune_tir.py
   - tests/python/unittest/test_meta_schedule_tune_te.py
   - tests/python/unittest/test_meta_schedule_tune_relay.py
   
   Clearly we forgot to do so for the skipped files, and I will send a PR to get them fixed.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org