Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/04/19 12:52:19 UTC

[GitHub] [tvm] apeskov commented on pull request #7877: Async measurer API for auto-scheduler scripts

apeskov commented on pull request #7877:
URL: https://github.com/apache/tvm/pull/7877#issuecomment-822442102


   Looks like I should explain what was done and the reasons behind some of the design decisions.
   
   The previous state of the Runner/Builder implementation was as follows. The native C++ implementations of both classes use the Python static methods "auto_scheduler.local_builder.build" and "auto_scheduler.local_runner.run" respectively. Both Python functions are blocking, i.e. they wait until all tasks in the internal ThreadPool are finished. No state is shared between runs: all threads started at the beginning of the Python build/run function are joined at its end. The native implementation of the builder/runner does not retain any Python resources.
   
   **Problem 1**: Because of the blocking nature of this API, the "Run" step starts only after the "Build" step completes. While the host compiles kernels the device waits, and while the device executes kernels the host waits. Host and device are separate pieces of hardware and can work in parallel, so the Build and Run steps should be overlapped.
   
   **Problem 2**: By design, the tasks of a tuning round are split into batches. At the end of each batch there is a strong sync point (a barrier) that waits until all tasks in the batch are completed, a limitation of the synchronous API. When task durations are imbalanced, all workers spend that time idle.
   
   **The goal**: Make the Build and Run methods asynchronous (non-blocking). The Builder/Runner instance encapsulates a fixed pool of workers, and each Build/Run call just submits a task for execution without waiting for its completion. Synchronisation happens only once, at the end of the tuning round method.
   
   **My suggestion for design changes**:
   * Make the Build and Run methods asynchronous with a callback mechanism. The completion callback is called when one atomic task is completed.
   ```cpp
    Build(...)  -> SubmitToBuild(... , callback);
    Run(...) -> SubmitToRun(... , callback);

    // Usage example before
    for (auto input : input_collection)
      build_result_collection.push_back(builder->Build(input));
    for (auto build_result : build_result_collection)
      run_result_collection.push_back(runner->Run(build_result));

    // Usage example after
    for (auto input : input_collection) {
      builder->SubmitToBuild(input, [](auto build_result) {
        runner->SubmitToRun(build_result, [](auto run_result) {
          run_result_collection.push_back(run_result);
        });
      });
    }
    // Synchronise once, after all tasks have been submitted
    cv.wait(lock, [] { return run_result_collection.size() == input_collection.size(); });
   ```
    The worker pool is attached to the builder/runner object and reused across all SubmitToBuild/SubmitToRun calls.
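
    To make the pool/callback interplay concrete, here is a minimal self-contained sketch (hypothetical class and member names, not the actual TVM code): a builder that owns a fixed worker pool and exposes a non-blocking `SubmitToBuild(input, callback)`.
    ```cpp
    #include <condition_variable>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <vector>

    // Hypothetical sketch: the builder owns a fixed pool of worker threads
    // and a task queue; SubmitToBuild only enqueues work and returns.
    class AsyncBuilder {
     public:
      explicit AsyncBuilder(int n_workers) {
        for (int i = 0; i < n_workers; ++i) {
          workers_.emplace_back([this] {
            while (true) {
              std::function<void()> task;
              {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                if (stop_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
              }
              task();  // runs the build, then the completion callback
            }
          });
        }
      }

      // Non-blocking: enqueue the job and return immediately.
      void SubmitToBuild(int input, std::function<void(int)> callback) {
        {
          std::lock_guard<std::mutex> lock(mu_);
          tasks_.push([input, callback] {
            int build_result = input * 2;  // placeholder for the real build
            callback(build_result);
          });
        }
        cv_.notify_one();
      }

      ~AsyncBuilder() {
        { std::lock_guard<std::mutex> lock(mu_); stop_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
      }

     private:
      std::vector<std::thread> workers_;
      std::queue<std::function<void()>> tasks_;
      std::mutex mu_;
      std::condition_variable cv_;
      bool stop_ = false;
    };
    ```
    The caller then synchronises once at the end of the round with a condition variable, counting completed callbacks against the number of submitted inputs.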
   
   * Make the worker ThreadPool a Builder/Runner resource. One builder instance uses a single ThreadPool instance, and releasing the Builder object releases the worker ThreadPool. There is no convenient way to attach a regular Python object to the native C++ implementation of FFI-based objects. The only way I found in the TVM sources is the following:
     - Create a Python object
     - Capture this object in some Python function
     - Wrap this function into a PackedFunc
     - Hold this PackedFunc instance as a field in the native class
   
     That's why I added a `PackedFunc submit_func` field instead of a simple string with the function name: it holds the worker ThreadPool object. If you know a more proper way to do that, please advise me.
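
     As a plain C++ analogy (not TVM's actual FFI machinery, and all names here are illustrative): holding a type-erased callable in the native class keeps alive whatever state that callable captured. Here a `WorkerPool` struct stands in for the Python ThreadPool, and `std::function` stands in for the PackedFunc.
     ```cpp
     #include <functional>
     #include <memory>
     #include <string>

     struct WorkerPool {          // stands in for the Python ThreadPool
       std::string name;
     };

     class NativeBuilder {
      public:
       // submit_func plays the role of the `PackedFunc submit_func` field:
       // it owns (via capture) the pool it needs in order to run.
       explicit NativeBuilder(std::function<std::string(int)> submit_func)
           : submit_func_(std::move(submit_func)) {}

       std::string Submit(int input) { return submit_func_(input); }

      private:
       std::function<std::string(int)> submit_func_;  // keeps the captured pool alive
     };

     std::function<std::string(int)> MakeSubmitFunc() {
       auto pool = std::make_shared<WorkerPool>(WorkerPool{"pool-0"});
       // The closure captures the pool by shared_ptr, so the pool lives exactly
       // as long as the builder that holds the closure.
       return [pool](int input) {
         return pool->name + ":" + std::to_string(input);
       };
     }
     ```
     Releasing the `NativeBuilder` drops the last reference to the captured pool, which mirrors the intended lifetime coupling between the Builder and its worker ThreadPool.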
   
   * Synchronisation and task-dependency resolution are implemented at the `ProgramMeasurer` level. It uses SubmitToBuild/SubmitToRun however it wants: it may simulate the blocking behaviour as before, add additional barriers, or just submit all existing tasks to the executors as a single scope and wait for their completion. The special flag `async_mode` selects the sync (old) or async behaviour of ProgramMeasurer.
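
     A hypothetical sketch of that policy (the names are illustrative; only the control flow mirrors the proposal): with `async_mode == false` each chain completes before the next one starts, reproducing the old per-task barrier, while with `async_mode == true` everything is submitted first and the measurer waits once at the end of the round.
     ```cpp
     #include <condition_variable>
     #include <functional>
     #include <mutex>
     #include <vector>

     struct Measurer {
       bool async_mode;

       // Submit one build+run chain; on_done fires when the run result is
       // ready. Placeholder body: real code would go through the worker pools.
       void SubmitChain(int input, std::function<void(int)> on_done) {
         on_done(input + 1);
       }

       std::vector<int> MeasureRound(const std::vector<int>& inputs) {
         std::vector<int> results;
         std::mutex mu;
         std::condition_variable cv;
         if (!async_mode) {
           // Old behaviour: one task at a time, implicit barrier after each.
           for (int in : inputs) {
             SubmitChain(in, [&](int r) { results.push_back(r); });
           }
         } else {
           // New behaviour: submit the whole scope, synchronise once at the end.
           for (int in : inputs) {
             SubmitChain(in, [&](int r) {
               std::lock_guard<std::mutex> lock(mu);
               results.push_back(r);
               cv.notify_one();
             });
           }
           std::unique_lock<std::mutex> lock(mu);
           cv.wait(lock, [&] { return results.size() == inputs.size(); });
         }
         return results;
       }
     };
     ```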
   

