You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/03/12 13:19:25 UTC
[GitHub] [tvm] tqchen commented on a change in pull request #7619: [BugFix] Fix the race condition issue of packed func. (#7246).

tqchen commented on a change in pull request #7619:
URL: https://github.com/apache/tvm/pull/7619#discussion_r593163883



##########
File path: src/tir/transforms/lower_tvm_builtin.cc
##########
@@ -74,12 +59,38 @@ class BuiltinLower : public StmtExprMutator {
     ICHECK_EQ(run_array_stack_, 0);
 
     if (prep_seq_.size() != 0) {
-      Stmt ret = SeqStmt::Flatten(prep_seq_, stmt);
+      stmt = SeqStmt::Flatten(prep_seq_, stmt);
       prep_seq_.clear();
-      return ret;
-    } else {
-      return stmt;
     }
+
+    // Always generated "tvm_stack_alloca" intrincis next to the "tvm_packed_func",
+    // which makes the stacks allocated thread-local and every tvm_packed_func will have
+    // it's own stack, rather than a shared one. This could help resolve the race
+    // -condition issue in parallel execution.
+
+    if (emit_stack_shape_) {
+      ICHECK_NE(max_shape_stack_, -1);

Review comment:
       Thanks @zhuwenxi , I guess we do not need to trigger after every statement visits. Instead, we can check ForNode and see if there is a parallel, if there is, we emit immediately. Otherwise, we emit at root.

##########
File path: src/tir/transforms/lower_tvm_builtin.cc
##########
@@ -307,19 +393,37 @@ class BuiltinLower : public StmtExprMutator {
   std::vector<Stmt> prep_seq_;
   PrimExpr device_type_;
   PrimExpr device_id_;
-  // Var handle for each stack.
   Var stack_shape_;
   Var stack_array_;
   Var stack_tcode_;
   Var stack_value_;
+
+  // Mark the occurence of tvm_stack_make_shape of current stmt:
+  // 1. Set to true when the first tvm_stack_make_shape is met;
+  // 2. Reset to false at the end of VisitStmt();
+  bool emit_stack_shape_{false};
+
+  // Mark the occurence of tvm_stack_make_array of current stmt:
+  // 1. Set to true when the first tvm_stack_make_array is met;
+  // 2. Reset to false at the end of VisitStmt().
+  bool emit_stack_array_{false};
+
+  // Mark the occurence of tvm_call_packed of current stmt:
+  // 1. Set to true when tvm_call_packed intrinsic is met;
+  // 2. Reset to false at the end of VisitStmt().
+  bool emit_stack_value_tcode_{false};
+
   // The running statistics
   int64_t run_shape_stack_{-1};
   uint64_t run_array_stack_{0};
   uint64_t run_arg_stack_{0};
   // statistics of stacks
   int64_t max_shape_stack_{-1};
   uint64_t max_array_stack_{0};
-  uint64_t max_arg_stack_{0};

Review comment:
       let us keep max_arg_stack as it is, and use its value (0 or not) to check whether or not we need emission.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org