
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/02/23 14:12:57 UTC

[GitHub] [tvm] tqchen edited a comment on issue #7246: [BUG][Tensorize] race condition when using "tvm.tir.call_packed()" in a parallel schedule.

tqchen edited a comment on issue #7246:
URL: https://github.com/apache/tvm/issues/7246#issuecomment-784223136


   The function will look like
   ```
   fn myfunc() {
      for i in range(10):
          stack_tcode = @tir.packed_arg_alloca("arg_tcode", 8)
          stack_value = @tir.packed_arg_alloca("arg_value", 8)
          tir.tvm_call_packed_lowered("tvm.contrib.cblas.matmul", stack_value, stack_tcode)
   }
   ```
   
   In the LLVM code generator, we want the compiled code to look like
   
   ```
   fn myfunc() {
   begin:
       stack_tcode0 = alloca("arg_tcode", 8)
       stack_value1 = alloca("arg_value", 8)
   
   loop:
      for i in range(10):
          tir.tvm_call_packed_lowered("tvm.contrib.cblas.matmul", stack_value1, stack_tcode0)
   }
   ```
   
   The main point is that we could relax the semantics of `tir.tvm_stack_alloca` so that it may appear almost anywhere in the function body, while the actual allocation still happens at the beginning of the function.
   
   This can be done by jumping to the function's entry basic block for insertion. https://github.com/apache/tvm/blob/81d9f11ab87dc3ef5fc906aa6ca23737885f7b27/src/target/llvm/codegen_llvm.h#L173
   Note that this follows from how LLVM IR treats alloca (it should sit in the function's entry block; an alloca executed inside a loop allocates fresh stack space on every iteration). Additionally, we could certainly coalesce the stacks of multiple calls, although LLVM could do that for us, so we do not need to.
   
   Note that for a parallel loop, a new function is created for the parallel body, and `WithFunctionEntry` inserts the alloca at the beginning of that function. This is the benefit of delaying the relocation of the alloca until code generation.
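
The hoisting idea can be illustrated with a toy model that does not depend on TVM or LLVM. Everything in it (`ToyFunction`, `ToyCodeGen`, `with_function_entry`, the instruction strings) is hypothetical and only mimics the role of `WithFunctionEntry`. The insertion point is moved to the entry block of whichever function is currently being generated, so an outlined parallel body gets its own allocas.

```python
from contextlib import contextmanager


class ToyFunction:
    """A function modeled as an entry block plus a body block."""

    def __init__(self, name):
        self.name = name
        self.entry = []  # hoisted allocas end up here
        self.body = []   # everything else


class ToyCodeGen:
    """Hypothetical code generator; this is not TVM's CodeGenLLVM."""

    def __init__(self):
        self.functions = []
        self._func = None   # function currently being generated
        self._block = None  # current insertion point

    @contextmanager
    def function(self, name):
        # Start a new function and restore the enclosing one afterwards.
        saved = (self._func, self._block)
        func = ToyFunction(name)
        self.functions.append(func)
        self._func, self._block = func, func.body
        try:
            yield func
        finally:
            self._func, self._block = saved

    @contextmanager
    def with_function_entry(self):
        # Temporarily redirect emission to the current function's entry block.
        saved = self._block
        self._block = self._func.entry
        try:
            yield
        finally:
            self._block = saved

    def emit(self, inst):
        self._block.append(inst)

    def stack_alloca(self, kind, size):
        # Callable from anywhere; the alloca itself lands in the entry block.
        name = f"stack_{kind}{len(self._func.entry)}"
        with self.with_function_entry():
            self.emit(f"{name} = alloca({kind!r}, {size})")
        return name

    def call_packed(self, callee):
        # Allocate the argument stacks on demand, then emit the call in place.
        tcode = self.stack_alloca("tcode", 8)
        value = self.stack_alloca("value", 8)
        self.emit(f"call_packed({callee!r}, {value}, {tcode})")


cg = ToyCodeGen()
with cg.function("myfunc") as outer:
    cg.emit("for i in range(10):")
    cg.call_packed("tvm.contrib.cblas.matmul")
    # The parallel body is outlined into its own function, so its allocas
    # go to that function's entry block rather than to myfunc's.
    with cg.function("myfunc_parallel_body") as inner:
        cg.call_packed("tvm.contrib.cblas.matmul")
    cg.emit("launch_parallel(myfunc_parallel_body)")
```

After this runs, `outer.entry` and `inner.entry` each hold their own two allocas, and neither body block contains an alloca. Because the move happens at emission time, no separate pass has to know which function a call will end up in.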
   
   For the C generator, depending on the restrictions of the language, we might also need to create the allocation at the beginning (e.g. keep an init fragment stream for the function that is separate from the rest of the body).
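
A minimal sketch of the separate init stream, assuming a hypothetical `ToyCFunctionEmitter` that is not TVM's C code generator. The emitted `call_packed` is a placeholder name, and the element types `int` and `TVMValue` are only illustrative. Declarations requested in the middle of the body are collected separately and printed first when the function is finished.

```python
class ToyCFunctionEmitter:
    """Hypothetical emitter that keeps declarations apart from statements."""

    def __init__(self, name):
        self.name = name
        self.init_stream = []  # declarations printed at the top of the function
        self.body_stream = []  # all other statements, in emission order

    def declare_stack(self, ctype, prefix, size):
        # Can be called at any point; the declaration goes to the init stream.
        var = f"{prefix}{len(self.init_stream)}"
        self.init_stream.append(f"{ctype} {var}[{size}];")
        return var

    def emit(self, line, depth=0):
        self.body_stream.append("  " * depth + line)

    def finish(self):
        # Stitch the two streams together: declarations first, then the body.
        lines = [f"void {self.name}(void) {{"]
        lines += ["  " + decl for decl in self.init_stream]
        lines += ["  " + stmt for stmt in self.body_stream]
        lines.append("}")
        return "\n".join(lines)


em = ToyCFunctionEmitter("myfunc")
em.emit("for (int i = 0; i < 10; ++i) {")
# The stacks are requested while the loop body is being emitted ...
tcode = em.declare_stack("int", "stack_tcode", 8)
value = em.declare_stack("TVMValue", "stack_value", 8)
em.emit(f'call_packed("tvm.contrib.cblas.matmul", {value}, {tcode});', depth=1)
em.emit("}")
# ... but appear before the loop in the generated source.
source = em.finish()
print(source)
```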
   
   For StackVM, we would need the ability to insert instructions at the beginning of the instruction stream that allocate the stacks and store the values into a few global heap slots, which can be referred to later.
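
The StackVM variant can be sketched the same way. The opcode names, `ToyStackVMProgram`, and the `run` interpreter below are invented for illustration and are not the real StackVM instruction set. Allocation instructions go into a prologue that runs once, and the loop body only loads the pre-allocated buffers from heap slots.

```python
class ToyStackVMProgram:
    """Hypothetical instruction stream with a prologue run once before the body."""

    def __init__(self):
        self.prologue = []
        self.body = []
        self.num_slots = 0

    def alloc_slot(self, size):
        # Emit the allocation into the prologue and park the buffer in a heap slot.
        slot = self.num_slots
        self.num_slots += 1
        self.prologue.append(("ALLOCA", size))
        self.prologue.append(("STORE_HEAP", slot))
        return slot

    def emit(self, *inst):
        self.body.append(inst)


def run(code):
    """Tiny interpreter for the toy opcodes; returns the heap and the call log."""
    heap, stack, calls = {}, [], []
    for op, *args in code:
        if op == "ALLOCA":
            stack.append([None] * args[0])
        elif op == "STORE_HEAP":
            heap[args[0]] = stack.pop()
        elif op == "LOAD_HEAP":
            stack.append(heap[args[0]])
        elif op == "CALL_PACKED":
            tcode = stack.pop()
            value = stack.pop()
            calls.append((args[0], value, tcode))
    return heap, calls


prog = ToyStackVMProgram()
# The call site is visited once during code generation, inside the loop body.
value_slot = prog.alloc_slot(8)
tcode_slot = prog.alloc_slot(8)
prog.emit("LOAD_HEAP", value_slot)
prog.emit("LOAD_HEAP", tcode_slot)
prog.emit("CALL_PACKED", "tvm.contrib.cblas.matmul")

# The loop is modeled by repeating the body three times after the prologue.
heap, calls = run(prog.prologue + prog.body * 3)
```

Every recorded call receives the same two buffers that the prologue allocated, so the loop itself performs no allocation.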

