Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2019/11/07 14:10:33 UTC

[GitHub] [incubator-tvm] csarofeen opened a new pull request #4270: [Codgen] Thread variable use before define

URL: https://github.com/apache/incubator-tvm/pull/4270
 
 
   There are instances where thread variables can be used before they are defined. This can happen, for example, when a consumer stage has a thread dimension bound to an axis and that dimension is propagated to producer stages as a domain of `(threadIdx.[x,y,z], 1)`.
   
   This PR adds a pass during schedule_ops that looks for uses of threadIdx.[x,y,z] or blockIdx.[x,y,z] before the corresponding attr statement. If a use before define is detected for one of these variables, the pass checks whether the variable is defined later in the function being lowered and, if so, copies the attr statement to the beginning of the producer scope where the use before define was found.
   
   For example, given an IR such as:
   
    ```
    // attr [compute(C, 0x55addf266ad0)] realize_scope = ""
    realize C([0, 1024], [0, 1024]) {
      produce C {
        // attr [iter_var(blockIdx.x, , blockIdx.x)] thread_extent = 1024
        // attr [compute(B, 0x55addf146840)] realize_scope = "local"
        realize B([blockIdx.x, 1], [threadIdx.x, 1]) {
          produce B {
            B(blockIdx.x, threadIdx.x) =(A(blockIdx.x, threadIdx.x)*1.001f)
          }
          // attr [iter_var(threadIdx.x, , threadIdx.x)] thread_extent = 1024
          C(blockIdx.x, threadIdx.x) =(B(blockIdx.x, threadIdx.x)*2f)
        }
      }
    }
    ```
   
   The `produce B` block will be modified to:
    ```
    produce B {
      // attr [iter_var(threadIdx.x, , threadIdx.x)] thread_extent = 1024
      B(blockIdx.x, threadIdx.x) =(A(blockIdx.x, threadIdx.x)*1.001f)
    }
    ```
   
   Without this change, attempting to compile the first IR results in errors, either in the make_api pass or later.
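   As a rough illustration of the idea, the detection-and-copy step can be sketched as a toy model over a flat list of statements. This is plain Python, not TVM's actual IR classes or the pass's real implementation; `hoist_thread_attrs`, the tuple encoding, and `THREAD_VARS` are all invented here for illustration.
    ```python
    # Toy sketch of the pass's idea, NOT TVM's real IR or implementation:
    # a statement is ("attr", var) for a thread_extent attr statement, or
    # ("use", var) for any other statement referencing the variable.

    THREAD_VARS = {"threadIdx.x", "threadIdx.y", "threadIdx.z",
                   "blockIdx.x", "blockIdx.y", "blockIdx.z"}

    def hoist_thread_attrs(stmts):
        """Copy a thread variable's attr statement to the front of the
        scope when the variable is used before that attr appears."""
        first_use, attr_pos = {}, {}
        for i, (kind, var) in enumerate(stmts):
            if var not in THREAD_VARS:
                continue
            if kind == "use":
                first_use.setdefault(var, i)
            elif kind == "attr":
                attr_pos.setdefault(var, i)
        # Use-before-define: the first use precedes the attr, but the attr
        # does exist later in the function being lowered.
        copies = [("attr", v) for v in sorted(attr_pos)
                  if v in first_use and first_use[v] < attr_pos[v]]
        return copies + stmts

    # Mirrors the IR above: threadIdx.x is used inside produce B before
    # its thread_extent attr statement appears.
    scope = [("attr", "blockIdx.x"),
             ("use", "threadIdx.x"),
             ("attr", "threadIdx.x")]
    print(hoist_thread_attrs(scope)[0])  # ('attr', 'threadIdx.x')
    ```
   In the real pass the copy lands at the beginning of the producer scope rather than the whole function, but the detection logic follows the same shape.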
   
   **Minimal repro:**
    ```
    import tvm

    # Thread/block axes that C's output axes will be bound to.
    bx = tvm.thread_axis("blockIdx.x")
    tx = tvm.thread_axis("threadIdx.x")

    n = 1024
    m = 1024

    A = tvm.placeholder((n, m), name='A')

    # Intermediate stage, later computed at C's outer axis in "local" scope.
    B = tvm.compute((n, m), lambda i, j: A[i, j] * 1.001, name="B")

    # Consumer stage whose axes are bound to blockIdx.x / threadIdx.x.
    C = tvm.compute((n, m), lambda i, j: B[i, j] * 2.0, name="C")

    args = [A, C]

    s = tvm.create_schedule(C.op)
    s[C].bind(s[C].op.axis[0], bx)
    s[C].bind(s[C].op.axis[1], tx)
    # Nesting B under C's blockIdx.x axis propagates the thread binding
    # to B, which triggers the use before define in the lowered IR.
    s[B].compute_at(s[C], s[C].op.axis[0])
    s[B].set_scope("local")

    print(tvm.lower(s, args, simple_mode=True))

    fcuda = tvm.build(s, args, "cuda")
    print(fcuda.imported_modules[0].get_source())
    ```
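   Assuming the lowered IR is dumped as text, a lightweight sanity check is to verify that the first line mentioning a thread variable is its `thread_extent` attr. This is plain string processing, independent of TVM; `thread_attr_precedes_use` is a hypothetical helper written for this sketch, not a TVM API.
    ```python
    # Check a printed IR dump: a thread variable's thread_extent attr line
    # should be the first line that mentions the variable at all.
    # Hypothetical helper for illustration, not part of TVM.

    def thread_attr_precedes_use(ir_text, var):
        """True if the first line mentioning `var` is its thread_extent
        attr (or `var` never appears)."""
        for line in ir_text.splitlines():
            if var in line:
                return "thread_extent" in line
        return True  # variable never used

    # Excerpt of the problematic IR above: threadIdx.x appears in the
    # realize/produce of B before its thread_extent attr.
    buggy_ir = """\
    // attr [iter_var(blockIdx.x, , blockIdx.x)] thread_extent = 1024
    realize B([blockIdx.x, 1], [threadIdx.x, 1]) {
      produce B {
        B(blockIdx.x, threadIdx.x) =(A(blockIdx.x, threadIdx.x)*1.001f)
      }
      // attr [iter_var(threadIdx.x, , threadIdx.x)] thread_extent = 1024
      C(blockIdx.x, threadIdx.x) =(B(blockIdx.x, threadIdx.x)*2f)
    }
    """

    print(thread_attr_precedes_use(buggy_ir, "blockIdx.x"))   # True
    print(thread_attr_precedes_use(buggy_ir, "threadIdx.x"))  # False
    ```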
