Posted to discuss-archive@tvm.apache.org by wyc910926 via Apache TVM Discuss <no...@discuss.tvm.ai> on 2021/08/19 07:05:52 UTC

[Apache TVM Discuss] [Questions] Questions about conv2d weight transform


Sorry, I've only just started working with TVM.
I noticed that in order to speed up conv2d, certain transforms of the input and the weights are required:

    # Imports needed by this snippet (it comes from tvm.topi.nn):
    from tvm import te
    from tvm.topi.nn.pad import pad
    from tvm.topi.utils import get_const_tuple


    def conv2d_gemm_weight_transform(kernel, tile_rows, tile_cols):
        # The kernel is in HWIO layout; flatten it into a (K, N) GEMM operand.
        KH, KW, IC, OC = get_const_tuple(kernel.shape)
        K = KH * KW * IC
        N = OC

        kernel_flat = te.compute(
            (K, N), lambda x, y: kernel[(x // IC) // KW, (x // IC) % KW, x % IC, y], "weight_flatten"
        )

        # Pad N and K up to multiples of the tile sizes so the matrix can be
        # blocked evenly.
        pad_K = 0
        pad_N = 0

        if N % tile_rows != 0:
            pad_N = tile_rows - (N % tile_rows)

        if K % tile_cols != 0:
            pad_K = tile_cols - (K % tile_cols)

        N_padded = N + pad_N
        K_padded = K + pad_K

        if pad_K != 0 or pad_N != 0:
            kernel_flat = pad(
                kernel_flat, pad_before=(0, 0), pad_after=(pad_K, pad_N), name="weight_padding"
            )

        # Reshape the padded (K, N) matrix into
        # (N // tile_rows, K // tile_cols, tile_rows, tile_cols) blocks.
        return te.compute(
            (N_padded // tile_rows, K_padded // tile_cols, tile_rows, tile_cols),
            lambda x, y, z, w: kernel_flat[w + tile_cols * y, z + tile_rows * x],
            name="weight_block_reshape",
        )
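
For reference, here is a minimal sketch of how the transform can be lowered on its own so that its stages show up in the TIR (the 1x1x12x3 HWIO kernel shape and the tile sizes below are just illustrative values I picked, not the ones from my model):

    import tvm
    from tvm import te
    from tvm.topi.nn.conv2d import conv2d_gemm_weight_transform

    # Illustrative HWIO kernel and tile sizes; the real values come from the
    # conv2d op and the target-specific schedule.
    kernel = te.placeholder((1, 1, 12, 3), name="kernel", dtype="uint8")
    wt = conv2d_gemm_weight_transform(kernel, tile_rows=4, tile_cols=16)
    s = te.create_schedule(wt.op)
    print(tvm.lower(s, [kernel, wt], simple_mode=True))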

You can see that this is a te.compute, and the same calculation is still visible at the TIR level. Below is the printed TIR:

      // attr [A_padded] storage_scope = "global"
      allocate A_padded[uint8 * 1605632]
      // attr [weight_block_reshape] storage_scope = "global"
      allocate weight_block_reshape[uint8 * 1024]
      // attr [C] storage_scope = "global"
      allocate C[int32 * 1605632]
      // attr [iter_var(pipeline, , pipeline)] pipeline_exec_scope = 1
      for (i1, 0, 50176) {
        for (i2, 0, 32) {
          A_padded[((i1*32) + i2)] = tir.if_then_else((i2 < 12), placeholder[((((floordiv(i1, 224)*675) + (floordiv(i2, 6)*675)) + (floormod(i1, 224)*3)) + floormod(i2, 6))], (uint8)0)
        }
      }
      for (x.y.fused, 0, 8) {
        for (z, 0, 32) {
          weight_block_reshape[((x.y.fused*128) + (z*4))] = tir.if_then_else(((x.y.fused < 3) && (z < 3)), placeholder[((x.y.fused*12) + z)], (uint8)0)
          weight_block_reshape[(((x.y.fused*128) + (z*4)) + 1)] = tir.if_then_else(((x.y.fused < 3) && (z < 3)), placeholder[(((x.y.fused*12) + z) + 3)], (uint8)0)
          weight_block_reshape[(((x.y.fused*128) + (z*4)) + 2)] = tir.if_then_else(((x.y.fused < 3) && (z < 3)), placeholder[(((x.y.fused*12) + z) + 6)], (uint8)0)
          weight_block_reshape[(((x.y.fused*128) + (z*4)) + 3)] = tir.if_then_else(((x.y.fused < 3) && (z < 3)), placeholder[(((x.y.fused*12) + z) + 9)], (uint8)0)
        }
      }
This is just a rearrangement of the data, so why is it not optimized away? In other words, will LLVM optimize it away later?





---
[Visit Topic](https://discuss.tvm.apache.org/t/questions-about-conv2d-weight-transform/10835/1) to respond.


[Apache TVM Discuss] [Questions] Questions about conv2d weight transform

Posted by Matt Barrett via Apache TVM Discuss <no...@discuss.tvm.ai>.

If you're wondering why TVM/LLVM doesn't do a compile-time evaluation of this code (given that the weights are constants), the answer is that TIR doesn't treat weight data as constant. It expects you to pass in the weights at runtime, so the compiler can't do the weight transformation ahead of time.
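
You can see this directly if you compile just the transform on its own. Here's a minimal sketch (the shape and tile sizes are illustrative assumptions):

    import numpy as np
    import tvm
    from tvm import te
    from tvm.topi.nn.conv2d import conv2d_gemm_weight_transform

    # The kernel is a te.placeholder, i.e. a function parameter, not a constant.
    kernel = te.placeholder((1, 1, 12, 3), name="kernel", dtype="uint8")
    wt = conv2d_gemm_weight_transform(kernel, tile_rows=4, tile_cols=16)
    s = te.create_schedule(wt.op)
    func = tvm.build(s, [kernel, wt], target="llvm")

    # The weight values are only bound here, at call time, so the compiled
    # code has to perform the rearrangement on every invocation.
    dev = tvm.cpu(0)
    w = tvm.nd.array(np.random.randint(0, 256, size=(1, 1, 12, 3)).astype("uint8"), dev)
    o = tvm.nd.empty([int(d) for d in wt.shape], dtype="uint8", device=dev)
    func(w, o)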

I do agree this isn't ideal and especially for convolutions/GEMMs it means we get worse performance than we otherwise could. We have an RFC up at the moment which would add the ability to represent constant weights directly in TIR: https://github.com/apache/tvm-rfcs/pull/22. This could be the first step to enabling the behaviour you want, so feel free to take a look and comment.





---
[Visit Topic](https://discuss.tvm.apache.org/t/questions-about-conv2d-weight-transform/10835/3) to respond.


[Apache TVM Discuss] [Questions] Questions about conv2d weight transform

Posted by wyc0926 via Apache TVM Discuss <no...@discuss.tvm.ai>.

Looking at the generated LLVM code, I found that the weight transform still exists; LLVM has not optimized it away.
Did I forget to turn on some optimization switch? Or is the weight transform simply never optimized?
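
For reference, this is roughly how the LLVM IR can be dumped (a sketch with the same illustrative shapes as above):

    import tvm
    from tvm import te
    from tvm.topi.nn.conv2d import conv2d_gemm_weight_transform

    kernel = te.placeholder((1, 1, 12, 3), name="kernel", dtype="uint8")
    wt = conv2d_gemm_weight_transform(kernel, tile_rows=4, tile_cols=16)
    s = te.create_schedule(wt.op)
    lib = tvm.build(s, [kernel, wt], target="llvm")
    # The weight_block_reshape loops remain in the IR, since the weight
    # buffer is a runtime argument that LLVM cannot constant-fold.
    print(lib.get_source("ll"))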





---
[Visit Topic](https://discuss.tvm.apache.org/t/questions-about-conv2d-weight-transform/10835/2) to respond.
