Posted to discuss-archive@tvm.apache.org by JC Li via Apache TVM Discuss <no...@discuss.tvm.ai> on 2021/03/30 23:53:59 UTC

[Apache TVM Discuss] [Questions] Best way to deal with kernel layout?


Well, I have a special BYOC **dense** kernel whose kernel layout differs from the default topi.nn implementation.

The default implementation has a *weight Tensor with shape [out_dim, in_dim]*, while I need [in_dim, out_dim].

Two questions here:
1. How can I change the default behavior so that the kernel input of the dense op assumes a transposed layout? I tried modifying include/tvm/topi/nn/dense.h and python/tvm/topi/nn/dense.py to reverse the layout, but it doesn't work. Where is the code controlling the default kernel layout for the dense op?
2. If I don't want to change the default behavior, but instead add an alternative implementation targeting my BYOC target, what's the best way?

Thanks in advance.





---
[Visit Topic](https://discuss.tvm.apache.org/t/best-way-to-deal-with-kernel-layout/9576/1) to respond.


[Apache TVM Discuss] [Questions] Best way to deal with kernel layout?

Posted by Josse Van Delm via Apache TVM Discuss <no...@discuss.tvm.ai>.

Okay cool, then I was on the right track after all :smile:
Thanks for the quick clarification, @comaniac!





---
[Visit Topic](https://discuss.tvm.apache.org/t/best-way-to-deal-with-kernel-layout/9576/10) to respond.


[Apache TVM Discuss] [Questions] Best way to deal with kernel layout?

Posted by "Cody H. Yu via Apache TVM Discuss" <no...@discuss.tvm.ai>.

The answer would definitely be different in the case of not using BYOC. Without BYOC, every backend is handled by the TVM compilation pipeline, which means every operator has to have a corresponding TOPI implementation. Since the data layout affects the TE compute semantics, ops with different data layouts are treated as different operators in TE/TIR. The Relay op strategy is in charge of selecting the correct TE/TIR op for a Relay op.

For example, a Relay conv2d op has a data layout attribute, so the Relay op strategy will select the TOPI implementation of conv2d_nchw, conv2d_nhwc, or conv2d_hwcn accordingly, as in the link you pointed out. Of course, some data layouts may be missing for some targets, so you may encounter an error if you specify a data layout that doesn't have a corresponding TOPI implementation for your target (e.g., `arm_cpu`).

In short, if you are not using BYOC and require a special data layout for a certain op, you need to 1) register your backend as a target like the existing ones (e.g., x86, cuda, arm_cpu, etc.), 2) have the corresponding TOPI implementations for your backend, and 3) register the Relay op strategy to correctly lower a Relay graph to TE/TIR for your backend.
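
A rough sketch of what step 3 could look like, assuming a made-up target key `"mytarget"`; here `topi.nn.dense` and `topi.generic.schedule_dense` stand in for the TOPI compute/schedule you would write with your backend's kernel layout:

    # Hypothetical sketch: register a Relay op strategy for nn.dense under a
    # custom target key. Replace the generic compute/schedule with your own.
    from tvm import relay, topi
    from tvm.relay.op import strategy
    from tvm.relay.op.strategy.generic import wrap_compute_dense, wrap_topi_schedule

    @strategy.dense_strategy.register("mytarget")  # key of your registered target
    def dense_strategy_mytarget(attrs, inputs, out_type, target):
        s = relay.op.OpStrategy()
        s.add_implementation(
            wrap_compute_dense(topi.nn.dense),                # your TOPI compute
            wrap_topi_schedule(topi.generic.schedule_dense),  # your TOPI schedule
            name="dense.mytarget",
        )
        return s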





---
[Visit Topic](https://discuss.tvm.apache.org/t/best-way-to-deal-with-kernel-layout/9576/5) to respond.


[Apache TVM Discuss] [Questions] Best way to deal with kernel layout?

Posted by JC Li via Apache TVM Discuss <no...@discuss.tvm.ai>.

Thanks for the suggestion, @comaniac. Adding a matmul operator with implementations for all combinations of input layouts seems like overkill to me. Instead, adding a target-specific Relay pass to deal with such a target-specific case would be a better solution; it is lightweight and orthogonal to the main TVM passes.
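
To make that idea concrete, here is a minimal sketch of one possible form such a pass could take (everything here is hypothetical, not existing TVM code): for every `nn.dense` with a constant weight, physically store the constant as `[in_dim, out_dim]` and add an explicit transpose so the graph still type-checks; the BYOC pattern table could then match `nn.dense(x, transpose(w))` and read the constant in its native layout.

    import tvm
    from tvm import relay

    @relay.transform.function_pass(opt_level=0, name="TransposeDenseWeight")
    class TransposeDenseWeight:
        """Hypothetical target-specific pass run before BYOC partitioning."""

        def transform_function(self, func, mod, ctx):
            class Rewriter(relay.ExprMutator):
                def visit_call(self, call):
                    call = super().visit_call(call)
                    if isinstance(call.op, tvm.ir.Op) and call.op.name == "nn.dense":
                        data, weight = call.args
                        if isinstance(weight, relay.Constant):
                            # Store the constant in [in_dim, out_dim] order and
                            # transpose it back in the graph, so the semantics
                            # and the inferred types are unchanged.
                            w_t = relay.const(weight.data.asnumpy().T)
                            return relay.nn.dense(
                                data, relay.transpose(w_t, axes=[1, 0])
                            )
                    return call

            return Rewriter().visit(func)

Applying it would then just be `mod = TransposeDenseWeight()(mod)` before partitioning.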





---
[Visit Topic](https://discuss.tvm.apache.org/t/best-way-to-deal-with-kernel-layout/9576/11) to respond.


[Apache TVM Discuss] [Questions] Best way to deal with kernel layout?

Posted by "Cody H. Yu via Apache TVM Discuss" <no...@discuss.tvm.ai>.

If you really want to add an op, I'd just call it matmul. An even better version would be a matmul that supports all four possible transpose combinations, with dense being just one of them, but this needs many changes in the code base.

cc @tqchen
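
As a rough illustration of that idea (nothing like this exists in the code base today; names are made up), a generic matmul compute parameterized by two transpose flags could look like the sketch below, with today's dense being the `transpose_b=True` case and the transposed-kernel variant being `transpose_b=False`:

    from tvm import te

    def matmul_compute(A, B, transpose_a=False, transpose_b=False):
        """Generic matmul: A is [M, K] (or [K, M] if transpose_a),
        B is [K, N] (or [N, K] if transpose_b)."""
        m = A.shape[1] if transpose_a else A.shape[0]
        k = A.shape[0] if transpose_a else A.shape[1]
        n = B.shape[0] if transpose_b else B.shape[1]
        r = te.reduce_axis((0, k), name="k")
        return te.compute(
            (m, n),
            lambda i, j: te.sum(
                (A[r, i] if transpose_a else A[i, r])
                * (B[j, r] if transpose_b else B[r, j]),
                axis=r,
            ),
            name="matmul",
        )

    # nn.dense(data, weight[N, C]) corresponds to
    # matmul_compute(data, weight, transpose_b=True).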





---
[Visit Topic](https://discuss.tvm.apache.org/t/best-way-to-deal-with-kernel-layout/9576/9) to respond.


[Apache TVM Discuss] [Questions] Best way to deal with kernel layout?

Posted by JC Li via Apache TVM Discuss <no...@discuss.tvm.ai>.

This looks more like a hack :slight_smile:

If I want to do it in Relay, I should add a version of nn.dense (say, name it nn.dense_transposed_kernel) and then register a function convert_dense(...) with register_convert_op_layout("nn.dense"), right?





---
[Visit Topic](https://discuss.tvm.apache.org/t/best-way-to-deal-with-kernel-layout/9576/8) to respond.


[Apache TVM Discuss] [Questions] Best way to deal with kernel layout?

Posted by "Cody H. Yu via Apache TVM Discuss" <no...@discuss.tvm.ai>.

There's no change for nn.dense because it doesn't have the version you want, as you already pointed out.

If you're using BYOC, then there is a trick you can play at this moment. Since the preprocessing still maintains the type, you cannot simply transpose the weight from `[N, C]` to `[C, N]` in the graph. On the other hand, BYOC allows you to manipulate constants when initializing the runtime engine. As a result, you can transpose the weight layout in the tensor data while pretending its shape is still `[N, C]`. In short, it looks like the following at runtime (a rough sketch follows the list):

1. When initializing the engine, you transpose the weight order to be `[C, N]`.
2. The TVM host module runs to the `nn.dense`, whose input shape is still `[N, C]` in the graph.
3. Since `nn.dense` has been offloaded to your module, the TVM host module calls your module with the input data entry IDs.
4. The data order in the data entry for the weight is already `[C, N]`, so you can access it correctly.
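
To make steps 1-4 concrete, here is a rough numpy illustration of the trick (the real constant manipulation would live in your BYOC codegen/runtime, typically in C++; the shapes here are made up):

    import numpy as np

    N, C = 30, 14
    w_nc = np.random.rand(N, C).astype("float32")  # what the Relay graph declares

    # 1. At engine-init time, reorder the constant's data to [C, N].
    w_cn = np.ascontiguousarray(w_nc.T)

    # 2.-4. When TVM calls your module for the offloaded nn.dense, the weight's
    #       data entry already holds [C, N] data, so your kernel reads it in its
    #       native layout and still computes the same dense result.
    x = np.random.rand(1, C).astype("float32")
    out = x @ w_cn
    assert np.allclose(out, x @ w_nc.T)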





---
[Visit Topic](https://discuss.tvm.apache.org/t/best-way-to-deal-with-kernel-layout/9576/7) to respond.


[Apache TVM Discuss] [Questions] Best way to deal with kernel layout?

Posted by JC Li via Apache TVM Discuss <no...@discuss.tvm.ai>.

Hi, @comaniac . I looked into your example and did a simple experiment similar to it. 

My example network imported into Relay looks like this:

    #[version = "0.0.5"]
    def @main(%input.1: Tensor[(1, 1, 32, 16), float32], %conv.0.bias: Tensor[(1), float32], %conv.0.weight: Tensor[(1, 1, 3, 3), float32], %fc.0.weight: Tensor[(30, 14), float32]) {
      %0 = reshape(%input.1, newshape=[1, 1, -1, 16]);
      %1 = nn.conv2d(%0, %conv.0.weight, padding=[0, 0, 0, 0], kernel_size=[3, 3]);
      %2 = nn.bias_add(%1, %conv.0.bias);
      %3 = nn.relu(%2);
      %4 = reshape(%3, newshape=[-1, 14]);
      %5 = transpose(%fc.0.weight, axes=[1, 0]);
      %6 = transpose(%5, axes=[1, 0]);
      %7 = nn.dense(%4, %6, units=None);
      nn.relu(%7)
    }

By applying the kernel layout conversion pass as below:

    desired_layouts = {'nn.dense': ['NHWC', 'OHWI'],
                       'nn.conv2d': ['NCHW', 'OHWI']}
    seq = tvm.transform.Sequential([relay.transform.ConvertLayout(desired_layouts),
                                    relay.transform.FoldConstant()])
    with tvm.transform.PassContext(opt_level=3):
        mod = seq(mod)

The outcome is as below:

    #[version = "0.0.5"]
    def @main(%input.1: Tensor[(1, 1, 32, 16), float32], %conv.0.bias: Tensor[(1), float32], %conv.0.weight: Tensor[(1, 1, 3, 3), float32], %fc.0.weight: Tensor[(30, 14), float32]) -> Tensor[(30, 30), float32] {
      %0 = reshape(%input.1, newshape=[1, 1, -1, 16]) /* ty=Tensor[(1, 1, 32, 16), float32] */;
      %1 = layout_transform(%conv.0.weight, src_layout="OIHW", dst_layout="OHWI") /* ty=Tensor[(1, 3, 3, 1), float32] */;
      %2 = nn.conv2d(%0, %1, padding=[0, 0, 0, 0], kernel_size=[3, 3], kernel_layout="OHWI") /* ty=Tensor[(1, 1, 30, 14), float32] */;
      %3 = expand_dims(%conv.0.bias, axis=1, num_newaxis=2) /* ty=Tensor[(1, 1, 1), float32] */;
      %4 = add(%2, %3) /* ty=Tensor[(1, 1, 30, 14), float32] */;
      %5 = nn.relu(%4) /* ty=Tensor[(1, 1, 30, 14), float32] */;
      %6 = reshape(%5, newshape=[-1, 14]) /* ty=Tensor[(30, 14), float32] */;
      %7 = transpose(%fc.0.weight, axes=[1, 0]) /* ty=Tensor[(14, 30), float32] */;
      %8 = transpose(%7, axes=[1, 0]) /* ty=Tensor[(30, 14), float32] */;
      %9 = nn.dense(%6, %8, units=None) /* ty=Tensor[(30, 30), float32] */;
      nn.relu(%9) /* ty=Tensor[(30, 30), float32] */
    }

The kernel layout fed into nn.conv2d is changed successfully, but there's no change for nn.dense.

This question might be dumb: what shall I add in Relay so that the dedicated layout-conversion pass can change the nn.dense kernel layout?

I see there are different conv2d implementations for different layout formats, but there's only one for nn.dense, and it doesn't use the kernel layout I'm expecting. Since I'm using BYOC, according to what you've described above, it seems those strategy-related implementations don't affect me anyway. So where and what shall I change to allow the nn.dense kernel layout to change? Thank you.





---
[Visit Topic](https://discuss.tvm.apache.org/t/best-way-to-deal-with-kernel-layout/9576/6) to respond.
