Posted to discuss-archive@tvm.apache.org by Jason via Apache TVM Discuss <no...@discuss.tvm.ai> on 2020/10/19 09:38:21 UTC

[Apache TVM Discuss] [Questions] How does TVM eliminate calls of conv weights layout_transform?


Hi everyone!   
I modified this sample (https://tvm.apache.org/docs/tutorials/frontend/from_pytorch.html) to add a desired layout of NHWC to a network saved from PyTorch (which uses NCHW):
```python
import tvm
from tvm import relay

# mod is the Relay module returned by relay.frontend.from_pytorch(...)
desired_layouts = {'qnn.conv2d': ['NHWC', 'HWIO'],
                   'nn.conv2d': ['NHWC', 'HWIO']}

# RemoveUnusedFunctions is used to clean up the graph.
seq = tvm.transform.Sequential([relay.transform.RemoveUnusedFunctions(),
                                relay.transform.ConvertLayout(desired_layouts)])
with tvm.transform.PassContext(opt_level=3):
    mod = seq(mod)
print(mod)
```

The dump of mod looks as expected: the input/output and each layer's weights come with a layout_transform, for example:
```txt
  %0 = layout_transform(%input0, src_layout="NCHW", dst_layout="NHWC") /* ty=Tensor[(1, 224, 224, 3), float32] */;
  %1 = layout_transform(%conv1.weight, src_layout="OIHW", dst_layout="HWIO") /* ty=Tensor[(7, 7, 3, 64), float32] */;
  %2 = nn.conv2d(%0, %1, strides=[2, 2], padding=[3, 3, 3, 3], channels=64, kernel_size=[7, 7], data_layout="NHWC", kernel_layout="HWIO") /* ty=Tensor[(1, 112, 112, 64), float32] */;
  %3 = nn.batch_norm(%2, %bn1.weight, %bn1.bias, %bn1.running_mean, %bn1.running_var, axis=3) /* ty=(Tensor[(1, 112, 112, 64), float32], Tensor[(64), float32], Tensor[(64), float32]) */;
  %4 = %3.0;
  %5 = nn.relu(%4) /* ty=Tensor[(1, 112, 112, 64), float32] */;
  %6 = nn.max_pool2d(%5, pool_size=[3, 3], strides=[2, 2], padding=[1, 1, 1, 1], layout="NHWC") /* ty=Tensor[(1, 56, 56, 64), float32] */;
  %7 = layout_transform(%layer1.0.conv1.weight, src_layout="OIHW", dst_layout="HWIO") /* ty=Tensor[(3, 3, 64, 64), float32] */;
  %8 = nn.conv2d(%6, %7, padding=[1, 1, 1, 1], channels=64, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWIO") /* ty=Tensor[(1, 56, 56, 64), float32] */;
```

However, when I checked the GPU trace, there are only two layout_transform kernels:
```text
[CUDA memcpy HtoD]
**fused_layout_transform_11_kernel0** [513]
fused_nn_conv2d_add_nn_relu_7_kernel0 [517]
fused_nn_max_pool2d_kernel0 [521]
fused_nn_conv2d_add_nn_relu_6_kernel0 [525]
fused_nn_conv2d_add_add_nn_relu_3_kernel0 [529]
fused_nn_conv2d_add_nn_relu_6_kernel0 [532]
fused_nn_conv2d_add_add_nn_relu_3_kernel0 [535]
fused_nn_conv2d_add_nn_relu_5_kernel0 [539]
fused_nn_conv2d_add_kernel0 [543]
fused_nn_conv2d_add_add_nn_relu_2_kernel0 [547]
fused_nn_conv2d_add_nn_relu_4_kernel0 [551]
fused_nn_conv2d_add_add_nn_relu_2_kernel0 [554]
fused_nn_conv2d_add_nn_relu_3_kernel0 [558]
fused_nn_conv2d_add_1_kernel0 [562]
fused_nn_conv2d_add_add_nn_relu_1_kernel0 [566]
fused_nn_conv2d_add_nn_relu_2_kernel0 [570]
fused_nn_conv2d_add_add_nn_relu_1_kernel0 [573]
fused_nn_conv2d_add_nn_relu_1_kernel0 [577]
fused_nn_conv2d_add_2_kernel0 [581]
fused_nn_conv2d_add_add_nn_relu_kernel0 [585]
fused_nn_conv2d_add_nn_relu_kernel0 [589]
fused_nn_conv2d_add_add_nn_relu_kernel0 [592]
fused_nn_adaptive_avg_pool2d_kernel0 [596]
**fused_layout_transform_reshape_squeeze_kernel0** [600]
fused_nn_dense_add_kernel0 [604]
[CUDA memcpy DtoH]
```

I'm quite confused here: does this mean all of these kernels accept NHWC input while using OIHW filter parameters? Or does TVM transform the weight parameters in advance? After all, there is no need to transform the filters more than once.

PS: I'm working on loading a PyTorch model (which is NCHW by default) into TVM and running it purely in NHWC format (including input/output/each conv layer), so I expect there should be no layout_transform calls at all. Am I right?





---
[Visit Topic](https://discuss.tvm.apache.org/t/how-dose-tvm-elimitate-calls-of-conv-weights-layout-transform/8208/1) to respond.


[Apache TVM Discuss] [Questions] How does TVM eliminate calls of conv weights layout_transform?

Posted by Jason via Apache TVM Discuss <no...@discuss.tvm.ai>.

After reading these two links:   

[https://discuss.tvm.apache.org/t/layout-conversion-pass/4009/15](https://discuss.tvm.apache.org/t/layout-conversion-pass/4009/15)    

[https://tvm.apache.org/docs/dev/convert_layout.html](https://tvm.apache.org/docs/dev/convert_layout.html)    

I'm still confused: for my case, running an NCHW PyTorch model in TVM with NHWC input/output/conv2d, should the final execution include no layout_transform calls at all, provided my code is set up correctly? Are any changes needed in TVM itself, like adding something to frontend/pytorch.py?

Thanks a lot!





---
[Visit Topic](https://discuss.tvm.apache.org/t/how-dose-tvm-elimitate-calls-of-conv-weights-layout-transform/8208/2) to respond.


[Apache TVM Discuss] [Questions] How does TVM eliminate calls of conv weights layout_transform?

Posted by Yao Wang via Apache TVM Discuss <no...@discuss.tvm.ai>.

Yes, weight layout transformations should be optimized away by constant folding.
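
For reference, a minimal sketch of how that folding could be triggered explicitly (assuming `mod` and `params` are the pair returned by `relay.frontend.from_pytorch`; the weights have to be bound into the module as constants first so `FoldConstant` can see them):

```python
import tvm
from tvm import relay
from tvm.relay.build_module import bind_params_by_name

# mod, params: as returned by relay.frontend.from_pytorch(...) (assumed names).
# Bind the weights into the function as constants so that FoldConstant can
# pre-compute their layout_transforms at compile time.
mod["main"] = bind_params_by_name(mod["main"], params)

desired_layouts = {'qnn.conv2d': ['NHWC', 'HWIO'],
                   'nn.conv2d': ['NHWC', 'HWIO']}
seq = tvm.transform.Sequential([relay.transform.RemoveUnusedFunctions(),
                                relay.transform.ConvertLayout(desired_layouts),
                                relay.transform.FoldConstant()])
with tvm.transform.PassContext(opt_level=3):
    mod = seq(mod)
```

With the weights folded, only the transforms at the graph boundary should remain, which matches the two kernels in the trace above. Note that `relay.build` at `opt_level=3` also runs `FoldConstant`, so the weight transforms may already be folded at build time even without adding the pass explicitly.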





---
[Visit Topic](https://discuss.tvm.apache.org/t/how-dose-tvm-elimitate-calls-of-conv-weights-layout-transform/8208/5) to respond.


[Apache TVM Discuss] [Questions] How does TVM eliminate calls of conv weights layout_transform?

Posted by Jason via Apache TVM Discuss <no...@discuss.tvm.ai>.

Thanks for the reply, Kevin! Those two layout transforms make sense. But the filter parameters are loaded from the .pth file as OIHW by default (relay/frontend/pytorch.py) and I set the desired layout to HWIO. Will these filter parameters be transformed in advance, or by a CUDA kernel on every inference?

I guess they should be converted only once, since these parameters are effectively constant for the inference process. Could someone give me a hint about which part of the code is responsible for this? In my own model's run I observed as many layout_transform calls as conv calls, so something is wrong there. In comparison, the GPU trace of the TVM ResNet sample shows only two layout transforms, which is expected.

I'm very much a beginner with the TVM code base; where should I start? Thanks a lot.





---
[Visit Topic](https://discuss.tvm.apache.org/t/how-dose-tvm-elimitate-calls-of-conv-weights-layout-transform/8208/4) to respond.


[Apache TVM Discuss] [Questions] How does TVM eliminate calls of conv weights layout_transform?

Posted by Yao Wang via Apache TVM Discuss <no...@discuss.tvm.ai>.

If the original model layout is NCHW and you convert to NHWC in TVM, at least two layout transformations are required: one at the beginning and one at the end.
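
A rough way to double-check this (a sketch only; `count_layout_transforms` is a hypothetical helper built on `relay.analysis.post_order_visit`) is to count the `layout_transform` calls left in the module after ConvertLayout and constant folding have run:

```python
import tvm
from tvm import relay

def count_layout_transforms(mod):
    """Count layout_transform calls remaining in the module's main function."""
    count = 0

    def visit(node):
        nonlocal count
        if (isinstance(node, relay.Call)
                and isinstance(node.op, tvm.ir.Op)
                and node.op.name == "layout_transform"):
            count += 1

    relay.analysis.post_order_visit(mod["main"], visit)
    return count

# Expected to report only the transforms at the graph boundary,
# not one transform per convolution weight.
print(count_layout_transforms(mod))
```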





---
[Visit Topic](https://discuss.tvm.apache.org/t/how-dose-tvm-elimitate-calls-of-conv-weights-layout-transform/8208/3) to respond.
