Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/10/21 01:27:40 UTC

[GitHub] [tvm] AndrewZhaoLuo opened a new pull request #9336: [WIP] Fix inconsistent kernel layout conventions for conv2d_transpose

AndrewZhaoLuo opened a new pull request #9336:
URL: https://github.com/apache/tvm/pull/9336


   The actual TOPI implementation assumes IOHW kernel inputs. For regular convs we are used to OIHW layouts, but some frameworks (e.g. PyTorch) use IOHW kernel layouts for conv2d_transpose.
   
   This has caused me a great deal of confusion. It seems the current code only works because the layout convention gets mixed up many times and the mistakes cancel out. This is my attempt at untangling the knot and making the convention consistent.
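
   For readers unfamiliar with the two conventions, here is a minimal NumPy sketch (illustrative only, not TVM's actual TOPI code) of how the same 4-D weight tensor is read under OIHW versus IOHW. The IOHW reading matches PyTorch's `ConvTranspose2d`, whose weight shape is (in_channels, out_channels/groups, kH, kW):

   ```python
   import numpy as np

   # A 4-D kernel tensor; which axis means what depends on the layout string.
   weight = np.zeros((8, 3, 5, 5))

   # conv2d convention (OIHW): axis 0 = output channels, axis 1 = input channels.
   o, i, kh, kw = weight.shape
   print(f"as OIHW: {i} -> {o} channels, {kh}x{kw} kernel")   # 3 -> 8 channels

   # conv2d_transpose convention (IOHW, as in PyTorch's ConvTranspose2d):
   # axis 0 = input channels, axis 1 = output channels.
   i, o, kh, kw = weight.shape
   print(f"as IOHW: {i} -> {o} channels, {kh}x{kw} kernel")   # 8 -> 3 channels
   ```

   The tensor itself is identical in both cases; only the interpretation of the first two axes flips, which is exactly why mixed-up conventions can silently cancel out.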


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] junrushao1994 merged pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

junrushao1994 merged pull request #9336:
URL: https://github.com/apache/tvm/pull/9336


   





[GitHub] [tvm] AndrewZhaoLuo edited a comment on pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

AndrewZhaoLuo edited a comment on pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#issuecomment-950006190


   I think in general, `I` is the dimension that changes when we expect the input channel count to change, and `O` is the dimension that changes when we expect the output channel count to change. In this operator that is not the case (and probably not in the 1D and 3D cases either, but I will change those if this gets good traction).
   
   I will say I can see why this convention is so weird.
   
   conv2d_transpose can be seen as the gradient of conv2d.
   
   So 'I' and 'O' represent the input and output channels of the imaginary conv2d function this is a gradient of. In conv2d_transpose itself, however, we don't really have that context, and flipping the meanings causes confusion.
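
   A shape-only sketch of this gradient relationship (hypothetical helper names; stride 1, no padding for simplicity): the transposed op consumes the forward conv's output channels, so the weight's O axis becomes the transposed op's input-channel axis, i.e. the same buffer reads as IOHW from the transposed op's point of view:

   ```python
   def conv2d_out_shape(data, weight_oihw):
       """Output shape of a stride-1, unpadded conv2d (NCHW data, OIHW weight)."""
       n, c_in, h, w = data
       o, i, kh, kw = weight_oihw
       assert c_in == i
       return (n, o, h - kh + 1, w - kw + 1)

   def conv2d_transpose_out_shape(grad, weight_iohw):
       """Output shape of the matching conv2d_transpose (NCHW data, IOHW weight).
       This is exactly the input shape of the conv2d whose gradient it is."""
       n, c_in, h, w = grad
       i, o, kh, kw = weight_iohw
       assert c_in == i
       return (n, o, h + kh - 1, w + kw - 1)

   # Forward conv: 3 -> 8 channels with an OIHW (8, 3, 3, 3) kernel.
   y = conv2d_out_shape((1, 3, 32, 32), (8, 3, 3, 3))       # (1, 8, 30, 30)
   # The gradient op reuses the same weight buffer, but now axis 0
   # (the old O) is its *input* channel axis, so the layout reads IOHW.
   x_grad = conv2d_transpose_out_shape(y, (8, 3, 3, 3))     # (1, 3, 32, 32)
   assert x_grad == (1, 3, 32, 32)
   ```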





[GitHub] [tvm] Lyken17 commented on pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

Lyken17 commented on pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#issuecomment-966243901


   @junrushao1994 Yes, definitely related. We both noticed that the `conv2d_transpose` weight layout should be `IOHW` rather than the `OIHW` used in `conv2d`. This is indeed confusing and leads to several wrong shape checks. I will go through the code in detail later today.
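
   To illustrate the kind of shape check at stake (a sketch with a hypothetical helper, not TVM's actual type-relation code): validating NCHW data against an IOHW weight means axis 0 of the weight must equal the data's channel count, and axis 1 gives the output channels. Under the old OIHW assumption the roles of the first two axes would be swapped, so mismatches could slip through or valid shapes be rejected:

   ```python
   def check_conv2d_transpose_shapes(data_shape, weight_shape, groups=1):
       """Validate NCHW data against an IOHW conv2d_transpose weight
       and return the resulting output channel count."""
       n, c_in, h, w = data_shape
       i, o, kh, kw = weight_shape          # IOHW, not OIHW
       if i != c_in:
           raise ValueError(
               f"weight axis 0 (I={i}) must equal data channels ({c_in})")
       return o * groups

   # 8 input channels map to 3 output channels with an (8, 3, 4, 4) kernel.
   assert check_conv2d_transpose_shapes((1, 8, 16, 16), (8, 3, 4, 4)) == 3
   ```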





[GitHub] [tvm] junrushao1994 commented on pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

junrushao1994 commented on pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#issuecomment-966728394


   Thank you all for the huge effort this week!





[GitHub] [tvm] comaniac commented on a change in pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

comaniac commented on a change in pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#discussion_r746837515



##########
File path: tests/python/contrib/test_vitis_ai/test_vitis_ai_codegen.py
##########
@@ -241,6 +240,13 @@ def test_upsampling(dpu_target):
     verify_codegen(mod, dpu_target=dpu_target)
 
 
+@pytest.mark.skip(
+    reason="I and O used to be mixed up in kernel layouts in TVM. "
+    "This is fixed, but vitis needs to adopt the new convention. "
+    "To change, simply remove this line: "
+    "https://github.com/Xilinx/pyxir/blob/bef661d6d77adcdbd2cf4163f2cf3a1d31d40406/"
+    "python/pyxir/frontend/tvm/relay_tools/relay_l2_convolution.py#L380"
+)

Review comment:
       cc @jtuyls 







[GitHub] [tvm] masahi commented on pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

masahi commented on pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#issuecomment-965931918


   Is the PR https://github.com/apache/tvm/pull/9465 related? 





[GitHub] [tvm] jtuyls commented on a change in pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

jtuyls commented on a change in pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#discussion_r747362113



##########
File path: tests/python/contrib/test_vitis_ai/test_vitis_ai_codegen.py
##########
@@ -241,6 +240,13 @@ def test_upsampling(dpu_target):
     verify_codegen(mod, dpu_target=dpu_target)
 
 
+@pytest.mark.skip(
+    reason="I and O used to be mixed up in kernel layouts in TVM. "
+    "This is fixed, but vitis needs to adopt the new convention. "
+    "To change, simply remove this line: "
+    "https://github.com/Xilinx/pyxir/blob/bef661d6d77adcdbd2cf4163f2cf3a1d31d40406/"
+    "python/pyxir/frontend/tvm/relay_tools/relay_l2_convolution.py#L380"
+)

Review comment:
       Thanks, this kernel layout switch makes a lot of sense to me. I will fix this in the Vitis flow.







[GitHub] [tvm] Lyken17 commented on a change in pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

Lyken17 commented on a change in pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#discussion_r747653115



##########
File path: src/relay/op/nn/convolution.h
##########
@@ -1044,7 +1044,7 @@ bool Conv2DTransposeRel(const Array<Type>& types, int num_inputs, const Attrs& a
   if (data == nullptr) return false;
 
   static const Layout kNCHW("NCHW");
-  static const Layout kOIHW("OIHW");
+  static const Layout kIOHW("IOHW");

Review comment:
       Does the layout variable affect later calculations, or is it just an alias for the layout?







[GitHub] [tvm] mbrookhart commented on pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

mbrookhart commented on pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#issuecomment-949999485


   @masahi Perhaps you or someone you know has a better sense of framework conventions here than I do?





[GitHub] [tvm] jtuyls commented on a change in pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

jtuyls commented on a change in pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#discussion_r747362113



##########
File path: tests/python/contrib/test_vitis_ai/test_vitis_ai_codegen.py
##########
@@ -241,6 +240,13 @@ def test_upsampling(dpu_target):
     verify_codegen(mod, dpu_target=dpu_target)
 
 
+@pytest.mark.skip(
+    reason="I and O used to be mixed up in kernel layouts in TVM. "
+    "This is fixed, but vitis needs to adopt the new convention. "
+    "To change, simply remove this line: "
+    "https://github.com/Xilinx/pyxir/blob/bef661d6d77adcdbd2cf4163f2cf3a1d31d40406/"
+    "python/pyxir/frontend/tvm/relay_tools/relay_l2_convolution.py#L380"
+)

Review comment:
       Thanks, this kernel layout switch makes a lot of sense to me. I will fix this in the Vitis flow and then we can remove skipping the test.







[GitHub] [tvm] mbrookhart edited a comment on pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

mbrookhart edited a comment on pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#issuecomment-949999485


   @masahi Perhaps you or someone you know has a better sense of framework conventions here than I do? I'm not sure if there's a common representation here.





[GitHub] [tvm] AndrewZhaoLuo commented on pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

AndrewZhaoLuo commented on pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#issuecomment-950006190


   I think in general, `I` is the dimension that changes when we expect the input channel count to change, and `O` is the dimension that changes when we expect the output channel count to change. In this operator that is not the case (and probably not in the 1D and 3D cases either, but I will change those if this gets good traction).
   
   I will say I can see why this convention is so weird.
   
   conv2d_transpose can be seen as the gradient of conv2d.
   
   So 'I' and 'O' represent the input and output channels of the original conv2d. In conv2d_transpose itself, however, we don't really have that context, and flipping the meanings causes confusion.





[GitHub] [tvm] junrushao1994 commented on pull request #9336: [Topi][Op][PyTorch][Vitas] Fix inconsistent kernel layout conventions for conv2d_transpose

junrushao1994 commented on pull request #9336:
URL: https://github.com/apache/tvm/pull/9336#issuecomment-966016854


   @Lyken17 is this related to your fix? Would be great if you could take a look :-) Thanks a lot!

