Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/01/31 21:39:49 UTC

[GitHub] [tvm] lazycal opened a new pull request #10118: Fix LayoutRewriter

lazycal opened a new pull request #10118:
URL: https://github.com/apache/tvm/pull/10118


   Thanks for contributing to TVM! Please refer to the contribution guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from [Reviewers](https://github.com/apache/incubator-tvm/blob/master/CONTRIBUTORS.md#reviewers) by @-ing them in the pull request thread.
   
   Fix #10109 
   Originally, during the alter layout pass, each node (`LayoutAlternatedExprNode`) is tied to a single specific layout stored in `old_layout` (or `new_layout` after altering the layout, https://github.com/apache/tvm/blob/6a274af9cb4e7baf6d9dd06f0fb38bea5ac3154a/src/relay/transforms/transform_layout.h#L175-L176). This layout can be determined either by one of its consumers or by its producer. Previously they were implicitly assumed to agree with each other, but this imposes restrictions on the graph, and some reasonable graphs violate the assumption (see #10109 for a concrete example). This assumption is stronger than the pass actually needs. All we need is the "delta" information of how each tensor's layout is transformed; we don't need to know what the layouts are before or after the change.
   
   This PR removes that assumption. The specific layouts stored in `old_layout` and `new_layout` no longer matter; they only convey that the tensor was changed by the transformation `old_layout`->`new_layout`. With this change, the alter layout pass can be understood as follows (at least IMO):
   
   1. Visit each node in post-DFS order.
   2. For each node, call its alter layout function. Its `InferCorrectLayout` function tells us the "expected" input layouts from the consumer's perspective (e.g., conv_nchw will say the 1st input is NCHW and the 2nd is OIHW), on both the graph before and the graph after the rewrite, denoted `old_in2` and `new_in2`. When they differ, it means we need to apply the transformation `old_in2`->`new_in2` to the **original** inputs for the op to work properly (https://github.com/lazycal/tvm/blob/42364531dea0fe72e1ffc80aba89ca04e50b2a67/src/relay/transforms/transform_layout.h#L384).
   3. Note that the transformation above assumes the original inputs, so before applying it we need to transform each input back to its original layout (https://github.com/lazycal/tvm/blob/42364531dea0fe72e1ffc80aba89ca04e50b2a67/src/relay/transforms/transform_layout.h#L383).
   4. The two transformations can be fused by aligning the layout letters, e.g., `layout_transform(CN->NC) + layout_transform(NW->NW8w)` is equivalent to `layout_transform(CN->NC8c)` (see the sketch after this list). The previous assumption implies that the layouts are already aligned, so only one transformation was inserted (https://github.com/lazycal/tvm/blob/42364531dea0fe72e1ffc80aba89ca04e50b2a67/src/relay/transforms/transform_layout.h#L381). I kept this case as well.
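   To make this concrete, here is a minimal numeric check of that equivalence (the shapes are made up; it only uses the public `relay.layout_transform` API):

```python
import numpy as np
import tvm
from tvm import relay

# Tensor whose layout is "CN": C=16, N=4 (shapes made up for the example).
x = relay.var("x", shape=(16, 4), dtype="float32")
data = np.random.rand(16, 4).astype("float32")

# Two-step form: undo the producer's change (CN -> NC), then apply the consumer's
# expected change (NW -> NW8w). The axis letters differ, but the structures match.
two_step = relay.layout_transform(relay.layout_transform(x, "CN", "NC"), "NW", "NW8w")

# Fused form obtained by aligning the letters: CN -> NC8c.
fused = relay.layout_transform(x, "CN", "NC8c")

def run(expr):
    mod = tvm.IRModule.from_expr(relay.Function([x], expr))
    return relay.create_executor("graph", mod=mod).evaluate()(data).numpy()

# Both produce the same (4, 2, 8) tensor.
np.testing.assert_allclose(run(two_step), run(fused))
```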
   
   The only thing that I didn't fully check is the `InferCorrectLayout` functions. I am not fully aware of their formal semantics, but I assumed that they should return `old_in2` and `new_in2` that are transformable into each other. Some ops may not conform to this in order to work around the previous restriction. E.g., I found that the dense op has this special hack (https://github.com/apache/tvm/compare/main...lazycal:fix-layout-pass?expand=1#diff-b1f7105acbdb593d30dc3f1506f8f226d8663164bf0e46702f8b050b056604f6L213-L216), so I just removed it.
   
   
   ## Misc
   Another potential bug I found is that the AlterOpLayout pass seems to be overloaded. People often use it to drop in a whole subgraph to replace a single node (e.g., https://github.com/apache/tvm/blob/d1dafbd6d050ddcd673f567c5ff720a6b6419cfe/python/tvm/topi/x86/conv2d_alter_op.py#L171-L181), while the current code logic in `transform_layout.h` assumes a node is only replaced by another node. For example, in https://github.com/apache/tvm/blob/6a274af9cb4e7baf6d9dd06f0fb38bea5ac3154a/src/relay/transforms/transform_layout.h#L388
   the transformation is performed on the args of the call object, but I believe in the subgraph case it should be applied to the args of the subgraph.
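   To illustrate the subgraph case, here is a hypothetical `FTVMAlterOpLayout` hook (the `register_alter_op_layout` API and the ops are TVM's, but the registration level and the particular rewrite are made up for illustration):

```python
from tvm import relay

# Hypothetical alter-op hook: it replaces a single NCHW conv2d call with a small
# subgraph (layout_transforms around an NHWC conv), rather than with one node.
@relay.op.register_alter_op_layout("nn.conv2d", level=114)
def alter_conv2d(attrs, inputs, tinfos, out_type):
    if attrs.data_layout != "NCHW" or attrs.kernel_layout != "OIHW":
        return None  # leave other layouts alone
    data, weight = inputs
    new_attrs = dict(attrs)
    new_attrs["data_layout"] = "NHWC"
    new_attrs["kernel_layout"] = "HWIO"
    data = relay.layout_transform(data, "NCHW", "NHWC")
    weight = relay.layout_transform(weight, "OIHW", "HWIO")
    out = relay.nn.conv2d(data, weight, **new_attrs)
    # The returned expression is a subgraph, not a single call node: its leaf
    # arguments are layout_transforms of the original args, which is exactly the
    # case the current logic in transform_layout.h does not account for.
    return relay.layout_transform(out, "NHWC", "NCHW")
```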





[GitHub] [tvm] masahi edited a comment on pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
masahi edited a comment on pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#issuecomment-1026605253


   @lazycal Thank you very much for working on this. I need to review our current layout rewrite implementation to understand this change, so please wait for a few days or longer until I merge this (but I will, definitely). 





[GitHub] [tvm] masahi merged pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
masahi merged pull request #10118:
URL: https://github.com/apache/tvm/pull/10118


   





[GitHub] [tvm] masahi commented on a change in pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
masahi commented on a change in pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#discussion_r796509866



##########
File path: src/relay/transforms/transform_layout.h
##########
@@ -371,6 +375,17 @@ Expr LayoutRewriter(const Call& ref_call, const Array<Expr>& new_args, const Obj
   ICHECK_EQ(new_in.size(), new_in2.size())
       << "The number of input nodes should keep the same during alter_op_layout";
 
+  auto transform_layout = [&memorizer](Expr arg_item, const Layout& old_in, const Layout& old_in2,
+                                       const Layout& new_in, const Layout& new_in2) {
+    if (old_in2.Equals(old_in)) {  // the two transforms can be fused to one
+      arg_item = memorizer.Transform(arg_item, new_in, new_in2);
+    } else {
+      if (old_in.defined()) arg_item = memorizer.Transform(arg_item, new_in, old_in);
+      arg_item = memorizer.Transform(arg_item, old_in2, new_in2);

Review comment:
       In this code path, `old_in != old_in2`. So after the transform at L383, how is it possible that we can apply another transform with `old_in2` as the src layout?







[GitHub] [tvm] lazycal commented on a change in pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
lazycal commented on a change in pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#discussion_r796811838



##########
File path: src/relay/transforms/transform_layout.h
##########
@@ -371,6 +375,17 @@ Expr LayoutRewriter(const Call& ref_call, const Array<Expr>& new_args, const Obj
   ICHECK_EQ(new_in.size(), new_in2.size())
       << "The number of input nodes should keep the same during alter_op_layout";
 
+  auto transform_layout = [&memorizer](Expr arg_item, const Layout& old_in, const Layout& old_in2,
+                                       const Layout& new_in, const Layout& new_in2) {
+    if (old_in2.Equals(old_in)) {  // the two transforms can be fused to one
+      arg_item = memorizer.Transform(arg_item, new_in, new_in2);
+    } else {
+      if (old_in.defined()) arg_item = memorizer.Transform(arg_item, new_in, old_in);
+      arg_item = memorizer.Transform(arg_item, old_in2, new_in2);

Review comment:
       Good catch. I thought that `old_in` and `old_in2` should be isomorphic, i.e., they have the same structure (rank, subordinate factors, etc.) and differ only in how each axis is named (e.g., `NW` vs `NC`), given that they describe the same tensor's layout. In that case, the transform can be applied. A concrete example: with `new_in=NW8w`, `old_in=NW`, `old_in2=NC`, `new_in2=NC16c`, we apply `NW8w->NW` and then `NC->NC16c`, which is valid since layout_transform works as long as the layout structure matches the tensor shape. The net outcome is equivalent to a single transform `NC8c->NC16c`.
   
   However, I just hit a bizarre case in BroadcastInferLayout that does not give isomorphic layouts:
   https://github.com/apache/tvm/blob/6a274af9cb4e7baf6d9dd06f0fb38bea5ac3154a/src/relay/transforms/infer_layout_utils.h#L224-L234
   This code path may expand the tensor's layout and assign `old_in2` something with a larger rank. For example, if the op is `a+b`, and originally `a`'s layout is `NCHW` and `b`'s is `W`, then the consumer (the broadcast node) will infer `old_in2=NCHW` for `b`. Now `W` and `NCHW` are not really isomorphic... I'm working on a fix, but it does sound pretty weird to me that a tensor with rank 1 is inferred to have a rank-4 layout...
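   For concreteness, this is the kind of graph I mean (dimensions made up); the snippet only builds the add, it does not run the pass by itself:

```python
from tvm import relay

# `a` is rank-4 in NCHW; `b` is rank-1 ("W") and broadcasts over the last axis.
# BroadcastInferLayout can report a rank-4 layout (NCHW) as `old_in2` for `b`,
# even though `b`'s own layout has rank 1, so the two are no longer isomorphic.
a = relay.var("a", shape=(1, 32, 28, 28), dtype="float32")
b = relay.var("b", shape=(28,), dtype="float32")
out = relay.add(a, b)
```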










[GitHub] [tvm] masahi commented on pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#issuecomment-1026605253


   @lazycal Thank you very much for working on this. I need to review our current layout rewrite implementation to understand this change, so please wait a few days. 





[GitHub] [tvm] masahi commented on pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#issuecomment-1027266790


   > Same here. I have 2 proposals:
   > 
   >     * replace `_in` with `_in_input` and `_in2` with `_in_infer`.
   > 
   >     * Or more explicit, `_in` with `_in_producer` and `_in2` with `_in_consumer`.
   > 
   > 
   > Not sure which one you would prefer? Or any other ideas?
   
   I prefer the second one. I thought `infer`-ed or not was for the `old` vs `new` distinction? I'd prefer `before` vs `after` over `old` vs `new`, but a name like `before_in_producer` doesn't sound good... Actually, I don't understand why we need a 2 x 2 combination of layouts.





[GitHub] [tvm] masahi commented on a change in pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
masahi commented on a change in pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#discussion_r796974298



##########
File path: src/relay/transforms/transform_layout.h
##########
@@ -371,6 +375,17 @@ Expr LayoutRewriter(const Call& ref_call, const Array<Expr>& new_args, const Obj
   ICHECK_EQ(new_in.size(), new_in2.size())
       << "The number of input nodes should keep the same during alter_op_layout";
 
+  auto transform_layout = [&memorizer](Expr arg_item, const Layout& old_in, const Layout& old_in2,
+                                       const Layout& new_in, const Layout& new_in2) {
+    if (old_in2.Equals(old_in)) {  // the two transforms can be fused to one
+      arg_item = memorizer.Transform(arg_item, new_in, new_in2);
+    } else {
+      if (old_in.defined()) arg_item = memorizer.Transform(arg_item, new_in, old_in);
+      arg_item = memorizer.Transform(arg_item, old_in2, new_in2);

Review comment:
       FYI many of us are aware of the messy situation our layout convert pass is in. I believe an entire rewrite is desired at some point. I'd love to have your thoughts on this in the discuss forum etc.







[GitHub] [tvm] lazycal commented on a change in pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
lazycal commented on a change in pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#discussion_r796951191



##########
File path: src/relay/op/nn/nn.cc
##########
@@ -283,14 +279,6 @@ InferCorrectLayoutOutput DensePackInferCorrectLayout(const Attrs& attrs,
                                                      const Array<tvm::relay::Type>& old_in_types) {
   auto params = attrs.as<DensePackAttrs>();
   ICHECK(params);
-  // Respect input layout, if explicitly specified (for example, "NW").
-  // However, a packed layout such as "NC8c" is not supported by dense_pack op. For such cases,
-  // we insert a layout transform "NC8c" -> "NC".
-  // We do not expect to get a packed layout like "NW8w", which is not compatitble with "NC",
-  // since packing is always done on the "C" axis.
-  if (new_in_layouts.size() > 0 && new_in_layouts[0].defined() && new_in_layouts[0].ndim() == 2) {
-    return InferCorrectLayoutOutput({new_in_layouts[0], params->weight_layout}, {"NC"}, attrs);
-  }
   return InferCorrectLayoutOutput({"NC", params->weight_layout}, {"NC"}, attrs);
 }

Review comment:
       I skimmed through your change in https://github.com/apache/tvm/commit/66ac4705aae9bec92047920c8a9273693cd48c44, and I think it is probably still needed. Mainly because when FTVMAlterOpLayout is defined but FInferCorrectLayout is not, the current code logic in https://github.com/apache/tvm/blob/6a274af9cb4e7baf6d9dd06f0fb38bea5ac3154a/src/relay/transforms/transform_layout.h#L310-L320
   assumes the op accepts any layout. In the case of Dense, it only accepts a 2D `data` tensor, so when the producer of the `data` tensor changes its layout, we need an additional layout transform to convert it back, which is not handled in L310-L320.







[GitHub] [tvm] lazycal commented on pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
lazycal commented on pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#issuecomment-1027277470


   So the 2x2 actually encodes two transforms: one is the producer's change, and the other is the consumer's (expected) change. As a somewhat contrived example, before the pass: `(A) conv_NCHW -> (B) conv_NCHW`;
   after the pass: `(A') conv_NCHW2c -> (B') conv_NCHW4c`.
   Suppose we are rewriting node B. Then the producer's change is NCHW->NCHW2c, and the consumer's expected change is NCHW->NCHW4c. There are 4 layouts, hence the 2x2 combination.
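   A hypothetical sketch with made-up shapes: when rewriting B, since `old_in == old_in2 == NCHW`, the pass can emit a single fused transform on A's new output:

```python
from tvm import relay

# A's output after its own rewrite: NCHW packed as NCHW2c (C = 16 split into 8 x 2).
a_out = relay.var("a_out", shape=(1, 8, 14, 14, 2), dtype="float32")

# old_in == old_in2 == NCHW, so the "back-and-forth" pair of transforms fuses into a
# single transform new_in -> new_in2, i.e. NCHW2c -> NCHW4c, giving shape (1, 4, 14, 14, 4).
b_in = relay.layout_transform(a_out, "NCHW2c", "NCHW4c")
```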





[GitHub] [tvm] masahi commented on a change in pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
masahi commented on a change in pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#discussion_r796378270



##########
File path: src/relay/op/nn/nn.cc
##########
@@ -283,14 +279,6 @@ InferCorrectLayoutOutput DensePackInferCorrectLayout(const Attrs& attrs,
                                                      const Array<tvm::relay::Type>& old_in_types) {
   auto params = attrs.as<DensePackAttrs>();
   ICHECK(params);
-  // Respect input layout, if explicitly specified (for example, "NW").
-  // However, a packed layout such as "NC8c" is not supported by dense_pack op. For such cases,
-  // we insert a layout transform "NC8c" -> "NC".
-  // We do not expect to get a packed layout like "NW8w", which is not compatitble with "NC",
-  // since packing is always done on the "C" axis.
-  if (new_in_layouts.size() > 0 && new_in_layouts[0].defined() && new_in_layouts[0].ndim() == 2) {
-    return InferCorrectLayoutOutput({new_in_layouts[0], params->weight_layout}, {"NC"}, attrs);
-  }
   return InferCorrectLayoutOutput({"NC", params->weight_layout}, {"NC"}, attrs);
 }

Review comment:
       Do we still need `DenseInferCorrectLayout` and `DensePackInferCorrectLayout` now? I added these functions to work around alter layout issues, but they might not be necessary anymore. Can you try removing them and see what happens?







[GitHub] [tvm] masahi commented on pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
masahi commented on pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#issuecomment-1026738883


   I really hate the `new_` vs `old_` and `_in` vs `_in2` naming convention in the existing code; it's impossible to understand. You're more than welcome to clean them up :)





[GitHub] [tvm] lazycal commented on pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
lazycal commented on pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#issuecomment-1027087466


   > @lazycal Thank you very much for working on this. I need to review our current layout rewrite implementation to understand this change, so please wait for a few days or longer until I merge this (but I will, definitely).
   
   Thanks! I am also reviewing the InferCorrectLayout functions to see if there is anything broken. The broadcast one is one example I just hit.
   
   > I really hate `new_` vs `old_` and `_in vs _in2` naming convention in the existing code, it's impossible to understand. More than welcome to clean them up :)
   
   Same here. I have 2 proposals: 
   - replace `_in` with `_in_input` and `_in2` with `_in_infer`.
   - Or more explicit, `_in` with `_in_producer` and `_in2` with `_in_consumer`.
   
   Not sure which one you would prefer? Or any other ideas?





[GitHub] [tvm] lazycal commented on a change in pull request #10118: Fix LayoutRewriter

Posted by GitBox <gi...@apache.org>.
lazycal commented on a change in pull request #10118:
URL: https://github.com/apache/tvm/pull/10118#discussion_r796979604



##########
File path: src/relay/transforms/transform_layout.h
##########
@@ -371,6 +375,17 @@ Expr LayoutRewriter(const Call& ref_call, const Array<Expr>& new_args, const Obj
   ICHECK_EQ(new_in.size(), new_in2.size())
       << "The number of input nodes should keep the same during alter_op_layout";
 
+  auto transform_layout = [&memorizer](Expr arg_item, const Layout& old_in, const Layout& old_in2,
+                                       const Layout& new_in, const Layout& new_in2) {
+    if (old_in2.Equals(old_in)) {  // the two transforms can be fused to one
+      arg_item = memorizer.Transform(arg_item, new_in, new_in2);
+    } else {
+      if (old_in.defined()) arg_item = memorizer.Transform(arg_item, new_in, old_in);
+      arg_item = memorizer.Transform(arg_item, old_in2, new_in2);

Review comment:
       Thanks for including me in your discussion. I agree that an entire rewrite is needed. Ping me in the discuss forum when you decide to do it someday :-), I'd love to participate.






