You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/03/04 22:36:21 UTC

[GitHub] [tvm-rfcs] tkonolige opened a new pull request #59: Remove tophub from being the default configuration when compiling

tkonolige opened a new pull request #59:
URL: https://github.com/apache/tvm-rfcs/pull/59


   Prompted by https://github.com/apache/tvm/pull/10474, this RFC proposes removing tophub from the default compilation workflow. If you rely on tophub being the implicit default, please comment on this RFC.
   
   @masahi @comaniac 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm-rfcs] tkonolige commented on pull request #59: Remove tophub from being the default configuration when compiling

Posted by GitBox <gi...@apache.org>.
tkonolige commented on pull request #59:
URL: https://github.com/apache/tvm-rfcs/pull/59#issuecomment-1062030685


   @comaniac rtx-3070


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm-rfcs] comaniac commented on pull request #59: Remove tophub from being the default configuration when compiling

Posted by GitBox <gi...@apache.org>.
comaniac commented on pull request #59:
URL: https://github.com/apache/tvm-rfcs/pull/59#issuecomment-1061322559


   > I ran with and without tophub on a selection of models in `tvm.relay.testing`:
   
   Thanks for the experiments. What's your target device?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm-rfcs] tkonolige commented on a change in pull request #59: Remove tophub from being the default configuration when compiling

Posted by GitBox <gi...@apache.org>.
tkonolige commented on a change in pull request #59:
URL: https://github.com/apache/tvm-rfcs/pull/59#discussion_r821164652



##########
File path: rfcs/0059-remove-tophub-from-default.md
##########
@@ -0,0 +1,57 @@
+- Feature Name: remove_tophub_from_default
+- Start Date: 2021-03-04
+- RFC PR: [apache/tvm-rfcs#0059](https://github.com/apache/tvm-rfcs/pull/0059)
+- GitHub Issue: [apache/tvm#10474](https://github.com/apache/tvm/pull/10474)
+
+# Summary
+[summary]: #summary
+
+Remove tophub from being the default autotvm configuration when compiling with
+the VM or graph executor. This bring thes VM and graph executor compilation
+workflow in line with all other workflows which do no use tophub by default.
+
+https://github.com/apache/tvm/pull/10474 provides a short discussion of the
+tradeoffs/issues in applying this change.
+
+# Motivation
+[motivation]: #motivation
+
+- Tophub is not maintained and is out of date in some cases. Users should
+  explicitly use it so there is no surprising behavior in using out of date
+  schedules.
+- Tophub is implicitly used only with the VM and graph executor compilation
+  flows. Other flows (like `VMCompiler.optimize`) will return different
+  results because they do not implicitly use tophub.
+- Many default schedules exist in TVM, but they are not used because the
+  tophub schedules override them. Either they should be removed or used.
+- Compilation implicitly depends on scheduling choices located outside of the
+  TVM codebase. This implicit behavior is confusing when debugging and
+  requires us to maintain a consistent state between two different repos.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+We may see a performance degradation in models compiled without tuning as the
+tophub schedules may be more optimized then the fallback/untuned schedules. If
+this is an issue, we can port the poorly performing schedules from tophub to
+the fallback (i.e. default) configuration for the op.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+Alternative choices:
+
+- Use tophub as the default context when no other context has been specified.
+  This will make `VMCompiler.optimize` and `VMCompiler.lower` consistent in
+  their results.
+
+- Add tophub to `VMCompiler.optimize`. This doesn't solve the underlying
+  problem. I expect us see more inconsistent results in other parts of the
+  codebase where tophub is not used.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+Will there be any actual performance impact in removing tophub from being the default?

Review comment:
       Do the default schedules not work for the specific GPU on CI or do they not work for any GPU? My guess is they don't work for any GPU because they have never been tested. And they haven't been tested because using tophub by default causes the default schedules to not be used. To me it seems like this is a good reason to bring to not use tophub by default.
   
   Branching could be difficult. How many different targets are used in the tophub schedules?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm-rfcs] comaniac commented on a change in pull request #59: Remove tophub from being the default configuration when compiling

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #59:
URL: https://github.com/apache/tvm-rfcs/pull/59#discussion_r820002751



##########
File path: rfcs/0059-remove-tophub-from-default.md
##########
@@ -0,0 +1,57 @@
+- Feature Name: remove_tophub_from_default
+- Start Date: 2021-03-04
+- RFC PR: [apache/tvm-rfcs#0059](https://github.com/apache/tvm-rfcs/pull/0059)
+- GitHub Issue: [apache/tvm#10474](https://github.com/apache/tvm/pull/10474)
+
+# Summary
+[summary]: #summary
+
+Remove tophub from being the default autotvm configuration when compiling with
+the VM or graph executor. This bring thes VM and graph executor compilation
+workflow in line with all other workflows which do no use tophub by default.
+
+https://github.com/apache/tvm/pull/10474 provides a short discussion of the
+tradeoffs/issues in applying this change.
+
+# Motivation
+[motivation]: #motivation
+
+- Tophub is not maintained and is out of date in some cases. Users should
+  explicitly use it so there is no surprising behavior in using out of date
+  schedules.
+- Tophub is implicitly used only with the VM and graph executor compilation
+  flows. Other flows (like `VMCompiler.optimize`) will return different
+  results because they do not implicitly use tophub.
+- Many default schedules exist in TVM, but they are not used because the
+  tophub schedules override them. Either they should be removed or used.
+- Compilation implicitly depends on scheduling choices located outside of the
+  TVM codebase. This implicit behavior is confusing when debugging and
+  requires us to maintain a consistent state between two different repos.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+We may see a performance degradation in models compiled without tuning as the
+tophub schedules may be more optimized then the fallback/untuned schedules. If
+this is an issue, we can port the poorly performing schedules from tophub to
+the fallback (i.e. default) configuration for the op.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+Alternative choices:
+
+- Use tophub as the default context when no other context has been specified.
+  This will make `VMCompiler.optimize` and `VMCompiler.lower` consistent in
+  their results.
+
+- Add tophub to `VMCompiler.optimize`. This doesn't solve the underlying
+  problem. I expect us see more inconsistent results in other parts of the
+  codebase where tophub is not used.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+Will there be any actual performance impact in removing tophub from being the default?

Review comment:
       Based on the CI, at least we know that the workloads in MobileNet on GPU require TopHub schedules to be compiled. The default schedules are invalid for some GPUs (e.g., the one used in CI) and will result in compilation failure. Given this fact, we cannot guarantee default schedules are valid for all workloads on GPU, and TopHub could more or less help complement this issue. 
   
   Thus, if we go with this RFC, the concern I may have is how should we port the TopHub schedule to be the default one? I could imagine that we may end up having a decision-tree like logic for each op to dispatch the desired default schedule. This may make TOPI even more ad-hoc and complicate. I would definitely like to hear others feedback about this.
   

##########
File path: rfcs/0059-remove-tophub-from-default.md
##########
@@ -0,0 +1,57 @@
+- Feature Name: remove_tophub_from_default
+- Start Date: 2021-03-04
+- RFC PR: [apache/tvm-rfcs#0059](https://github.com/apache/tvm-rfcs/pull/0059)
+- GitHub Issue: [apache/tvm#10474](https://github.com/apache/tvm/pull/10474)
+
+# Summary
+[summary]: #summary
+
+Remove tophub from being the default autotvm configuration when compiling with
+the VM or graph executor. This bring thes VM and graph executor compilation
+workflow in line with all other workflows which do no use tophub by default.
+
+https://github.com/apache/tvm/pull/10474 provides a short discussion of the
+tradeoffs/issues in applying this change.
+
+# Motivation
+[motivation]: #motivation
+
+- Tophub is not maintained and is out of date in some cases. Users should
+  explicitly use it so there is no surprising behavior in using out of date
+  schedules.
+- Tophub is implicitly used only with the VM and graph executor compilation
+  flows. Other flows (like `VMCompiler.optimize`) will return different
+  results because they do not implicitly use tophub.
+- Many default schedules exist in TVM, but they are not used because the
+  tophub schedules override them. Either they should be removed or used.
+- Compilation implicitly depends on scheduling choices located outside of the
+  TVM codebase. This implicit behavior is confusing when debugging and
+  requires us to maintain a consistent state between two different repos.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+We may see a performance degradation in models compiled without tuning as the
+tophub schedules may be more optimized then the fallback/untuned schedules. If
+this is an issue, we can port the poorly performing schedules from tophub to
+the fallback (i.e. default) configuration for the op.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+Alternative choices:
+
+- Use tophub as the default context when no other context has been specified.
+  This will make `VMCompiler.optimize` and `VMCompiler.lower` consistent in
+  their results.
+
+- Add tophub to `VMCompiler.optimize`. This doesn't solve the underlying
+  problem. I expect us see more inconsistent results in other parts of the
+  codebase where tophub is not used.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+Will there be any actual performance impact in removing tophub from being the default?
+
+Should we provide a deprecation notice, or a change notice, or no notice at all?

Review comment:
       If we go with this RFC, I vote for a WARNING in graph executor saying that TopHub is no longer applied automatically, please xxx (provide a link or just describe how to bring TopHub back if users want). We could further discuss how long should the warning be. Formally, it should be there for at least one release cycle (0.9 -> 1.0), but we could make it shorter if everyone agrees.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm-rfcs] tkonolige commented on pull request #59: Remove tophub from being the default configuration when compiling

Posted by GitBox <gi...@apache.org>.
tkonolige commented on pull request #59:
URL: https://github.com/apache/tvm-rfcs/pull/59#issuecomment-1061214817


   I ran with and without tophub on a selection of models in `tvm.relay.testing`:
   
   ```
   resnet
   With Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      0.5029       0.1609       1.1974       0.1431       0.4449
   
   Without Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      0.0885       0.0842       0.1020       0.0823       0.0072
   
   densenet
   With Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      1.7610       1.9728       1.9733       0.9131       0.4239
   
   Without Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      4.3721       5.5795       5.5805       0.5085       1.9678
   
   inception_v3
   With Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      4.5442       6.2537       6.2550       0.3917       2.3228
   
   Without Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
     43.6982      64.7263      64.9195       0.3586      26.9297
   
   mlp
   With Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      0.0184       0.0185       0.0197       0.0172       0.0010
   
   Without Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      0.0154       0.0152       0.0167       0.0148       0.0007
   
   mobilenet
   With Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      0.0872       0.0832       0.1042       0.0816       0.0086
   
   Without Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      0.0844       0.0840       0.0896       0.0816       0.0028
   
   squeezenet
   With Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      0.1058       0.1051       0.1103       0.1033       0.0024
   
   Without Tophub
   Execution time summary:
    mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
      0.1056       0.1051       0.1084       0.1040       0.0016
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm-rfcs] comaniac commented on a change in pull request #59: Remove tophub from being the default configuration when compiling

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #59:
URL: https://github.com/apache/tvm-rfcs/pull/59#discussion_r821253221



##########
File path: rfcs/0059-remove-tophub-from-default.md
##########
@@ -0,0 +1,57 @@
+- Feature Name: remove_tophub_from_default
+- Start Date: 2021-03-04
+- RFC PR: [apache/tvm-rfcs#0059](https://github.com/apache/tvm-rfcs/pull/0059)
+- GitHub Issue: [apache/tvm#10474](https://github.com/apache/tvm/pull/10474)
+
+# Summary
+[summary]: #summary
+
+Remove tophub from being the default autotvm configuration when compiling with
+the VM or graph executor. This bring thes VM and graph executor compilation
+workflow in line with all other workflows which do no use tophub by default.
+
+https://github.com/apache/tvm/pull/10474 provides a short discussion of the
+tradeoffs/issues in applying this change.
+
+# Motivation
+[motivation]: #motivation
+
+- Tophub is not maintained and is out of date in some cases. Users should
+  explicitly use it so there is no surprising behavior in using out of date
+  schedules.
+- Tophub is implicitly used only with the VM and graph executor compilation
+  flows. Other flows (like `VMCompiler.optimize`) will return different
+  results because they do not implicitly use tophub.
+- Many default schedules exist in TVM, but they are not used because the
+  tophub schedules override them. Either they should be removed or used.
+- Compilation implicitly depends on scheduling choices located outside of the
+  TVM codebase. This implicit behavior is confusing when debugging and
+  requires us to maintain a consistent state between two different repos.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+We may see a performance degradation in models compiled without tuning as the
+tophub schedules may be more optimized then the fallback/untuned schedules. If
+this is an issue, we can port the poorly performing schedules from tophub to
+the fallback (i.e. default) configuration for the op.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+Alternative choices:
+
+- Use tophub as the default context when no other context has been specified.
+  This will make `VMCompiler.optimize` and `VMCompiler.lower` consistent in
+  their results.
+
+- Add tophub to `VMCompiler.optimize`. This doesn't solve the underlying
+  problem. I expect us see more inconsistent results in other parts of the
+  codebase where tophub is not used.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+Will there be any actual performance impact in removing tophub from being the default?

Review comment:
       I might be wrong, but my impression is the default schedule must work for at least one GPU by the time it was committed, because there's no point to put an invalid schedule anyways (but maybe that valid schedule is no longer valid...)
   
   I also feel that branching is difficult. IIRC, TopHub CUDA should include 3-4 GPUs, but once we rely on default schedules, any future "default" schedules have to be in TOPI as well.
   
   In short, I feel the easiest way for now is 1) removing the tophub context as you suggested, but 2) making sure default schedules could pass CIs. Meanwhile, 3) adding a deprecate warning.
   
   cc @tqchen @masahi @junrushao1994 @areusch 
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org