You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/07/15 22:05:27 UTC

[GitHub] [incubator-tvm] zhiics opened a new pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

zhiics opened a new pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068


   This PR enables the invocation of accelerator specific optimization flow to make BYOC consistent to TVM compilation flow. For example, we separate the optimization and codegen. Optimization is now invoked when a function is partitioned, and codegen only focuses on generating the needed runtime module. This is needed by libraries such as ARM compute library in #5915 
   
   Note that we now have explicitly registered various type of APIs for the BYOC flow and implicitly invoke them to accomplish the end-to-end flow. It would require users/vendors to change multiple places for registration. @jroesch also has a concern for this. @comaniac Lets think of a way to centralize the APIs and make an RFC.
   
   @masahi @lhutton1 @mbaret @trevor-m 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] trevor-m commented on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

trevor-m commented on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659050029


   It looks like this optimization callback is invoked after/during partitioning. What about optimizations that we want to do before annotation?
   
   For example, if the library can only support conv2d with a constant weights, we need to call bind_params_by_name and FoldConstant on the graph before annotation, so that the conv2d annotator can check if the conv2d args are constant.
   
   Another example is ConvertLayout to convert to NCHW, and then only allowing NCHW ops in the annotation.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] comaniac edited a comment on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

comaniac edited a comment on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659059599


   > My initial thought is that putting this into partition_graph looks a bit odd as running optimizations seems outside of the scope of the pass. What's motivated that instead of just having the codegen itself run the optimizations it needs?
   
   The most important reason has been demonstrated in #5915. Since the codegen should return a runtime module without mutating the graph (e.g., run transpose), running optimizations inside the codegen results in inconsistency between the module Relay/TVM processed and the module codegen processed. In the ACL case, it cannot use the unified weight serialization mechanism from MetadataModule but has to deal with by itself. This is not only tedious but also increasing the binary size, because MetadataModule still maintains the original weights that will never be used by ACL runtime.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] zhiics commented on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

zhiics commented on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659071984


   A better implementation should be invoking a `target` specific optimization after partitioning. This is more to make #5915 clean. We should think about all these in the RFC.
   
   The flow would be the following:
   target required processing passes -> annotation -> merge regions -> partitioning -> target required post processing passes -> relay codegen and target codegen


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] masahi commented on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

masahi commented on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659094262


   Thanks @zhiics @comaniac @trevor-m @mbaret 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] zhiics edited a comment on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

zhiics edited a comment on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659071984


   A better implementation should be invoking a `target` specific optimization after partitioning. This is more to make #5915 clean. The reason is what @comaniac mentioned above. We should think about all these in the RFC.
   
   The flow would be the following:
   target required processing passes -> annotation -> merge regions -> partitioning -> target required post processing passes -> relay compilation/codegen and target codegen


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] zhiics commented on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

zhiics commented on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659052119


   Yes, this is only for the passes that need to be executed on the partitioned program. For passes that should be executed before annotation, we should consider it how to pack the general flow of BYOC and optimization. We haven't followed up much on this as well. Maybe we should think about it together in the RFC mentioned above.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] zhiics edited a comment on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

zhiics edited a comment on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659071984


   A better implementation should be invoking a `target` specific optimization after partitioning. This is more to make #5915 clean. We should think about all these in the RFC.
   
   The flow would be the following:
   target required processing passes -> annotation -> merge regions -> partitioning -> target required post processing passes -> relay compliation/codegen and target codegen


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] trevor-m edited a comment on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

trevor-m edited a comment on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659050029






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] comaniac commented on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

comaniac commented on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659059599


   > My initial thought is that putting this into partition_graph looks a bit odd as running optimizations seems outside of the scope of the pass. What's motivated that instead of just having the codegen itself run the optimizations it needs?
   
   The most important reason has been demonstrated in #5915. Since the codegen should return a runtime module without mutating the graph (e.g., run transpose), running optimizations inside the codegen results in inconsistency between the module Relay/TVM processed and the module codegen processed. In the ACL case, it cannot use the unified weight serialization mechanism from MetadataModule but has to deal with by itself. This is not only tedious but also increases the binary size, because MetadataModule still maintains the original weights that will never be used by ACL runtime.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] masahi merged pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

masahi merged pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-tvm] mbaret commented on pull request #6068: [BYOC][Optimization] Run accelerator specific optimizations

Posted by GitBox <gi...@apache.org>.

mbaret commented on pull request #6068:
URL: https://github.com/apache/incubator-tvm/pull/6068#issuecomment-659053385


   My initial thought is that putting this into partition_graph looks a bit odd as running optimizations seems outside of the scope of the pass. What's motivated that instead of just having the codegen itself run the optimizations it needs?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org