You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/11/25 17:34:11 UTC

[GitHub] [tvm] manupa-arm commented on a change in pull request #6655: [BYOC] Added "include_non_call_ops" parameter to AnnotateTarget pass

manupa-arm commented on a change in pull request #6655:
URL: https://github.com/apache/tvm/pull/6655#discussion_r530541955



##########
File path: src/relay/transforms/annotate_target.cc
##########
@@ -128,14 +143,31 @@ class AnnotateTargetRewriter : public ExprRewriter {
      * \return An annotated and target-propagated relay expression.
      */
     Expr new_expr = expr;
-    if (op_expr_to_target_.find(expr) != op_expr_to_target_.end() && FreeVars(expr).size() != 0) {
-      new_expr = InsertAnnotation(expr, op_expr_to_target_[expr], make_end_op);
-      op_expr_to_target_[new_expr] = op_expr_to_target_[expr];
+    const CallNode* call = expr.as<CallNode>();
+    if (op_expr_to_target_.find(expr) != op_expr_to_target_.end()) {
+      // Check whether expr has args, if not - do not insert compiler_end.
+      if (expr->IsInstance<RefWriteNode>() || expr->IsInstance<RefCreateNode>() ||
+          expr->IsInstance<RefReadNode>() || expr->IsInstance<TupleNode>() ||

Review comment:
       While running merge compiler regions helps cut down the regions, it also makes the external codegen's responsibility to allocate memory for intermediate tensors on those partitions. Thus, in the specific case of ACL, I think there is not much gained by such a merger as ACL would be implementing each ACL primitive operator and let tvm handle the memory allocation of the tensors passed onto external function. Moreover, the kernel launch overhead  should also be minimal as it is running on the CPU (so the host and device is almost the same here). Also such a merger will also make the IO tensors live throughout the execution of external function while the space could be re-used if it was not merged.
   
   The problem is the specification of the ACL did not indicate the simple regions (or non-call ops) to be annotated, thus annotate target here is doing something extra than it was asked. 
   
   I quite agree that this pass is complicated and needs breakdown. I guess that should be discussed in a RFC as to how it should look like. One direction maybe to take out the annotation of simple regions (non-call ops) as a seperate part ( I believe this was how it looked liked sometime back when it had something AnnotateRestDefault until it got merged here :) ).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org