Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/05/19 14:25:28 UTC

[GitHub] [tvm] shtinsa commented on a diff in pull request #11341: [TE] Optimized version of concatenation layer

shtinsa commented on code in PR #11341:
URL: https://github.com/apache/tvm/pull/11341#discussion_r877131601


##########
python/tvm/relay/op/strategy/generic.py:
##########
@@ -1760,6 +1769,44 @@ def cumsum_strategy(attrs, inputs, out_type, target):
     return strategy
 
 
+@override_native_generic_func("concat_strategy")
+def concatenate_strategy(attrs, inputs, out_type, target):
+    """concatenate generic strategy"""
+    strategy = _op.OpStrategy()
+    strategy.add_implementation(
+        wrap_compute_concat(topi.concatenate),
+        wrap_topi_schedule(topi.generic.schedule_extern),
+        name="concatenate",
+    )
+    return strategy
+
+
+@concatenate_strategy.register(["cpu"])
+def concatenate_strategy_cpu(attrs, inputs, out_type, target):
+    """concatenate x86 strategy"""
+    strategy = _op.OpStrategy()
+    use_old_concat = False
+    for inpt in inputs:
+        shape = inpt.shape
+        for i in shape:
+            if not isinstance(i, tir.expr.IntImm):
+                use_old_concat = True
+                break
+    if use_old_concat:
+        strategy.add_implementation(
+            wrap_compute_concat(topi.transform.concatenate),
+            wrap_topi_schedule(topi.x86.injective.schedule_concatenate),
+            name="concatenate.generic",
+        )
+    else:
+        strategy.add_implementation(
+            wrap_compute_concat(topi.x86.concatenate),
+            wrap_topi_schedule(topi.x86.schedule_concatenate_cpu),
+            name="concatenate.cpu",
+        )

Review Comment:
   1. Regarding dynamic shapes: the new version can be applied as long as no "any_val" is present in the shape.
   2. Yes, I benchmarked it, and it was faster in most cases. Problems may arise when the number of available threads is high, because the "complexity" of the solution is small in that case.
   3. I updated the code to have two strategies.
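
   For illustration, the selection between the two strategies can be sketched without TVM. This is a minimal sketch, not the PR's code: `IntImm` and `AnyDim` below are hypothetical stand-ins for `tir.expr.IntImm` and a dynamic ("any_val") dimension, and `all_dims_static` and `pick_concat_implementation` are names invented for the example. Only the two implementation names are taken from the diff above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntImm:
    """Hypothetical stand-in for a constant dimension (tir.expr.IntImm in the diff)."""
    value: int


@dataclass(frozen=True)
class AnyDim:
    """Hypothetical stand-in for a dimension unknown at compile time ("any_val")."""


def all_dims_static(shapes):
    """Return True only if every dimension of every input shape is a constant."""
    return all(isinstance(dim, IntImm) for shape in shapes for dim in shape)


def pick_concat_implementation(shapes):
    """Mirror the branch in the diff: optimized path for static shapes, generic otherwise."""
    return "concatenate.cpu" if all_dims_static(shapes) else "concatenate.generic"


# Two inputs with fully constant shapes, and two where one dimension is dynamic.
static_shapes = [(IntImm(1), IntImm(64)), (IntImm(1), IntImm(32))]
dynamic_shapes = [(AnyDim(), IntImm(64)), (IntImm(1), IntImm(32))]

print(pick_concat_implementation(static_shapes))
print(pick_concat_implementation(dynamic_shapes))
```

   A single `all(...)` over every dimension makes the same decision as the nested loop with the `use_old_concat` flag in the diff: one non-constant dimension in any input is enough to fall back to the generic implementation.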


