Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/05/12 18:56:10 UTC

[GitHub] [tvm] tkonolige opened a new pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

tkonolige opened a new pull request #8030:
URL: https://github.com/apache/tvm/pull/8030


   This PR adds an optimized schedule for transpose if the transpose is not fused into anything else.
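   
   For context, a minimal sketch of what a standalone transpose schedule of
   this kind might look like in TVM's TE scheduling API. This is illustrative
   only, not the schedule from the PR: the tile size and the shared-memory
   staging follow the usual coalesced-transpose pattern, and all names are
   made up.
   
       import tvm
       from tvm import te
   
       def sketch_transpose_schedule(n, m, tile=32):
           """Illustrative GPU schedule: stage a 2D transpose through shared
           memory so both global-memory reads and writes stay coalesced."""
           x = te.placeholder((n, m), name="x")
           out = te.compute((m, n), lambda i, j: x[j, i], name="out")
           s = te.create_schedule(out.op)
   
           # Read each input tile into shared memory before writing it back
           # out in transposed order.
           staged = s.cache_read(x, "shared", [out])
   
           io, jo, ii, ji = s[out].tile(out.op.axis[0], out.op.axis[1], tile, tile)
           s[out].bind(io, te.thread_axis("blockIdx.x"))
           s[out].bind(jo, te.thread_axis("blockIdx.y"))
           s[out].bind(ii, te.thread_axis("threadIdx.y"))
           s[out].bind(ji, te.thread_axis("threadIdx.x"))  # fastest dim of out
   
           # Load the shared tile cooperatively with the same thread block;
           # the fastest dim of x is bound to threadIdx.x so reads coalesce.
           s[staged].compute_at(s[out], jo)
           ci, cj = s[staged].op.axis
           s[staged].bind(ci, te.thread_axis("threadIdx.y"))
           s[staged].bind(cj, te.thread_axis("threadIdx.x"))
           return s, [x, out]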
   
   @altanh @junrushao1994 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tkonolige commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
tkonolige commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r633766017



##########
File path: python/tvm/topi/cuda/sparse.py
##########
@@ -105,13 +105,22 @@ def _callback(op):
     return s
 
 
-def schedule_cuda_transpose(s, out):
+def schedule_transpose(outs):

Review comment:
       moved to transform.py

##########
File path: python/tvm/relay/op/strategy/cuda.py
##########
@@ -1068,3 +1070,23 @@ def unique_strategy_cuda(attrs, inputs, out_type, target):
         name="unique.cuda",
     )
     return strategy
+
+
+@schedule_transpose.register(["cuda", "gpu", "rocm"])
+def schedule_transpose_cuda(attrs, outs, target):
+    """
+    Transpose cuda strategy
+    Dispatches to an optimized schedule if the transpose is standalone (not fused).
+    """
+    warp_size = int(Target.current(allow_none=False).thread_warp_size)
+    if (

Review comment:
       As far as I can tell, there is not a better way to do this. There is a way to add implementations based on input sizes, but these are not on a per-target basis. If you know a better way, let me know.
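
       For reference, this is roughly the per-target generic-function dispatch
       pattern in play (a sketch, with illustrative names mirroring the PR's
       registration rather than copied from it):

           from tvm import topi
           from tvm.target import override_native_generic_func

           @override_native_generic_func("schedule_transpose_sketch")
           def schedule_transpose_sketch(attrs, outs, target):
               # Generic fallback: treat the transpose as an ordinary
               # injective op.
               return topi.generic.schedule_injective(outs)

           @schedule_transpose_sketch.register(["cuda", "gpu", "rocm"])
           def schedule_transpose_sketch_cuda(attrs, outs, target):
               # Target-specific override, selected whenever one of these
               # targets is current. Dispatch happens per target only; there
               # is no built-in hook for dispatching on input shapes here,
               # which is the limitation described above.
               return topi.cuda.schedule_injective(outs)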

##########
File path: tests/python/topi/python/test_topi_transform.py
##########
@@ -870,6 +871,30 @@ def test_transpose():
     verify_transpose((3, 10), None)
 
 
+@tvm.testing.parametrize_targets
+def test_transpose_schedule(target, dev):
+    shape = (100, 34)
+    x = relay.var("x", relay.TensorType(shape, "float32"))
+    f = relay.transpose(x)
+    ex = relay.create_executor(
+        kind="graph", mod=tvm.IRModule.from_expr(relay.Function([x], f)), device=dev, target=target
+    )
+    r = np.random.rand(*shape)
+    tvm.testing.assert_allclose(ex.evaluate()(r).asnumpy(), np.transpose(r))
+
+    # make sure schedule does not fire here

Review comment:
       It is more like a wish. Ideally we would be able to know which schedules were used, but there is no way to introspect on which schedule was actually applied. I've updated the comment to reflect this.
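
       (A rough workaround, for what it's worth: build the module and scan
       the generated device source for a signature of the optimized schedule,
       such as shared-memory staging. This sketch assumes the current factory
       module API and is fragile; it is not what the test does.)

           import tvm
           from tvm import relay

           def uses_shared_memory(func, target="cuda"):
               # Look for __shared__ in the generated CUDA source as a
               # (fragile) proxy for the custom transpose schedule firing.
               lib = relay.build(tvm.IRModule.from_expr(func), target=target)
               src = lib.get_lib().imported_modules[0].get_source()
               return "__shared__" in src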







[GitHub] [tvm] altanh commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
altanh commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r632736914



##########
File path: python/tvm/topi/cuda/sparse.py
##########
@@ -105,13 +105,22 @@ def _callback(op):
     return s
 
 
-def schedule_cuda_transpose(s, out):
+def schedule_transpose(outs):

Review comment:
       feels a bit weird to have this in `sparse.py`

##########
File path: python/tvm/relay/op/strategy/cuda.py
##########
@@ -1068,3 +1070,23 @@ def unique_strategy_cuda(attrs, inputs, out_type, target):
         name="unique.cuda",
     )
     return strategy
+
+
+@schedule_transpose.register(["cuda", "gpu", "rocm"])
+def schedule_transpose_cuda(attrs, outs, target):
+    """
+    Transpose cuda strategy
+    Dispatches to an optimized schedule if the transpose is standalone (not fused).
+    """
+    warp_size = int(Target.current(allow_none=False).thread_warp_size)
+    if (

Review comment:
       is there a more principled way to do this? like maybe with an OpStrategy or something

##########
File path: vta/tutorials/autotvm/tune_relay_vta.py
##########
@@ -357,7 +357,7 @@ def tune_and_evaluate(tuning_opt):
     )
 
     # filter out non-packed conv2d task
-    tasks = list(filter(lambda t: len(t.args[0][1]) > 4, tasks))
+    tasks = list(filter(lambda t: len(t.args[0][1]) > 4 and "conv" in t.name, tasks))

Review comment:
       what happened here, did this transpose change introduce a new task or something?

##########
File path: tests/python/topi/python/test_topi_transform.py
##########
@@ -870,6 +871,30 @@ def test_transpose():
     verify_transpose((3, 10), None)
 
 
+@tvm.testing.parametrize_targets
+def test_transpose_schedule(target, dev):
+    shape = (100, 34)
+    x = relay.var("x", relay.TensorType(shape, "float32"))
+    f = relay.transpose(x)
+    ex = relay.create_executor(
+        kind="graph", mod=tvm.IRModule.from_expr(relay.Function([x], f)), device=dev, target=target
+    )
+    r = np.random.rand(*shape)
+    tvm.testing.assert_allclose(ex.evaluate()(r).asnumpy(), np.transpose(r))
+
+    # make sure schedule does not fire here

Review comment:
       is this a TODO? Also I wonder if it would be good to parametrize the test shape by warp size (rather than hard coding) for future proofing







[GitHub] [tvm] comaniac commented on pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
comaniac commented on pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#issuecomment-843597340


   > LGTM, the only thing I'm wondering about is if someone (for whatever reason) really wanted to tune the default injective schedule for transpose, is there any way to allow that?
   > 
   > cc @comaniac for additional review (feel free to tag more relevant reviewers)
   
   There's no reason to tune the injective schedule, and you basically cannot do it anyway, because the injective schedule doesn't have AutoTVM knobs for tuning.
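
   For illustration only, this is roughly what "having AutoTVM knobs" means;
   the template name and knob below are made up, and the injective schedule
   defines nothing like this, which is why there is no search space to tune:

       import tvm
       from tvm import te, autotvm

       @autotvm.template("example/transpose_tile")  # hypothetical template
       def transpose_tile(n, m):
           x = te.placeholder((n, m), name="x")
           out = te.compute((m, n), lambda i, j: x[j, i], name="out")
           s = te.create_schedule(out.op)

           cfg = autotvm.get_config()
           # A knob is what makes a schedule tunable; without knobs AutoTVM
           # has nothing to search over.
           cfg.define_knob("tile", [8, 16, 32])
           io, jo, ii, ji = s[out].tile(
               out.op.axis[0], out.op.axis[1], cfg["tile"].val, cfg["tile"].val
           )
           s[out].bind(io, te.thread_axis("blockIdx.x"))
           s[out].bind(ji, te.thread_axis("threadIdx.x"))
           return s, [x, out]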





[GitHub] [tvm] comaniac commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r634793947



##########
File path: tests/python/topi/python/test_topi_transform.py
##########
@@ -870,6 +871,31 @@ def test_transpose():
     verify_transpose((3, 10), None)
 
 
+@tvm.testing.parametrize_targets
+def test_transpose_schedule(target, dev):
+    shape = (100, target.thread_warp_size + 3)
+    x = relay.var("x", relay.TensorType(shape, "float32"))
+    f = relay.transpose(x)
+    ex = relay.create_executor(
+        kind="graph", mod=tvm.IRModule.from_expr(relay.Function([x], f)), device=dev, target=target
+    )
+    r = np.random.rand(*shape)
+    tvm.testing.assert_allclose(ex.evaluate()(r).asnumpy(), np.transpose(r))
+
+    # We want to make sure schedule does not fire here, but there is no way of
+    # inspecting which schedules were used.

Review comment:
       Fair enough. Then it might be better to name it `test_transpose_fuse` or something like that (nit).







[GitHub] [tvm] tkonolige commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
tkonolige commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r634793073



##########
File path: tests/python/topi/python/test_topi_transform.py
##########
@@ -870,6 +871,31 @@ def test_transpose():
     verify_transpose((3, 10), None)
 
 
+@tvm.testing.parametrize_targets
+def test_transpose_schedule(target, dev):
+    shape = (100, target.thread_warp_size + 3)
+    x = relay.var("x", relay.TensorType(shape, "float32"))
+    f = relay.transpose(x)
+    ex = relay.create_executor(
+        kind="graph", mod=tvm.IRModule.from_expr(relay.Function([x], f)), device=dev, target=target
+    )
+    r = np.random.rand(*shape)
+    tvm.testing.assert_allclose(ex.evaluate()(r).asnumpy(), np.transpose(r))
+
+    # We want to make sure schedule does not fire here, but there is no way of
+    # inspecting which schedules were used.

Review comment:
       We could, but I'd like to keep it separate so the intention is clear.







[GitHub] [tvm] tkonolige commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
tkonolige commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r634792504



##########
File path: vta/tutorials/autotvm/tune_relay_vta.py
##########
@@ -357,7 +357,7 @@ def tune_and_evaluate(tuning_opt):
     )
 
     # filter out non-packed conv2d task
-    tasks = list(filter(lambda t: len(t.args[0][1]) > 4, tasks))
+    tasks = list(filter(lambda t: len(t.args[0][1]) > 4 and "conv" in t.name, tasks))

Review comment:
       We may want to tune it in the future.







[GitHub] [tvm] tkonolige commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
tkonolige commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r634779214



##########
File path: vta/tutorials/autotvm/tune_relay_vta.py
##########
@@ -357,7 +357,7 @@ def tune_and_evaluate(tuning_opt):
     )
 
     # filter out non-packed conv2d task
-    tasks = list(filter(lambda t: len(t.args[0][1]) > 4, tasks))
+    tasks = list(filter(lambda t: len(t.args[0][1]) > 4 and "conv" in t.name, tasks))

Review comment:
       actually, no, but this check makes sure anyway.







[GitHub] [tvm] areusch merged pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
areusch merged pull request #8030:
URL: https://github.com/apache/tvm/pull/8030


   





[GitHub] [tvm] tkonolige commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
tkonolige commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r632815266



##########
File path: vta/tutorials/autotvm/tune_relay_vta.py
##########
@@ -357,7 +357,7 @@ def tune_and_evaluate(tuning_opt):
     )
 
     # filter out non-packed conv2d task
-    tasks = list(filter(lambda t: len(t.args[0][1]) > 4, tasks))
+    tasks = list(filter(lambda t: len(t.args[0][1]) > 4 and "conv" in t.name, tasks))

Review comment:
       yes







[GitHub] [tvm] comaniac commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
comaniac commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r634787135



##########
File path: vta/tutorials/autotvm/tune_relay_vta.py
##########
@@ -357,7 +357,7 @@ def tune_and_evaluate(tuning_opt):
     )
 
     # filter out non-packed conv2d task
-    tasks = list(filter(lambda t: len(t.args[0][1]) > 4, tasks))
+    tasks = list(filter(lambda t: len(t.args[0][1]) > 4 and "conv" in t.name, tasks))

Review comment:
       Isn't the newly added schedule non-tunable? Or is there some concern about adding knobs?

##########
File path: tests/python/topi/python/test_topi_transform.py
##########
@@ -870,6 +871,31 @@ def test_transpose():
     verify_transpose((3, 10), None)
 
 
+@tvm.testing.parametrize_targets
+def test_transpose_schedule(target, dev):
+    shape = (100, target.thread_warp_size + 3)
+    x = relay.var("x", relay.TensorType(shape, "float32"))
+    f = relay.transpose(x)
+    ex = relay.create_executor(
+        kind="graph", mod=tvm.IRModule.from_expr(relay.Function([x], f)), device=dev, target=target
+    )
+    r = np.random.rand(*shape)
+    tvm.testing.assert_allclose(ex.evaluate()(r).asnumpy(), np.transpose(r))
+
+    # We want to make sure schedule does not fire here, but there is no way of
+    # inspecting which schedules were used.

Review comment:
       As this comment mentions, there is no way of inspecting which schedules were used, so it seems to me that the only difference between this test and `test_transpose` is that the workload here includes an `add` to exercise the fusion case. Accordingly, could we just extend `test_transpose`?
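
       For concreteness, a sketch of the fused variant under discussion (the
       elementwise `add` after the transpose is an assumption here; it lets
       the two ops fuse, so the standalone transpose schedule should not be
       selected):

           import numpy as np
           import tvm
           import tvm.testing
           from tvm import relay

           def run_fused_transpose(target, dev, shape=(100, 34)):
               x = relay.var("x", relay.TensorType(shape, "float32"))
               y = relay.var("y", relay.TensorType(shape[::-1], "float32"))
               # transpose feeds an elementwise add, so the ops fuse.
               f = relay.add(relay.transpose(x), y)
               ex = relay.create_executor(
                   kind="graph",
                   mod=tvm.IRModule.from_expr(relay.Function([x, y], f)),
                   device=dev,
                   target=target,
               )
               a = np.random.rand(*shape).astype("float32")
               b = np.random.rand(*shape[::-1]).astype("float32")
               tvm.testing.assert_allclose(
                   ex.evaluate()(a, b).asnumpy(), np.transpose(a) + b
               )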







[GitHub] [tvm] tkonolige commented on a change in pull request #8030: [TOPI] Custom schedule for standalone transpose in cuda

Posted by GitBox <gi...@apache.org>.
tkonolige commented on a change in pull request #8030:
URL: https://github.com/apache/tvm/pull/8030#discussion_r634794779



##########
File path: tests/python/topi/python/test_topi_transform.py
##########
@@ -870,6 +871,31 @@ def test_transpose():
     verify_transpose((3, 10), None)
 
 
+@tvm.testing.parametrize_targets
+def test_transpose_schedule(target, dev):
+    shape = (100, target.thread_warp_size + 3)
+    x = relay.var("x", relay.TensorType(shape, "float32"))
+    f = relay.transpose(x)
+    ex = relay.create_executor(
+        kind="graph", mod=tvm.IRModule.from_expr(relay.Function([x], f)), device=dev, target=target
+    )
+    r = np.random.rand(*shape)
+    tvm.testing.assert_allclose(ex.evaluate()(r).asnumpy(), np.transpose(r))
+
+    # We want to make sure schedule does not fire here, but there is no way of
+    # inspecting which schedules were used.

Review comment:
       switched to `test_transpose_unfused_schedule`



