Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/07/01 21:07:42 UTC

[GitHub] [tvm-rfcs] yelite commented on a diff in pull request #79: [RFC] TVMScript Metaprogramming

yelite commented on code in PR #79:
URL: https://github.com/apache/tvm-rfcs/pull/79#discussion_r912250622


##########
rfcs/0079-tvmscript-metaprogramming.md:
##########
@@ -0,0 +1,398 @@
+- Feature Name: tvmscript-metaprogramming
+- Start Date: 2022-06-16
+- RFC PR: [apache/tvm-rfcs#79](https://github.com/apache/tvm-rfcs/pull/79)
+- GitHub Issue: [apache/tvm#0000](https://github.com/apache/tvm/issues/0000)
+- Co-Authors: Yaxing Cai ([**@cyx-6**](https://github.com/cyx-6), main implementation), Lite Ye
+  ([**@yelite**](https://github.com/yelite)), Yong Wu
+  ([**@yongwww**](https://github.com/yongwww)), Yuchen Jin
+  ([**@YuchenJin**](https://github.com/YuchenJin)), Eric Lunderberg
+  ([**@Lunderberg**](https://github.com/Lunderberg)), Masahiro Masuda
+  ([**@masahi**](https://github.com/masahi)), Junru Shao
+  ([**@junrushao1994**](https://github.com/junrushao1994), main designer)
+
+# Summary
+[summary]: #summary
+
+This RFC proposes a new TVMScript parser infrastructure that supports extensive
+metaprogramming and syntactic sugar. The new infrastructure is IR-agnostic,
+treating TIR as just one of its dialects. Additionally, the new infrastructure
+will provide better tooling around the Python ecosystem (pylint, mypy, etc.).
+
+# Motivation
+[motivation]: #motivation
+
+**What is TVMScript.**
+Check [Blitz Course to TensorIR](https://tvm.apache.org/docs/tutorial/tensor_ir_blitz_course.html) and
+[TVMScript Unified Printer RFC](https://github.com/apache/tvm-rfcs/pull/74/files#diff-6965a40ad8df7618ae68e11c88f924542a506c74a931cc3011ae9f99989b5f51R20-R26)
+for an introduction to TVMScript.
+
+**What is metaprogramming.** In the context of TVMScript, metaprogramming means
+a programmable way to control IR generation. For example,
+https://github.com/apache/tvm/pull/11097 added a metaprogramming feature to the
+TVMScript parser that allows users to programmatically control the shapes of
+the input buffers of a `PrimFunc`.
+
+### Limitations of the current design
+
+The current parser lacks generic metaprogramming capabilities that would give
+users more control over IR construction. This makes it challenging to support
+operators like NMS (non-maximum suppression, which is crucial to object
+detection models). There is an implementation of NMS at
+[python/tvm/topi/cuda/nms.py#L367-L386](https://github.com/apache/tvm/blob/d0650bad66d0ff89a01347537021bc442a98c223/python/tvm/topi/cuda/nms.py#L367-L386).
+Implementing NMS-like operators requires rank-polymorphism and the ability to
+interleave the host program with TVMScript, both of which are difficult to
+implement under the current design.
+
+TVMScript also needs reasonable support for Python tooling. Currently it doesn’t
+play nicely with pylint and mypy. For example,
+[test_meta_schedule_postproc_rewrite_tensorize.py](https://github.com/apache/tvm/blob/d0650bad66d0ff89a01347537021bc442a98c223/tests/python/unittest/test_meta_schedule_postproc_rewrite_tensorize.py)
+has 100+ warnings from pylint within only 500 lines of code. This
+confuses users and leaves the impression that TVMScript isn’t a mature,
+production-ready product. Even though this can be incrementally improved under
+the current design, we believe it’s easier to get an ideal result with a design
+that has tooling support in mind from the start.
+
+The current design also lacks a unified approach for different IRs. At
+[https://github.com/tlc-pack/relax/tree/relax/python/tvm/script/relax](https://github.com/tlc-pack/relax/tree/relax/python/tvm/script/relax),
+a mature implementation of the TVMScript parser is maintained for Relax, but
+it’s hard to extend if we want to support more IRs for TVM unity.
+
+To conclude, with this RFC, we want to:
+
+1. Add more metaprogramming features to TVMScript, making it easier for TVM
+   developers to write complicated operators.
+2. Improve tooling and documentation of TVMScript, reducing the friction for an
+   average machine learning practitioner to use TVMScript.
+3. Modularize the TVMScript parser and turn it into reusable infrastructure,
+   lowering the cost of implementing a parser for a new IR.
+
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+## Metaprogramming features to support
+
+### (F1) Template Metaprogramming
+
+Users should be able to use variables from the outer scope in a TVMScript
+function/class. The parsed result should be identical to the function/class
+with each variable replaced by its value. For instance,
+
+```python
+@T.prim_func
+def matmul(
+  A: T.Buffer[(128, 128)],
+) -> None:
+  ...
+
+def gen_matmul(n, m):
+  @T.prim_func
+  def f(A: T.Buffer[(n, m)]):
+    ...
+  return f
+
+f = gen_matmul(n=128, m=128) # `f` should be identical to `matmul`

Review Comment:
   Fixed. Thanks for pointing out!



##########
rfcs/0079-tvmscript-metaprogramming.md:
##########
@@ -0,0 +1,398 @@
+- Feature Name: tvmscript-metaprogramming
+- Start Date: 2022-06-16
+- RFC PR: [apache/tvm-rfcs#79](https://github.com/apache/tvm-rfcs/pull/79)
+- GitHub Issue: [apache/tvm#0000](https://github.com/apache/tvm/issues/0000)
+- Co-Authors: Yaxing Cai ([**@cyx-6**](https://github.com/cyx-6), main implementation), Lite Ye
+  ([**@yelite**](https://github.com/yelite)), Yong Wu
+  ([**@yongwww**](https://github.com/yongwww)), Yuchen Jin
+  ([**@YuchenJin**](https://github.com/YuchenJin)), Eric Lunderberg
+  ([**@Lunderberg**](https://github.com/Lunderberg)), Masahiro Masuda
+  ([**@masahi**](https://github.com/masahi)), Junru Shao
+  ([**@junrushao1994**](https://github.com/junrushao1994), main designer)
+
+# Summary
+[summary]: #summary
+
+This RFC proposes a new TVMScript parser infrastructure that supports extensive
+metaprogramming and syntactic sugar. The new infrastructure is IR-agnostic,
+treating TIR as just one of its dialects. Additionally, the new infrastructure
+will provide better tooling around the Python ecosystem (pylint, mypy, etc.).
+
+# Motivation
+[motivation]: #motivation
+
+**What is TVMScript.**
+Check [Blitz Course to TensorIR](https://tvm.apache.org/docs/tutorial/tensor_ir_blitz_course.html) and
+[TVMScript Unified Printer RFC](https://github.com/apache/tvm-rfcs/pull/74/files#diff-6965a40ad8df7618ae68e11c88f924542a506c74a931cc3011ae9f99989b5f51R20-R26)
+for an introduction to TVMScript.
+
+**What is metaprogramming.** In the context of TVMScript, metaprogramming means
+a programmable way to control IR generation. For example,
+https://github.com/apache/tvm/pull/11097 added a metaprogramming feature to the
+TVMScript parser that allows users to programmatically control the shapes of
+the input buffers of a `PrimFunc`.
+
+### Limitations of the current design
+
+The current parser lacks generic metaprogramming capabilities that would give
+users more control over IR construction. This makes it challenging to support
+operators like NMS (non-maximum suppression, which is crucial to object
+detection models). There is an implementation of NMS at
+[python/tvm/topi/cuda/nms.py#L367-L386](https://github.com/apache/tvm/blob/d0650bad66d0ff89a01347537021bc442a98c223/python/tvm/topi/cuda/nms.py#L367-L386).
+Implementing NMS-like operators requires rank-polymorphism and the ability to
+interleave the host program with TVMScript, both of which are difficult to
+implement under the current design.
+
+TVMScript also needs reasonable support for Python tooling. Currently it doesn’t
+play nicely with pylint and mypy. For example,
+[test_meta_schedule_postproc_rewrite_tensorize.py](https://github.com/apache/tvm/blob/d0650bad66d0ff89a01347537021bc442a98c223/tests/python/unittest/test_meta_schedule_postproc_rewrite_tensorize.py)
+has 100+ warnings from pylint within only 500 lines of code. This
+confuses users and leaves the impression that TVMScript isn’t a mature,
+production-ready product. Even though this can be incrementally improved under
+the current design, we believe it’s easier to get an ideal result with a design
+that has tooling support in mind from the start.
+
+The current design also lacks a unified approach for different IRs. At
+[https://github.com/tlc-pack/relax/tree/relax/python/tvm/script/relax](https://github.com/tlc-pack/relax/tree/relax/python/tvm/script/relax),
+a mature implementation of the TVMScript parser is maintained for Relax, but
+it’s hard to extend if we want to support more IRs for TVM unity.
+
+To conclude, with this RFC, we want to:
+
+1. Add more metaprogramming features to TVMScript, making it easier for TVM
+   developers to write complicated operators.
+2. Improve tooling and documentation of TVMScript, reducing the friction for an
+   average machine learning practitioner to use TVMScript.
+3. Modularize the TVMScript parser and turn it into reusable infrastructure,
+   lowering the cost of implementing a parser for a new IR.
+
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+## Metaprogramming features to support
+
+### (F1) Template Metaprogramming
+
+Users should be able to use variables from the outer scope in a TVMScript
+function/class. The parsed result should be identical to the function/class
+with each variable replaced by its value. For instance,
+
+```python
+@T.prim_func
+def matmul(
+  A: T.Buffer[(128, 128)],
+) -> None:
+  ...
+
+def gen_matmul(n, m):
+  @T.prim_func
+  def f(A: T.Buffer[(n, m)]):
+    ...
+  return f
+
+f = gen_matmul(n=128, m=128) # `f` should be identical to `matmul`
+```
+
+This is already partially supported by https://github.com/apache/tvm/pull/11097
+for `PrimExpr` values captured from the outer function. With the new parser, we
+want to support this feature in more places and with more variable types.
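
The capture behaviour described for (F1) can be illustrated with an ordinary Python closure. This is only an analogy under stated assumptions: it does not import TVM, and `make_kernel_spec` and the dictionary "spec" are hypothetical stand-ins for a parsed `PrimFunc`, not part of the proposed API.

```python
# Plain-Python analogy only: no TVM involved. `make_kernel_spec` is a
# hypothetical name used for illustration.
def make_kernel_spec(n, m):
    """Return a function whose buffer shape is captured from the outer scope."""
    def spec():
        # `n` and `m` are looked up in the enclosing call, much as the RFC
        # wants outer-scope variables to be substituted during parsing.
        return {"A": (n, m)}
    return spec


# A "hand-written" spec with the shape spelled out literally.
handwritten = {"A": (128, 128)}

# The generated spec should be indistinguishable from the hand-written one.
generated = make_kernel_spec(n=128, m=128)()
print(generated == handwritten)
```

The point of the comparison is the one the RFC makes: after substitution, the templated definition and the literal one should be the same object structurally.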
+
+### (F2) Rank-polymorphism
+
+Users should be able to write a single function to handle different ranks of
+input buffers (different numbers of dimensions). For example, a user should be
+able to write a generic function that performs a broadcast add:
+
+```python
+def broadcast_add(a, b, c):
+  @T.prim_func
+  def f(
+    A: T.BufferFrom(a),
+    B: T.BufferFrom(b),
+    C: T.BufferFrom(c),
+  ) -> None:
+    for i, i_a, i_b in T.some_broadcast_method(A.shape, B.shape):
+      with T.block():
+        C[*i] = A[*i_a] + B[*i_b]
+  return f
+
+broadcast_add(
+  a = Buffer((128, 1), "float32"),
+  b = Buffer((1, 128), "float32"),
+  c = Buffer((128, 128), "float32"),
+)
+```
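
To make the rank-polymorphic iteration concrete, here is a minimal pure-Python sketch of what a helper in the role of `T.some_broadcast_method` might yield. The names `broadcast_shape` and `broadcast_indices` are hypothetical, the rules assumed are NumPy-style broadcasting, and nothing here is TVM API.

```python
from itertools import product


def broadcast_shape(shape_a, shape_b):
    """Compute the NumPy-style broadcast of two shapes of any rank."""
    rank = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both shapes have the same rank.
    a = (1,) * (rank - len(shape_a)) + tuple(shape_a)
    b = (1,) * (rank - len(shape_b)) + tuple(shape_b)
    out = []
    for da, db in zip(a, b):
        if da != db and 1 not in (da, db):
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        # A size-1 axis stretches to match the other operand.
        out.append(db if da == 1 else da)
    return tuple(out)


def broadcast_indices(shape_a, shape_b):
    """Yield (i, i_a, i_b): an output index and the matching input indices."""
    out = broadcast_shape(shape_a, shape_b)
    for i in product(*(range(d) for d in out)):
        # Each input only sees its trailing axes; size-1 axes always read 0.
        tail_a = i[len(out) - len(shape_a):]
        tail_b = i[len(out) - len(shape_b):]
        i_a = tuple(0 if d == 1 else x for d, x in zip(shape_a, tail_a))
        i_b = tuple(0 if d == 1 else x for d, x in zip(shape_b, tail_b))
        yield i, i_a, i_b


triples = list(broadcast_indices((2, 1), (1, 3)))
print(len(triples))
print(triples[-1])
```

Because the index tuples have whatever length the input shapes dictate, the same loop body works for any rank, which is the property (F2) asks for.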
+
+### (F3) Sugar: TE Compute in TIR
+
+Users should be able to replace boilerplate code with a function call that is
+expanded into a larger chunk of code during parsing. For example, we may want to
+use TE’s compute-like syntax to replace nested loops,
+
+```python
+@T.prim_func
+def te_compute_sugar(
+  A: T.Buffer[(128, 128)],
+  B: T.Buffer[(128, 128)],
+) -> None:
+  ...
+  C = T.compute((128, 128), lambda i, j: A[i, j] + B[i, j])
+  ...
+
+## expands to ====>
+
+@T.prim_func
+def te_compute_expanded(
+  A: T.Buffer[(128, 128)],
+  B: T.Buffer[(128, 128)],
+) -> None:
+  ...
+  for i in range(128):
+    for j in range(128):
+      with T.block("..."):
+        C[i, j] = A[i, j] + B[i, j]
+  ...
+```
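
The equivalence between the sugared and expanded forms can be checked on small nested lists. This is a sketch under stated assumptions: `compute` below is a hypothetical plain-Python stand-in that evaluates eagerly, whereas the proposed `T.compute` would emit loops and blocks at parse time.

```python
def compute(shape, fn):
    """Hypothetical stand-in: call `fn` at every index and nest the results."""
    def build(prefix, dims):
        if not dims:
            # All axes are fixed, so evaluate the element expression.
            return fn(*prefix)
        # Otherwise open one loop level per remaining axis.
        return [build(prefix + (i,), dims[1:]) for i in range(dims[0])]
    return build((), tuple(shape))


A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]

# Sugared form: one call with a lambda describing a single element.
sugared = compute((2, 2), lambda i, j: A[i][j] + B[i][j])

# Expanded form: the explicit loop nest the sugar stands for.
expanded = [[0, 0], [0, 0]]
for i in range(2):
    for j in range(2):
        expanded[i][j] = A[i][j] + B[i][j]

print(sugared == expanded)
```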
+
+### (F4) Interleave host program and TVMScript program to customize metaprogramming
+
+As an escape hatch from writing code to be parsed by the TVMScript
+parser, users should be able to write imperative code that constructs IR nodes
+directly and embed it inside regular TVMScript. This code is evaluated
+by the Python interpreter during parsing. This gives users the ultimate tool when
+TVMScript isn’t expressive enough for their use cases. For example, at
+[python/tvm/topi/vision/nms.py#L380-L431](https://github.com/apache/tvm/blob/3cb4597ed48360e3f3d80161d1c03f833072d28e/python/tvm/topi/vision/nms.py#L380-L431),
+there are blocks of repetitive code that compute the coordinates of the four
+corners of a bounding box. This can be simplified as:
+
+```python
+# Before, without IRBuilder interleaving
+@T.prim_func
+def nms(...):
+  ...
+  for i in range(batch_size):
+    ...
+    a_l = min(
+      output[batch_idx, box_a_idx, box_start_idx],
+      output[batch_idx, box_a_idx, box_start_idx + 2],
+    )
+    a_t = min(
+      output[batch_idx, box_a_idx, box_start_idx + 1],
+      output[batch_idx, box_a_idx, box_start_idx + 3],
+    )
+    a_r = max(
+      output[batch_idx, box_a_idx, box_start_idx],
+      output[batch_idx, box_a_idx, box_start_idx + 2],
+    )
+    a_b = max(
+      output[batch_idx, box_a_idx, box_start_idx + 1],
+      output[batch_idx, box_a_idx, box_start_idx + 3],
+    )
+    ...
+    for k in range(j):
+      check_iou = ...
+      ...
+      if check_iou > 0:
+        # b_l: left, b_t: top, b_r: right, b_b: bottom
+        b_l = min(
+          output[batch_idx, box_b_idx, box_start_idx],
+          output[batch_idx, box_b_idx, box_start_idx + 2],
+        )
+        b_t = min(
+          output[batch_idx, box_b_idx, box_start_idx + 1],
+          output[batch_idx, box_b_idx, box_start_idx + 3],
+        )
+        b_r = max(
+          output[batch_idx, box_b_idx, box_start_idx],
+          output[batch_idx, box_b_idx, box_start_idx + 2],
+        )
+        b_b = max(
+          output[batch_idx, box_b_idx, box_start_idx + 1],
+          output[batch_idx, box_b_idx, box_start_idx + 3],
+        )
+        ...
+
+# With IRBuilder interleaving:
+
+from tvm.script import tir as T
+
+def get_box_coordinates(output, batch_idx, box_idx, box_start_idx):
+  """a method executed by python interpreter"""
+  box_l = T.min(
+    output[batch_idx, box_idx, box_start_idx],
+    output[batch_idx, box_idx, box_start_idx + 2],
+  ) # type(box_l) is PrimExpr
+  ... # Repeat for other coordinates
+  return box_l, box_t, box_r, box_b
+
+@T.prim_func(capture=[get_box_coordinates])
+def nms(...):
+  ...
+  for i in range(batch_size):
+    ...
+    a_l, a_t, a_r, a_b = get_box_coordinates(output, batch_idx, box_a_idx, box_start_idx)
+    ...
+    for k in range(j):
+      check_iou = ...
+      ...
+      if check_iou > 0:
+        b_l, b_t, b_r, b_b = get_box_coordinates(output, batch_idx, box_b_idx, box_start_idx)

Review Comment:
   Fixed. Thanks for pointing out!
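
   For readers following the (F4) hunk above, the idea of a host-side helper that returns expression nodes rather than numbers can be mimicked without TVM. This is a rough sketch: `Load` and `Call` are hypothetical stand-ins for IR nodes such as `PrimExpr`, and indices are kept as strings purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    """Symbolic read `buffer[indices]`; nothing is evaluated numerically."""
    buffer: str
    indices: tuple


@dataclass(frozen=True)
class Call:
    """Symbolic call such as min(a, b) or max(a, b)."""
    op: str
    args: tuple


def get_box_coordinates(output, batch_idx, box_idx, box_start_idx):
    """Runs in the host interpreter and returns four expression nodes."""
    def coord(offset):
        return Load(output, (batch_idx, box_idx, f"{box_start_idx} + {offset}"))

    # Offsets 0..3 hold the two x values and two y values, interleaved.
    x0, y0, x1, y1 = (coord(k) for k in range(4))
    box_l = Call("min", (x0, x1))
    box_t = Call("min", (y0, y1))
    box_r = Call("max", (x0, x1))
    box_b = Call("max", (y0, y1))
    return box_l, box_t, box_r, box_b


# The same helper serves both boxes, which removes the duplicated blocks.
a = get_box_coordinates("output", "batch_idx", "box_a_idx", "box_start_idx")
b = get_box_coordinates("output", "batch_idx", "box_b_idx", "box_start_idx")
print(a[0].op, a[2].op)
print(a[0].args[0].indices[1], b[0].args[0].indices[1])
```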


