
Posted to dev@tvm.apache.org by Matt Barrett via Apache TVM Discuss <no...@discuss.tvm.ai> on 2020/11/09 14:58:30 UTC

[Apache TVM Discuss] [Development/RFC] [RFC] Refactor the compile_engine to expose a Relay -> TE translator


The current design of the compile_engine utilises ScheduleGetter to translate a primitive function into a scheduled tensor expression. However, because it is an all-in-one pass, it is directly coupled to the schedules defined in TOPI. It would be more useful to split this into two stages: one that converts the Relay function into an unscheduled TE graph, and another that applies the TOPI-derived scheduling. We can then expose the Relay → TE translation step so that it can be reused by alternative scheduling approaches, for instance the cascading scheduling I outlined [here](https://discuss.tvm.apache.org/t/rfc-cascade-scheduling/8119).

In particular, I propose creating a TETranslator pass (deriving from MemoizedExprTranslator) and reducing the scope of ScheduleGetter so that it is just an ExprVisitor which picks out the anchor implementation and function name. The TETranslator would then be exposed as an API that other components could reuse.
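As a rough illustration of the two-stage split, here is a toy model in plain Python. It is not TVM code: `ComputeDAG`, `Scheduled` and all function names are made up for this sketch, and the "DAG" is just a list of op names. The point it tries to show is that stage 1 makes no scheduling decisions, so more than one scheduler can consume the same output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComputeDAG:
    """Toy stand-in for an unscheduled TE graph: an ordered list of op names."""
    ops: List[str] = field(default_factory=list)


@dataclass
class Scheduled:
    """Toy stand-in for a scheduled result: the DAG plus the scheduler's name."""
    dag: ComputeDAG
    scheduler: str


def translate_to_compute_dag(relay_ops: List[str]) -> ComputeDAG:
    """Stage 1 (the role proposed for TETranslator): no scheduling decisions."""
    return ComputeDAG(ops=list(relay_ops))


def topi_style_schedule(dag: ComputeDAG) -> Scheduled:
    """Stage 2, default path: stands in for applying the TOPI-derived schedule."""
    return Scheduled(dag=dag, scheduler="topi")


def cascade_style_schedule(dag: ComputeDAG) -> Scheduled:
    """Stage 2, alternative path: a different scheduler reusing the same DAG."""
    return Scheduled(dag=dag, scheduler="cascade")


# Both schedulers start from the same translation result.
dag = translate_to_compute_dag(["nn.conv2d", "add", "nn.relu"])
default_result = topi_style_schedule(dag)
alternative_result = cascade_style_schedule(dag)
```

In this model the translation runs once and either scheduler can be applied afterwards, which is the reuse the refactor is aiming for.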

If we agree that this change would be valuable, then there is a question over how to name the Relay → TE translator component and where it should live. Here's my current strawman:

* TETranslator as a new pass in backend/compile_engine.cc
* Expose the translator as a global with:

```
TVM_REGISTER_GLOBAL("relay.backend._TranslateToTE")
    .set_body_typed([](Function prim_func, Target target) {
      auto translator = TETranslator(target);
      return translator.Translate(prim_func);
    });
```

* Create a Python API in compile_engine.py called 'translate_to_te'
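A minimal sketch of what the Python side could look like, assuming the global above is registered under exactly that name. `tvm.get_global_func` and `tvm.target.Target` are existing TVM calls; the wrapper body, its docstring and the string-to-Target conversion are assumptions made for illustration, and the WIP PR may do this differently.

```python
def translate_to_te(prim_func, target):
    """Translate a primitive Relay function into unscheduled TE (sketch only).

    Assumes the C++ global "relay.backend._TranslateToTE" from the strawman
    has been registered. The return type is whatever TETranslator::Translate
    returns, which this sketch does not assume.
    """
    import tvm  # imported lazily so the sketch can be read without TVM installed

    # Assumption: accept a target string for convenience and convert it.
    if isinstance(target, str):
        target = tvm.target.Target(target)

    # get_global_func raises if the name has not been registered.
    translate = tvm.get_global_func("relay.backend._TranslateToTE")
    return translate(prim_func, target)
```

Looking the global up by name inside the function keeps the sketch independent of how compile_engine.py imports its backend module.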

I've pushed a WIP PR with this strawman which you can find [here](https://github.com/apache/incubator-tvm/pull/6888).

Thanks





---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-refactor-the-compile-engine-to-expose-a-relay-te-translator/8417/1) to respond.


Posted by Matt Barrett via Apache TVM Discuss <no...@discuss.tvm.ai>.


@Hzfengsy @spectrometerHBH I'd be interested to hear your thoughts on this as I imagine it could have some overlap with the work you're doing on TensorIR.





---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-refactor-the-compile-engine-to-expose-a-relay-te-translator/8417/5) to respond.


Posted by "Cody H. Yu via Apache TVM Discuss" <no...@discuss.tvm.ai>.


Thanks for the clarification; I think we are now on the same page. As for the StrategySelector idea, I don't have a concrete proposal yet and would like to hear opinions from others as well.





---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-refactor-the-compile-engine-to-expose-a-relay-te-translator/8417/4) to respond.


Posted by Matt Barrett via Apache TVM Discuss <no...@discuss.tvm.ai>.


> it seems you need to call `lower_call` twice (one in `TETranslator` and another in `ScheduleGetter`). In this case, seems like you still select the schedule in `TETranslator`

Yes, I do call it twice, and this is really a consequence of 'lower_call' probably needing a similar refactor as well. What I'm actually doing is ignoring the schedule information in TETranslator, even though lower_call does provide it. That way, the output of the TETranslator is just a TE compute DAG rather than a TE schedule. I really like that Ansor acts directly on TE rather than Relay, and I think that's a pattern to work towards for scheduling optimizations.

> it sounds weird to select a compute by referring to the quality of its corresponding TOPI schedule which you won’t apply.

I agree with this. Perhaps we would need to provide TETranslator with a 'StrategySelector' that could be customized? For my envisioned use-case this happens to not be an issue, but I'd be interested in hearing opinions.
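To make the 'StrategySelector' idea a bit more concrete, here is a toy sketch in plain Python. Nothing in it is TVM API: `Implementation`, `StrategySelector`, the selector functions and the plevel numbers are all invented for illustration. It only shows the shape of a pluggable policy, where the default picks by priority level and a caller can swap in a different rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Implementation:
    """Toy stand-in for an op strategy entry: a compute paired with a schedule."""
    name: str
    plevel: int  # priority level attached to the implementation (made-up values)


# A selector receives the candidate implementations and returns one of them.
StrategySelector = Callable[[List[Implementation]], Implementation]


def select_by_plevel(candidates: List[Implementation]) -> Implementation:
    """Default-like behaviour: take the highest priority level."""
    return max(candidates, key=lambda impl: impl.plevel)


def prefer_plain_compute(candidates: List[Implementation]) -> Implementation:
    """Custom policy: avoid computes that exist only for a specialised schedule."""
    plain = [impl for impl in candidates if "winograd" not in impl.name]
    # Fall back to all candidates if the filter removed everything.
    return select_by_plevel(plain or candidates)


def pick_compute(candidates: List[Implementation],
                 selector: StrategySelector = select_by_plevel) -> str:
    """What a translator could do: ask the selector, keep only the compute."""
    return selector(candidates).name


conv2d_impls = [
    Implementation("conv2d_nchw", plevel=10),
    Implementation("conv2d_nchw_winograd", plevel=15),
]
default_choice = pick_compute(conv2d_impls)                        # by plevel
custom_choice = pick_compute(conv2d_impls, prefer_plain_compute)   # custom rule
```

With the default selector the schedule-driven priority still decides the compute; with the custom one, a scheduler that will not use the TOPI schedule can ask for a compute that does not depend on it.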

In summary, I agree that there's probably some distance to go in completing this refactor to expose something truly flexible. Once I get some more views and opinions on the best direction to take, I can make a start on improving the WIP PR.





---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-refactor-the-compile-engine-to-expose-a-relay-te-translator/8417/3) to respond.


Posted by "Cody H. Yu via Apache TVM Discuss" <no...@discuss.tvm.ai>.

Thanks for the RFC! I do agree that it would be great to create an additional path to improve the flexibility, especially now that we have auto_scheduler to schedule a TE graph from scratch (cc @merrymercy).

Meanwhile, I think the tight coupling between the selection of schedule and compute is intentional, because an advanced schedule needs a specialized compute (e.g., NCHWc, Winograd); that's why Relay op strategy was designed to select both compute and schedule in one place (cc @haichen). Could you elaborate a bit more on this part? IIUC from your WIP PR, it seems you need to call `lower_call` twice (one in `TETranslator` and another in `ScheduleGetter`). In this case, seems like you still select the schedule in `TETranslator` (https://github.com/apache/incubator-tvm/blob/ed4cedce02a6ff608626bc61dfff6fc6f98004c9/src/relay/backend/compile_engine.cc#L259), and you perform the same process when visiting the primary function (https://github.com/apache/incubator-tvm/blob/ed4cedce02a6ff608626bc61dfff6fc6f98004c9/src/relay/backend/compile_engine.cc#L266)?

A follow-up point: since you still call `lower_call` in `TETranslator`, Relay op strategy is still required to register the mapping from Relay ops to TE computes. Accordingly, the logic of selecting a compute is still based on the schedule quality (or `plevel` by default), which does not seem to improve the flexibility but just lets you apply another schedule to the selected compute. Although it seems to me that this is the main purpose of this RFC, we should have another mechanism to determine computes in `TETranslator`; otherwise it sounds weird to select a compute by referring to the quality of its corresponding TOPI schedule which you won't apply.

Also cc @zhiics

---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-refactor-the-compile-engine-to-expose-a-relay-te-translator/8417/2) to respond.


Posted by "Cody H. Yu via Apache TVM Discuss" <no...@discuss.tvm.ai>.


Another requirement I have for the general TE translator is support for arbitrary Relay functions, including Relay functions with more than one reduce op (e.g., conv2d). The current compile engine doesn't allow this pattern because it selects one schedule implementation per Relay function, but this should no longer be a limitation if you are going to decouple the selection of compute and schedule. However, we probably don't have to cover it in this RFC if that's out of scope for you.
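A toy model of the one-implementation-per-function restriction, in plain Python. This is not the compile engine's code: `OpPattern`, its numeric values and `pick_anchor` are made up for illustration. It only sketches why a function with two reduction-like ops is rejected when a single anchor op has to supply the schedule.

```python
from enum import IntEnum
from typing import List, Tuple


class OpPattern(IntEnum):
    """Simplified fusion-pattern ranking; the numeric values are illustrative."""
    ELEMWISE = 0
    BROADCAST = 1
    INJECTIVE = 2
    COMM_REDUCE = 3
    OUT_ELEMWISE_FUSABLE = 4


def pick_anchor(ops: List[Tuple[str, OpPattern]]) -> str:
    """Choose one anchor op; reject functions with two reduction-like ops."""
    complex_ops = [name for name, pattern in ops
                   if pattern >= OpPattern.COMM_REDUCE]
    if len(complex_ops) > 1:
        raise ValueError(f"more than one complex op in function: {complex_ops}")
    # The op with the highest pattern decides the schedule for the function.
    return max(ops, key=lambda item: item[1])[0]


# One conv2d followed by a broadcast add: accepted, conv2d is the anchor.
single = pick_anchor([
    ("nn.conv2d", OpPattern.OUT_ELEMWISE_FUSABLE),
    ("add", OpPattern.BROADCAST),
])

# Two conv2d ops in one function: rejected under the single-anchor rule.
try:
    pick_anchor([
        ("nn.conv2d", OpPattern.OUT_ELEMWISE_FUSABLE),
        ("nn.conv2d", OpPattern.OUT_ELEMWISE_FUSABLE),
    ])
    rejected = False
except ValueError:
    rejected = True
```

A translator that only emits computes would have no need for the anchor check, since no single schedule has to be chosen at that stage.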





---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-refactor-the-compile-engine-to-expose-a-relay-te-translator/8417/6) to respond.
