
Posted to dev@tvm.apache.org by Matthew Brookhart via Apache TVM Discuss <no...@discuss.tvm.ai> on 2020/11/05 04:33:27 UTC

[Apache TVM Discuss] [Development/RFC] [RFC] TVM Object Schema DSL

I've been looking at the PR and some of the discussion, and I thought I'd bring my thoughts back to this RFC, since it seems like a better place for broader design thoughts.

First, thanks for the RFC, @ziheng. There is definitely waaaay too much boilerplate in TVM right now, and finding ways to streamline that will help development in the future.

I'm still a little confused about what the exact goal of this RFC is.

It seems like the current design is a setup tool: you write the schema once, execute it once, and then throw away the schema code. After that, you edit and check in the generated code. At most, I think that would save 15-30 minutes of development time per new datatype introduced, and I'm not sure that's really worth the complexity of parsing the Python AST.

If we want to move to a situation where we remove the boilerplate code from the repository and generate it on every build, that becomes a more complicated question. First, declarative code like that can be very difficult to debug, and it places enormous pressure on the correctness of the parser implementation. Second, if we do want to move to a system where we automatically generate more of the bindings, I really don't think it should be in Python. The more we write core TVM functions in Python, the less portable the entire system becomes for lower-level production uses and more resource-constrained systems.

I guess I'm not sure I fully understand the problem this is trying to solve?

Thanks,
Matthew

---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-tvm-object-schema-dsl/7930/13) to respond.

You are receiving this because you enabled mailing list mode.

To unsubscribe from these emails, [click here](https://discuss.tvm.apache.org/email/unsubscribe/ae65a4234ac8a233e238bb86cdf7a19a8e4223f9c1b23333950b23a3ee078ef0).

[Apache TVM Discuss] [Development/RFC] [RFC] TVM Object Schema DSL

Posted by tqchen via Apache TVM Discuss <no...@discuss.tvm.ai>.

First of all, given that the schema generation itself is decoupled as a frontend, there won't be a problem for lower-level production systems, as the objects themselves are still present as part of the C++ code and built into the system. The schema generation is run separately, just like clang-format (and the language used to implement the tool matters less).

One thing that a lot of the discussion rightfully points out is that it is hard to build a generator that handles method binding in a language-agnostic way. Given that consideration and the cost of complete generation mentioned above, the current proposal focuses on the problem that we can solve, namely data layout generation. Notably, the proposal uses a more in-place generation process, which means that we can start from the codebase as it is right now, gradually add objects to the schema, and make use of the feature without a disruptive transition. The tool will also work in a clang-format style, which means it can be invoked repeatedly to complete the regions that need to be completed.
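
As a rough illustration of the in-place, clang-format-style flow described above, here is a minimal sketch. The `// schema-begin: Name` and `// schema-end` markers, the `regenerate` helper, and the sample field text are all hypothetical; the thread does not specify a marker syntax. The point is only that running the tool twice gives the same result and that hand-written code outside the markers is left alone.

```python
import re

# Hypothetical marker syntax; the RFC thread does not define one.
REGION = re.compile(
    r"(?P<begin>^[ \t]*// schema-begin: (?P<name>\w+)[ \t]*\n)"
    r"(?P<body>.*?)"
    r"(?P<end>^[ \t]*// schema-end[ \t]*\n)",
    re.DOTALL | re.MULTILINE,
)


def regenerate(source, generated):
    """Rewrite only the marked regions; leave hand-written code untouched."""

    def fill(match):
        name = match.group("name")
        if name not in generated:
            return match.group(0)  # no schema entry for this region: keep it as-is
        return match.group("begin") + generated[name] + match.group("end")

    return REGION.sub(fill, source)


# A C++ header with one annotated region and one hand-written method.
cpp = (
    "class AddNode : public Object {\n"
    " public:\n"
    "  // schema-begin: AddNode\n"
    "  // schema-end\n"
    "  void HandWritten();\n"
    "};\n"
)
# Text that a layout generator might produce for the region (illustrative only).
fields = {"AddNode": "  PrimExpr a;\n  PrimExpr b;\n"}

once = regenerate(cpp, fields)
twice = regenerate(once, fields)  # a second run changes nothing
```

Because the region body is replaced rather than appended to, the operation is idempotent, which is what makes repeated invocation safe.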

Now back to the overall problems and complexity. There are a few considerations:

- C0: We are adding more language bindings, and would want quick data structure accessors for these language bindings (e.g. Rust, or even a first-class Cython-based member accessor).
- C1: As we introduce more typing into TIR, we want to enable direct access to the data structures from the generated code (allowing TIR to access runtime::Array and any IR nodes), which would require a data layout schema for these data structures.
- C2: As we start to enhance the Python side of the frontend, we eventually want users to be able to declare their ADTs in Python, as part of an enhanced TVMScript.

While it is true that keeping the current C++-only binding would not gain a lot from schema generation, there are additional gains in the area of C0. More importantly, a schema is a prerequisite for enabling C1. Notably, the compiler does not have to depend on Python to make use of C1, as we can still generate layout info into a backend language and register it there. But Python could be a quick starting point.

Of course, the above considerations do not force us to use the Python AST as one of the frontends to the schema. C2 is certainly one motivation to enable this route. Notably, there is no intention to support arbitrary Python. As with TVMScript, we want to define a clear syntax for the data layout itself, which is critical to the enablement above. We also acknowledge that it is simply not possible to define a language that handles method definitions and bindings well in all languages, so we still allow developers to edit directly in the target language. Notably, most of the objects of interest (IR objects) intentionally do not have method functions. While there is certainly a bit of complexity being brought in via AST parsing, the goal of a clear Pythonic syntax (defined by ourselves) is manageable and aligned with the first-class Python support philosophy.
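
To make the "clear syntax, not arbitrary Python" idea concrete, here is a minimal sketch that uses the standard `ast` module to accept only class declarations with `name: Type` fields and reject everything else. The schema syntax shown, and the `parse_schema` helper, are assumptions made for illustration and are not the syntax proposed in the RFC. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast

# Hypothetical schema text; it is parsed, never executed.
SCHEMA = '''
class AddNode(PrimExprNode):
    a: PrimExpr
    b: PrimExpr
'''


def parse_schema(text):
    """Return {class_name: (base_name, [(field, type), ...])}.

    Anything outside the small declarative subset raises SyntaxError.
    """
    layouts = {}
    for node in ast.parse(text).body:
        if not isinstance(node, ast.ClassDef):
            raise SyntaxError("only class declarations are allowed")
        if len(node.bases) != 1 or not isinstance(node.bases[0], ast.Name):
            raise SyntaxError(f"{node.name}: exactly one named base is required")
        fields = []
        for stmt in node.body:
            is_field = (
                isinstance(stmt, ast.AnnAssign)
                and isinstance(stmt.target, ast.Name)
                and stmt.value is None
            )
            if not is_field:
                raise SyntaxError(f"{node.name}: only 'name: Type' fields are allowed")
            fields.append((stmt.target.id, ast.unparse(stmt.annotation)))
        layouts[node.name] = (node.bases[0].id, fields)
    return layouts


layouts = parse_schema(SCHEMA)
```

Rejecting methods outright matches the point above that method definitions are left to the target language rather than the schema.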

Of course, our other core philosophy is to not get in the way of developers and use cases. If the introduction of the Python frontend hampers a developer's ability to define a custom object, or to port an application to resource-constrained devices and/or languages, then we would need to think more carefully about it. My understanding is that the current proposal does not impose constraints in that regard.

Moreover, the explicit design choice of in-place generation (e.g. the clang-format approach) instead of full generation greatly eases the path for adoption and transition. The codebase can stand without the schema tool and continue to add objects manually. The annotated regions get generated (and checked via a linting pass) as we gradually add objects that require schema generation. The generated code is checked in as part of the codebase, alleviating the concern about the complexity of a full generator system. While I understand that there might be a desire to push for a full-fledged generator, we do feel that the strong need for customization and gradual adoption makes this path a better one with less resistance.
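
One way such a linting pass could work, sketched under the assumption that the check simply compares the checked-in file with freshly regenerated text: an empty diff means the file is up to date. The `lint_generated` helper and the file name used below are hypothetical.

```python
import difflib


def lint_generated(path, checked_in, regenerated):
    """Return a unified diff as a list of lines; an empty list means up to date."""
    return list(
        difflib.unified_diff(
            checked_in.splitlines(keepends=True),
            regenerated.splitlines(keepends=True),
            fromfile=path + " (checked in)",
            tofile=path + " (regenerated)",
        )
    )


# Hypothetical file name and contents, for illustration only.
clean = lint_generated("expr.h", "PrimExpr a;\n", "PrimExpr a;\n")
stale = lint_generated("expr.h", "PrimExpr a;\n", "PrimExpr a;\nPrimExpr b;\n")
```

A CI job could fail whenever the returned list is non-empty and print it, so the developer sees exactly which generated lines are out of date.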

---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-tvm-object-schema-dsl/7930/14) to respond.
