You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tvm.apache.org by Christopher Sidebottom <no...@github.com.INVALID> on 2022/05/11 10:46:08 UTC

[apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

You can view, comment on, or merge this pull request online at:

  https://github.com/apache/tvm-rfcs/pull/71

-- Commit Summary --

  * Add Target Pre-processing RFC

-- File Changes --

    A rfcs/0070-target-preprocessing.md (208)

-- Patch Links --

https://github.com/apache/tvm-rfcs/pull/71.patch
https://github.com/apache/tvm-rfcs/pull/71.diff

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71
You are receiving this because you are subscribed to this thread.

Message ID: &lt;apache/tvm-rfcs/pull/71@github.com&gt;

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Andrew Reusch <no...@github.com.INVALID>.
great! so I think we just need some updates to the text of the RFC based on discusison, then we are good to merge.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1134999125
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Eric Lunderberg <no...@github.com.INVALID>.
> Based on my previous re-review of LLVM, thanks to @tqchen, it might help to use my_target.features.dsp rather than my_target.arch.has_dsp and clarifying these are features available to the Target? What do you think?

I like that, and the renaming makes it clear which are boolean parameters and which are variable attributes.  That would also be useful for cleaning up the [Vulkan target](https://github.com/apache/tvm/blob/main/src/target/target_kind.cc#L341), which right now has a large number of boolean attributes that would be better expressed as feature support.

The renaming would also help with avoiding architecture-specific checks at the code review stage.  Since the information defined by the architecture would be pulled over into `my_target.features`, checks that directly access `my_target.arch` should only occur for case (1).

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1130250850
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Christopher Sidebottom <no...@github.com.INVALID>.
> Actually, the target JSON only shows the name of the kind, so the target parser (if it operates on the JSON) won't see the details of the kind, unless they are included in the JSON. I think that expanding the contents of the target JSON to contain the kind specifics as well is critical for this feature to make sense. Please clarify of that's a part of the plan.

Hi @kparzysz-quic, can you expand on what's missing here? According to https://discuss.tvm.apache.org/t/rfc-tvm-target-specification/6844, the Target JSON includes the `kind`, `attrs` and `keys`.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1185688020
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Tianqi Chen <no...@github.com.INVALID>.
Thanks @Mousius . Given these fields are pretty relevant to  compiler configurations in traditional domain, it would be nice to also discuss prior approaches(e.g. where those fields normally sits in say LLVM) for reference

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1124381564
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Christopher Sidebottom <no...@github.com.INVALID>.
> Thanks @Mousius . Given these fields are pretty relevant to compiler configurations in traditional domain, it would be nice to also discuss prior approaches(e.g. where those fields normally sits in say LLVM) for posterity. This would also help us to make meaningful choices that aligns with existing terminologies. My quick read is that they seems to be aligned, but would be nice to double check.

Good idea @tqchen, one thing I've picked out from a second read of the LLVM approach is the term `Features` rather than `Arch` which we could potentially align on. I've updated the Prior Art section to link out to what I've seen as the flow in LLVM, though I am by no means an LLVM expert.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1127465782
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Christopher Sidebottom <no...@github.com.INVALID>.
Hi @junrushao1994, thanks for the elaborate reply 😸  I don't want to debate our personal principles but I appreciate you sharing them and will reference them where I can.

> **Current `arch`-specifc checks.** Most of the 6 groups of `arch`-specific helper functions, mentioned in the "Motivation" section, are developed by concurrent efforts from multiple parties, and therefore I would say fragmentation is almost a certain thing to happen. On the other hand, fragmentation of those helper functions, which are indeed ad-hoc, is currently confined locally as independent checks without yet polluting the design of global infrastructure.

Yip, this fragmentation occurs due to a lack of a standardised mechanism for these groups to use, which the RFC aims to provide - I'm not sure why we consider that pollution given it should have a positive impact on all groups and aids in providing customisation through a well defined route established by all vendors, providing simplicity and customisation (A1, A2 and A3).

> Existing arch-like attributes. Note that the design of Target attributes in TargetKind is intended to describe the capability of hardware, according to the [Target RFC](https://discuss.tvm.apache.org/t/rfc-tvm-target-specification/6844):

I believe this is still held given the crossover between hardware and code generators, for `c`/`llvm` you will still be able to pass `mattr`/`mcpu`/etc which are the standard way to specify CPU, ISA and extensions in existing compilers (A1). This is also following the TVM Guideline of "Be consistent with existing well-known package’s APIs if the features overlap.
  For example, tensor operation APIs should always be consistent with the numpy API.".

> Target tag system. Given the fact that existing hardware models are enumerable, the [Target RFC](https://discuss.tvm.apache.org/t/rfc-tvm-target-specification/6844) proposes to use "tags" to allow easy creation of targets. For example, Target("nvidia/jetson-agx-xavier") gives full specification of this device, including the cuda target and the ARM host. At the time of writing, it only takes 200 tags to describe [all the CUDA hardware](https://github.com/apache/tvm/blob/a6a34046c432b3766e7c32bbd85c098812a12a68/src/target/tag.cc#L107-L348).

> **What on earth are `Target`s.** Actually, `target` in TVM not only refers to the hardware, but also the codegen targets. For example, LLVM targets means TVM codegens to LLVM, and let LLVM do the rest of compilation. CUDA targets codegens to CUDA source code (i.e. `*.cu` files), and invokes NVCC for compilation.

There's definitely an existing amount of confusion in the `Target` system, but I think it's even more confused than this by looking at the tagged entries, such as:

```
TVM_REGISTER_TARGET_TAG("raspberry-pi/4b-aarch64")
    .set_config({{"kind", String("llvm")},
                 {"mtriple", String("aarch64-linux-gnu")},
                 {"mcpu", String("cortex-a72")},
                 {"mattr", Array<String>{"+neon"}},
                 {"num-cores", Integer(4)},
                 {"host", Map<String, ObjectRef>{{"kind", String("llvm")},
                                                 {"mtriple", String("aarch64-linux-gnu")},
                                                 {"mcpu", String("cortex-a72")},
                                                 {"mattr", Array<String>{"+neon"}},
                                                 {"num-cores", Integer(4)}}}});
```

Which defines a system configuration that then uses a code generator `llvm` with a specific CPU profile, therefore the `Target` system can represent **at minimum** 3 distinct layers: systems/hardware/code generators. Given the principle of having to understand a minimal amount (A2), `Target` needs to be streamlined into understandable parts. This is a digression from the actual RFC, as the features are pre-processed from the tagged `Target`s information.

The problem that I've faced trying to figure out how to implement this is that `llvm` takes `mcpu`/`mattr`/etc which are used to infer features later where-as the `cuda` `Target` has a more complete reflection of the `attrs` available for CUDA. These inconsistencies make it difficult for a user to approach TVM and that isn't requiring a developers to learn the bare minimum (A2). It also diverges from other compilers where you'd expect `mcpu`/`mtriple`/etc to infer much of this information for you (A1).

> Do we need multiple target parsers? My answer is no. If the parsers are maintained by different vendors separately on their own interest, then they could decide how to implement parsers for "keys", "arch", together, without conflicting with other contributors. Therefore, I would say it's already consistent with our principle A3 without having to develop multiple parsers.

> Naming choice. When implementing the Target RFC, I came up with the preprocessor to be backward compatible with existing functionalities that auto-detects the local enviroment (e.g. cuda version), which I have to admit it's probably not the best name. In the context of our RFC, we might want to use names like target_parser instead to be more specific.

I don't mind using one parser, I was just making it more granular and avoiding stepping on peoples toes with the existing pre-processor. Either design seems to fulfil your principle of customisability (A3), more granularly means you only have to implement a small piece of customisation where-as with the single parser it requires some thought as to how to cater for all attributes within a given `TargetKind` parser.

> **Where to dispatch target parsers.** Currently, the preprocessor is dispatched solely based on `TargetKind`, which is admittedly a bit limited and overlooked the possiblity that `aarch64` and `x86` may need completely different parsers. Therefore, here we would love to raise a question: based on which attribute the parser should be dispatched? Previous wisdom in clang seems to suggest dispatching according to `mtriple` (IIUC), so shall we introduce anything similar, for example, dispatching based on `--device aarch64-foo-bar`?

This requires developers to learn more than the bare minimum (A2), as users are required to learn the specifics of how to specify a `Target` in TVM. If an `llvm` `Target` supports `-mcpu`/`-mattr`/`-mtriple` then all of these fields can be reasonably expected to allow the `Target` to infer features given the relation to GCC/LLVM (A1). 

> Distinguishing arch and attrs. Note that the number of possible attributes grow less frequently as we might expect. Also, there is no fundamental difference between arch and attrs given arch is special attrs, and target is designed for this according to point C5 in the [Target RFC](https://discuss.tvm.apache.org/t/rfc-tvm-target-specification/6844). Therefore, I would personally prefer a more flattened access pattern, and therefore we do not need to distinguish arch and attrs.

Based on LLVM, the `Features` inferred from a given CPU/GPU profile can be extensive already, which would lead to a drastic increase in `attrs` if we were to track them this way. For example: 

* https://github.com/llvm/llvm-project/blob/09c2b7c35af8c4bad39f03e9f60df8bd07323028/llvm/lib/Support/X86TargetParser.cpp
* https://github.com/llvm/llvm-project/blob/09c2b7c35af8c4bad39f03e9f60df8bd07323028/llvm/lib/Support/AArch64TargetParser.cpp

If we were to attach all of these directly to `Target` (i.e. `llvm`), that would drastically increase the number of available fields and in all cases only a subset would be used - specific to a given CPU/GPU profile. Therefore I would strongly advise following the pattern (as in LLVM) of using a separate structure for this, clear boundaries provide a much more straight forward developer experience (A2) and allow us to do things as @Lunderberg suggested like plucking just `features` or `arch` to give to `Pass`es rather than the entire `Target`.

> **Folder structure.** According to principle A3, I would propose that different vendors may maintain their own TargetKind and parsers separately, for example, aarch64 could be maintained using:
> 
> * `src/target/target/aarch64/parser.cc`
> * `src/target/target/aarch64/tags.cc`
> * `src/target/target/aarch64/kind.cc`
> * `src/target/target/aarch64/helpers.cc`
>   Where the last item provides pre-defined packed functions mentioned in the "Motivation" section.

I believe the proposed layout also conforms to A3 as it already details two different pre-processors, and you can easily imagine it extending:

* src/target/preprocessors/aarch64.cc
* src/target/preprocessors/cpu.cc
* src/target/preprocessors/cuda.cc
* src/target/preprocessors/woofles.cc
* src/target/preprocessors/x86_64.cc

The composition of `CPU` calling `AArch64` or `x86_64` conditionally also provides multiple vendors to provide pre-processing functionality to `llvm` by virtue of being able to call their own functions within the central `CPU` pre-processor.

# Outcomes
Will try and summarise the above into changes in the RFC which I can implement 😸 

## Single Parser
I can re-word the RFC to use a single parser (`target_parser`) rather than the three separate ones, using the syntax you suggested:

```c++
using TargetJSON = Map<String, ObjectRef>;
using FTVMTargetParser = TypedPackedFunc<TargetJSON(TargetJSON)>;
```

## `Target` Field
I'm strongly in favour of having `target.features` to represent this, do you actively object to this?

## Location
I believe the `src/target/parsers` folder would provide for maximum customisation given parsers can be re-used between `TargetKind`s and therefore serves as a good starting point? Also happy for an RFC in 6 months to move everything to a new folder if we re-factor the `Target`s 😸 

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1132872196
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
Actually, the target JSON only shows the name of the kind, so the target parser (if it operates on the JSON) won't see the details of the kind, unless they are included in the JSON.  I think that expanding the contents of the target JSON to contain the kind specifics as well is critical for this feature to make sense.  Please clarify of that's a part of the plan.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1185647795
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Andrew Reusch <no...@github.com.INVALID>.
i'm supportive of this idea. i didn't want to derail this RFC by discussing @kparzysz-quic 's idea about the architecture of the llvm target here, but i think that would be a good follow-up discussion.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1127894799
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Christopher Sidebottom <no...@github.com.INVALID>.
> I think this is generally ok. I suggest elaborating a bit more in the text of the RFC that the preprocessors apply to the target kind, and that for all architectures, the code specific to each architecture would need to be handled as a part of the common preprocessor. The use of names like `MyArchPreprocessor` and `MyKeysPreprocessor` suggests that there can be multiple specialized preprocessors. I'd replace them with `CArchPreprocessor` and `CKeysPreprocessor`, since they both show up in the `c` target kind.

Ideally I'd like to convey the ability to attach it to any `TargetKind` rather than implying it's designed for a specific `TargetKind`, maybe `CPUArchPreprocessor` which can be re-used between `llvm` and `c` which calls out to `AArch64ArchPreprocessor` or `x86ArchPreprocessor`?. You can use a single level pre-processor if that's appropriate (i.e. for `rocm`), which would just be `ROCmArchPreprocessor`. I can outline both cases if that'd help? 

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1129796641
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Tianqi Chen <no...@github.com.INVALID>.
Thanks folks for discussions. I think they summarizes to the following points

- Q0: Subfield grouping (e.g. features) or simply leave as top-level attrs
- Q1: Folder structure: `target/preprocessors/cuda.cc` vs `target/cuda/cuda_preprocessor.cc`
     - Note that code-reuse is less likely going to be an issue as there can be common includes among sub-folder structures. 
- Q2: How to reconcile attributes specified already for (part of) subfield group, `llvm` can use `mattr` to specify the feature necessary. What happens to CUDA(e.g. we proposing to group hw related limitations to features), and how can we specify some of the existing feature fields during tag registration.



-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1133013784
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Christopher Sidebottom <no...@github.com.INVALID>.
In the spirit of keeping this simple to review, the text of this RFC has been reformulated to cover the `target_parser` replacement for `preprocessor` which @junrushao1994 proposed - I've raised a follow on for the `features` additional to the `Target` so that can also be reviewed in isolation with updates from the discussion here: https://github.com/apache/tvm-rfcs/pull/78

Hopefully this makes it simpler for everyone 😸 

cc @kparzysz-quic @junrushao1994 @tqchen @areusch @Lunderberg

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1151200855
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Junru Shao <no...@github.com.INVALID>.
Thanks @Mousius for drafing this RFC!

First of all, I completely agree on the importance to handle `arch`-specific checks. Use our experience as an example, on CUDA we might want to check if the PTX intrinsic `cp.async.commit_group` is available on certain architecture before tensorizing using that PTX intrinsic, and the existing approach is to add extra flags in `TargetKind`.

To motivate discussion, I would love to provide more context on the system design, and then talk about some specific points I noticed in the current RFC, and finally propose some ideas inspired by this RFC.

## Background

**Design principles.** Just wanted to share some of my design principles when developing TVM, the compiler I believe that aims to work across hardware models:
- A1. Generic. By design, TVM serves the common interest of all vendors. Therefore, it would be great if we could discuss broadly how a design choice would benefit all hardware platforms, including ARM, NV, Intel, Qual, etc.
- A2. Minimal. Ideally, practitioners are supposed to learn only bare minimum to participate in TVM development.
  - Take TIR as an example, for a developer who knows the IR and passes to use and develop TIR, the only extra concept is `Target` .
  - Indeed Relay is in a less ideal state, but Relax is designed to address this issue so I wouldn't worry too much
- A3. Customizable. We want to develop the infrastructure to provide customization opportunities for all vendors to independently work on their extension in a collaborative way, making TVM a platform for everyone. A few examples:
  - TIR scheduling is heading towards this direction that all the schedule primitives are decoupled with each other, so that vendors could develop their own primitives without affecting each other.
  - TVMScript is going to be re-designed this way treating TIR and Relax as independent dialects while the core infra itself supports any 3rdparty IR.
  - Admittedly Relay is much limited with the assumption of Relay => TE => TIR lowering path, with some hooks to hack in other compilation paths, but Relax is going to change this so again I wouldn't worry...

**Current `arch`-specifc checks.** Most of the 6 groups of `arch`-specific helper functions, mentioned in the "Motivation" section, are developed by concurrent efforts from multiple parties, and therefore I would say fragmentation is almost a certain thing to happen. On the other hand, fragmentation of those helper functions, which are indeed ad-hoc, is currently confined locally as independent checks without yet polluting the design of global infrastructure.

**Special target attributes.** Below are a few existing target attributes that serve special semantics, the design of which I don't fully agree with, but now are preserved as legacy:
- **keys.** This is used to guide TOPI dispatch. For example, "llvm --keys=arm_cpu,cpu" first finds if there is any `GenericFunc` registered with `arm_cpu`, and if not, it falls back to `cpu`.
- **libs.** This is used in TOPI to dispatch to vendor library. For example, "cuda -libs=cudnn" prefers dispatching to cuDNN.
- **device.** and **model.** These two sometimes control the behavior of auto-tuning.

**Existing `arch`-like attributes.** Note that the design of `Target` attributes in `TargetKind` is intended to describe the capability of hardware, according to the [Target RFC](https://discuss.tvm.apache.org/t/rfc-tvm-target-specification/6844):
> Pass in additional target-specific attributes like ISA/library extension to a target.
Therefore, currently all the `arch`-like flags are centered around `Target`. As an example:
- The [Vulkan target](https://github.com/apache/tvm/blob/a6a34046c432b3766e7c32bbd85c098812a12a68/src/target/target_kind.cc#L345-L370) comprehensively includes hardware feature support (e.g. whether or not fp16 is supported), physical limits of the device (e.g. max number of threads allowed).
- The [CUDA target](https://github.com/apache/tvm/blob/a6a34046c432b3766e7c32bbd85c098812a12a68/src/target/target_kind.cc#L292-L296) is defined in similar approach but less comprehensive yet. It doesn't require architectural change to grow its attributes.

Note that the number of attributes may grow if there is new hardware feature, but to the best of my knowledge, it could be less common that those hardware features may bloat, mainly because there is concrete cost to grow features on hardwares.

**Target tag system.** Given the fact that existing hardware models are enumerable, the [Target RFC](https://discuss.tvm.apache.org/t/rfc-tvm-target-specification/6844) proposes to use "tags" to allow easy creation of targets. For example, `Target("nvidia/jetson-agx-xavier")` gives full specification of this device, including the cuda target and the ARM host. At the time of writing, it only takes 200 tags to describe [all the CUDA hardware](https://github.com/apache/tvm/blob/a6a34046c432b3766e7c32bbd85c098812a12a68/src/target/tag.cc#L107-L348).

**What on earth are `Target`s.** Actually, `target` in TVM not only refers to the hardware, but also the codegen targets. For example, LLVM targets means TVM codegens to LLVM, and let LLVM do the rest of compilation. CUDA targets codegens to CUDA source code (i.e. `*.cu` files), and invokes NVCC for compilation.

## Discussion

**Naming choice.** When implementing the Target RFC, I came up with the `preprocessor` to be backward compatible with existing functionalities that auto-detects the local enviroment (e.g. cuda version), which I have to admit it's probably not the best name. In the context of our RFC, we might want to use names like `target_parser` instead to be more specific.

**Where to dispatch target parsers.** Currently, the preprocessor is dispatched solely based on `TargetKind`, which is admittedly a bit limited and overlooked the possiblity that `aarch64` and `x86` may need completely different parsers. Therefore, here we would love to raise a question: based on which attribute the parser should be dispatched? Previous wisdom in clang seems to suggest dispatching according to `mtriple` (IIUC), so shall we introduce anything similar, for example, dispatching based on `--device aarch64-foo-bar`?

**Do we need multiple target parsers?** My answer is no. If the parsers are maintained by different vendors separately on their own interest, then they could decide how to implement parsers for "keys", "arch", together, without conflicting with other contributors. Therefore, I would say it's already consistent with our principle A3 without having to develop multiple parsers.

**Function Signature for a target parser.** Note that with the introduction of previous Target RFC, a target object is canonically represented by a JSON-like object. Therefore, the signature could be:
```c++
using TargetJSON = Map<String, ObjectRef>;
using FTVMTargetParser = TypedPackedFunc<TargetJSON(TargetJSON)>;
```
so that it's generic enough and could potentially be useful if vendors in the future want to hack around the Target object. To clarify the discussion [here](https://github.com/apache/tvm-rfcs/pull/71/files#r876079831), I would propose that our parsing code path not go back to string again, i.e.:
- str -> TargetJSON (canonical form) -> TargetJSON (target parser applied) -> Target
- TargetJSON (canonical form) -> TargetJSON (target parser applied) -> Target

**Distinguishing `arch` and `attrs`.** Note that the number of possible attributes grow less frequently as we might expect. Also, there is no fundamental difference between `arch` and `attrs` given `arch` is special attrs, and target is designed for this according to point C5 in the [Target RFC](https://discuss.tvm.apache.org/t/rfc-tvm-target-specification/6844). Therefore, I would personally prefer a more flattened access pattern, and therefore we do not need to distinguish `arch` and `attrs`.

**Folder structure.** According to principle A3, I would propose that different vendors may maintain their own TargetKind and parsers separately, for example, aarch64 could be maintained using:
- `src/target/target/aarch64/parser.cc`
- `src/target/target/aarch64/tags.cc`
- `src/target/target/aarch64/kind.cc`
- `src/target/target/aarch64/helpers.cc`
Where the last item provides pre-defined packed functions mentioned in the "Motivation" section.


-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1132579480
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
> @kparzysz-quic, sorry, I'm still not understanding here - the name of the kind is enough to look it up in the kind registry and gather any kind-specific information. What further details are missing?

I thought the attrs were stored in the kind, but they are in the target itself.  Nevermind.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1185758537
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Andrew Reusch <no...@github.com.INVALID>.
@kparzysz-quic @junrushao1994 @tqchen @Lunderberg can you look at this one again?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1184767286
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Junru Shao <no...@github.com.INVALID>.
@Mousius Thank you so much for your response! This makes lots of sense to me! Also, thanks for including my personal principles in the discussion! It's my personal principles which are completely okay to disagree with :-)

> I'm not sure why we consider that pollution given it should have a positive impact on all groups and aids in providing customisation through a well defined route established by all vendors, providing simplicity and customisation (A1, A2 and A3).

Sorry my words may lead to some potential miscommunication. As I mentioned in the very beginning, I completely agree with the idea of handling `arch` more systematically and it's important to us, so in no means I consider it pollution. I mean the fragmentation itself is not good but hasn't become extremely bad right now, because they are not used globally at infrastructural level, so there is no namespace polluting.

> I believe this is still held given the crossover between hardware and code generators, for c/llvm you will still be able to pass mattr/mcpu/etc which are the standard way to specify CPU, ISA and extensions in existing compilers.

Agreed. It is designed to be consistent with existing wisdom, i.e. LLVM/GCC mattr/mcpu/etc., so I believe it's not controversial.

> There's definitely an existing amount of confusion in the Target system, but I think it's even more confused than this by looking at the tagged entries, such as: .... Which defines a system configuration that then uses a code generator llvm with a specific CPU profile, therefore the Target system can represent at minimum 3 distinct layers: systems/hardware/code generators.

Yes, I think your understanding is correct. Per [Target RFC](https://discuss.tvm.apache.org/t/rfc-tvm-target-specification/6844), this object represents host compiler info (C0), heterogeneous compilation settings (C1), compiler behavior control (C2), additional ISA/library (C4). Therefore, the tag system (C4) is introduced to reduce the cognitive overhead, because the end users can use a simple plain string tag `Target("raspberry-pi/4b-aarch64")` to represent all C0/1/2/4, instead of understanding every detail of the hardware themselves (A2). In other words, to the end users, the vision of having tag system is that they only need a simple string to do configure TVM codegen on their their hardware.

> where-as the cuda Target has a more complete reflection of the attrs available for CUDA. These inconsistencies make it difficult for a user to approach TVM and that isn't requiring a developers to learn the bare minimum (A2). It also diverges from other compilers where you'd expect mcpu/mtriple/etc to infer much of this information for you (A1).

Right, in fact I share the same confusion with you with it comes to CUDA/Vulkan/etc., where (to best of my knowledge) there is no precedent of mcpu/mattr/mtriple/etc (except for LLVM NVPTX backend). Certainly, we should develop unified approach to address this confusion, which could possibly be the `arch` proposed in the RFC.

> I don't mind using one parser, I was just making it more granular and avoiding stepping on peoples toes with the existing pre-processor. Either design seems to fulfil your principle of customisability (A3), more granularly means you only have to implement a small piece of customisation where-as with the single parser it requires some thought as to how to cater for all attributes within a given TargetKind parser.

Thanks for sharing your thoughts! Definitely. The issue of having multiple parsers is that the order of their application is less explicit, so I would be a little bit leaning towards a single one, while the implementation could share common helper functions.

> > Where to dispatch target parsers. Currently, the preprocessor is dispatched solely based on TargetKind, which is admittedly a bit limited and overlooked the possiblity that aarch64 and x86 may need completely different parsers. Therefore, here we would love to raise a question: based on which attribute the parser should be dispatched? Previous wisdom in clang seems to suggest dispatching according to mtriple (IIUC), so shall we introduce anything similar, for example, dispatching based on --device aarch64-foo-bar?
>
> This requires developers to learn more than the bare minimum (A2), as users are required to learn the specifics of how to specify a Target in TVM. If an llvm Target supports -mcpu/-mattr/-mtriple then all of these fields can be reasonably expected to allow the Target to infer features given the relation to GCC/LLVM (A1).

Right. I agree. `TargetKind`-based dispatch is certainly the clearest approach as you mentioned (A2). The reason I'm thinking about (not 100% sure though) other directions is: what if different vendors (say Qualcomm and ARM) want to customize their own target parsing logic separately, while their `TargetKind` is the same `llvm`? Admittedly I don't have a perfect answer for this and would love to hear about your thoughts.

> Based on LLVM, the Features inferred from a given CPU/GPU profile can be extensive already, which would lead to a drastic increase in attrs if we were to track them this way. ... Therefore I would strongly advise following the pattern (as in LLVM) of using a separate structure for this, clear boundaries provide a much more straight forward developer experience (A2) and allow us to do things as @Lunderberg suggested like plucking just features or arch to give to Passes rather than the entire Target.

Thank you for detailing the challenges! I'm convinced that `target.features` may be the best way to go. A couple of more questions:
- What's the difference between `target.features` and `target.arch`? If they are the same thing, I would say `features` is certainly a better name
- IIUC we are not going to introduce 2-level of indirection like `target.arch.features`. Is that correct?
- Is it possible to completely replace `target.attrs` with more structured configuration? For example, `target.features` for C4 in the Target RFC, while `target.hardware_limits` for maximum shared memory, etc (just a hypothetical name, please feel free to suggest better ones)

> I believe the src/target/parsers folder would provide for maximum customisation given parsers can be re-used between TargetKinds and therefore serves as a good starting point? Also happy for an RFC in 6 months to move everything to a new folder if we re-factor the Targets 😸

Sounds good to me!

Overall, I think we are in broad agreement, and thanks for sharing all the context!

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1133541236
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
```C++
Map<String, ObjectRef> TargetNode::Export() const {
  Map<String, ObjectRef> result = {
      {"kind", this->kind->name},                        <------------
      {"tag", this->tag},
      {"keys", this->keys},
  };
  if (this->host.defined()) {
    result.Set("host", this->GetHost().value_or(Target())->Export());
  }
  for (const auto& kv : attrs) {
    result.Set(kv.first, kv.second);
  }
  return result;
}
```

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1185696517
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Christopher Sidebottom <no...@github.com.INVALID>.
@areusch I replied to your comments, could you take a look and see if it makes more sense to you? Then if we're all happy I'll do a final update on the text 😸 

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1167403744
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Tianqi Chen <no...@github.com.INVALID>.
Thanks folks for discussions. I think they summarizes to the following points

- Q0: Subfield grouping (e.g. features) or simply leave as top-level attrs
- Q1: Folder structure: `target/preprocessors/cuda.cc` vs `target/cuda/cuda_preprocessor.cc`
     - Note that code-reuse is less likely going to be an issue as there can be common includes among sub-folder structures.
- Q2: How to reconcile attributes specified already for (part of) subfield group, `llvm` can use `mattr` to specify the feature necessary. What happens to CUDA(e.g. we proposing to group hw related limitations to features), and how can we specify some of the fields(under the proposal will go into a subfield `feature`) during tag registration.


-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1133026002
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
I think it's a good idea overall.  I'm in favor of adding more internal flexibility to targets.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1125416129
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Christopher Sidebottom <no...@github.com.INVALID>.
> I like it overall, though I do have one potential concern: By making it easier to query the architecture compared to cross-architecture features, will developers more often use architecture-specific checks that unnecessarily limit TVM features to specific architectures?

Unfortunately, this RFC neither improves nor worsens the situation you describe, I think the only way of really solving how the information is used is to limit the information visible to these functions. `Target.current()` is global and you have access to it all right now (see: https://github.com/apache/tvm/blob/main/python/tvm/topi/arm_cpu/conv2d_int8.py#L174-L177). Ideally people would move to using `.arch.feature` or we only make the `.arch` property visible to the internals (rather than the full `Target`), as you suggested, but that compiler infrastructure change would be out of scope of this RFC.

Also, `.arch` may be misleading here, as it's a description of the architecture which means it'll include features such as `.arch.has_dsp`, enabling 2. and 3. based on those available features within the architecture. To further motivate this, the preprocessor can produce a `register_count` property on the `.arch` property if a specific CPU/attr/triple/etc is seen and that could be accessed via `my_target.arch.register_count`.

Based on my previous re-review of LLVM, thanks to @tqchen, it might help to use `my_target.features.dsp` rather than `my_target.arch.has_dsp` and clarifying these are `features` available to the `Target`? What do you think?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1130237913
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Andrew Reusch <no...@github.com.INVALID>.
Merged #71 into main.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#event-7112016296
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Christopher Sidebottom <no...@github.com.INVALID>.
@kparzysz-quic, sorry, I'm still not understanding here - the name of the kind is enough to look it up in the kind registry and gather any kind-specific information. What further details are missing? 

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1185744822
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] Add Target Pre-processing RFC (PR #71)

Posted by Andrew Reusch <no...@github.com.INVALID>.
@Mousius want to make the other suggested updates? then I'm good with this RFC

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/71#issuecomment-1193570733
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>