You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tvm.apache.org by Florin Blanaru <no...@github.com.INVALID> on 2022/06/29 14:40:26 UTC

[apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

[Rendered](https://github.com/gigiblender/tvm-rfcs/blob/mangling-rfc/rfcs/0077_name_mangling_ir_modules.md)
You can view, comment on, or merge this pull request online at:

  https://github.com/apache/tvm-rfcs/pull/84

-- Commit Summary --

  * [RFC] RFC-name-mangling-ir-modules

-- File Changes --

    A rfcs/0077_name_mangling_ir_modules.md (80)

-- Patch Links --

https://github.com/apache/tvm-rfcs/pull/84.patch
https://github.com/apache/tvm-rfcs/pull/84.diff

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84
You are receiving this because you are subscribed to this thread.

Message ID: &lt;apache/tvm-rfcs/pull/84@github.com&gt;

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Florin Blanaru <no...@github.com.INVALID>.
> A couple of thoughts...
> 
>     * We should have a common facility that represents a scope in which names need to be unique.  Conceptually it could be a set of "known" strings (that are presumed used), and something (like a private counter) to generate a unique suffix.  The use would be `new_name = GenerateName("foo")`.  If `foo` had already been generated, `new_name` would be set to `"foo4"` (for example).  This should not produce global variables (or any IR objects), just names.
>
> 
>     * This RFC still doesn't settle the question whether "name_hint" is a final name or not.  I'm open to requiring that global symbols are given final names from the beginning, not hints.

Thanks for the comments @kparzysz-quic. I agree that we could have the `NameSupply` methods return Strings and make sure that `GlobalVars` are created using strings provided by a NameSupply (eventually one contained within the IRModule). This way, the `NameSupply` could be used for other name mangling scenarios where GlobalVars are not needed.

Regarding the `name_hint` being a final name. I am too of this opinion (it should be a final name) and believe that it already works this way. If I am not wrong, the only edge case is where we use the `global_symbol` attribute instead.

@mbs-octoml  @tqchen any thoughts? 


-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1173984996
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Andrew Reusch <no...@github.com.INVALID>.
@gigiblender is this ready for a last review?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1193602324
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Tianqi Chen <no...@github.com.INVALID>.
> Regarding the name_hint being a final name. I am too of this opinion (it should be a final name) and believe that it already works this way. If I am not wrong, the only edge case is where we use the global_symbol attribute instead.

We should define "final name". 
In this context, it is helpful to look at existing practices. This is a topic related to linking in most compilers, and they are closely related to the linkage type.

- For a function that have ExternalLinkage, the name **cannot** change, and whatever name being supplied is the final name, this is because the users of the library needs to be able to refer to the function using the provided name.
- For a function that have PrivateLinkage, the name can and should be able to change, this is because when we link multiple functions together, there can be two private functions that comes with the same name, as a result, the conflicted function can be renamed. Private functions are only referred within the module, so we can afford to rewrite all the callers of the function to the new name.
- During optimization, if a function with PrivateLinkage is not called by any functions, we can safely remove that function, but we cannot do that for ExternalLinkage.

There are more linkage types in LLVM but this two types roughly are sufficient for our discussion purposes.

Now come back to our context, we roughly need two things:
- N0: We need an IR node to uniquely identify a function, storing hints to the name, so new names can be generated base on the hint, but not necessarily pinning down the name (due to the need to support PrivateLinkage). That is `GlobalVar`. 
- N1: We need a way to uniquely specify the name, or specify that the linkage type of the function is external.


Note that N0 means we cannot simply enforce a `name`  or `name_hint` field to be final, it have to be a combination with another attribute that derives the linkage type. So our current convention is:

- When a function have `global_symbol` attribute, it specifies the "final name"(as in symbol in the symbol tabel), and indicate that the LinkageType is External
- When a function does not have `global_symbol` attribute, it means the LinkageType is private, at some time point, the compiler can choose to pick the final name, by attaching global symbol based on the `name_hint` of the function. But that should be deferred to as late as possible so we can perform optimizations like eliminate dead functions.





-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1178996339
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
`IRModule` already has functions to create new globals.  With the name-generating `NameSupply` added to the module, these functions could use that implicitly.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1172379948
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
A couple of thoughts...

- We should have a common facility that represents a scope in which names need to be unique.  Conceptually it could be a set of "known" strings (that are presumed used), and something (like a private counter) to generate a unique suffix.  The use would be `new_name = GenerateName("foo")`.  If `foo` had already been generated, `new_name` would be set to `"foo4"` (for example).  This should not produce global variables (or any IR objects), just names.
- This RFC still doesn't settle the question whether "name_hint" is a final name or not.  I'm open to requiring that global symbols are given final names from the beginning, not hints.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1171508847
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Florin Blanaru <no...@github.com.INVALID>.
Adding a [draft PR](https://github.com/apache/tvm/pull/12066) to this discussion. 



-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1181766176
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Tianqi Chen <no...@github.com.INVALID>.
> the purpose is not to codify it so that we can't change it, but just so that we are all on the same page here. perhaps we can incorporate definitions like this into the RFC text and into helper functions?

I agree

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1181697337
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Andrew Reusch <no...@github.com.INVALID>.
we discussed this at the [community meeting](https://discuss.tvm.apache.org/t/next-tvm-community-meeting-july-13/13120/2) yesterday. here are notes:

- @areusch: one of suggestions raised was to make it possible to reconstruct a NameSupply from an IRModule. we might need to make it possible to initialize it from multiple IRmodules since we currently lower Relay to many separate IRModules in the compiler.
- @kparzysz-quic : what's the use of having multiple modules, unless they're built for separate devices. do we need `mod_name` as a field?
  - `mod_name` is not to distinguish different IRModules at compile time; more to distinguish separate models at runtime via namespace prefix.
- @kparzysz-quic : what is the connection between the name that we attach to a global and the name that actually ends up in an object file? there is a strong need to have a degree of control over the names that end up in the object file and their visibility. how are we addressing this need? i think there is a lack of this in TVM now. we don't have a way
  - @tqchen: good point. we should document the linkage type. right now there is an implicit convention: the `name_hint` is not final (it's just an internal-to-compiler concept) and there's a GlobalVar which enforces that there is external linkage type with a fixed name. e.g. if we are splitting the compilation flow, then there is an expectation that when calling into a symbol, the name won't change. another rule: if a function has attribute `kGlobalSymbol`, you cannot change it or even delete it.
  - @manupa-arm : when we say we want to have control: do we want this to extend to Relay? if a Relay module contains a GlobalVar, should that be emitted with exactly that name?
    - @tqchen : if an IRModule has a function with attr `kGlobalSymbol` we should respect that. but as a rule of thumb we should delay attaching this attr til as late of possible, so we should try to avoid that symbol.
    - @areusch : we also don't have this ability in scheduling; it would be hard to attach that attribute to a Relay function e.g. main because we prefix lowered functions
    - @tqchen: we could special-case an AOT entry function when the compiler takes special care around a symbol.
    - filed https://github.com/apache/tvm/issues/12098 to document the convention.
 - @areusch some context for this RFC: in the C++ runtime, we previously mostly cared that a GlobalVar name could be passed to Module.GetFunction() and then that function would find that implemented PrimFunc. When we implemented AOT in the C runtime, we changed to expecting to be able to directly call those names as C symbols. This caused a bit of a problem when we moved to implement both the C and C++ AOTExecutor wrappers, where metadata lookups could be different because of name mangling. To solve that, we intended to move in a direction where the names of all symbols were fixed at birth, thus NameSupply.
   - @kparzysz-quic: who is typically the caller of these functions?
   - @areusch typically an executor
   - @kparzysz-quic can we modify the names of the global? shouldn't matter so long as everyone agrees


-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1184817996
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Tianqi Chen <no...@github.com.INVALID>.
Thanks @gigiblender ! a few points for consideration:

- Ideally NameSupply should interact well with our current linkage spec, e.g. if a function have an attribute `global_symbol`, it means the name is "final"(because external users are expecting to look it up with that name), likely we need to have global. Anything that does not have a `global_symbol` attr can subject to change.
- Would be useful to initialize a NameSupply from an existing IRModule, so it does not need to be carried through out passes and is only needed for pass that regenerate global vars.

In addition to that, it would be really nice to get some smart naming resolve mechanism.

For example, it would be really nice to have the following effect (by parsing the suffix integer separately from prefix string

```python
x = get_next_name("fun12", existing_names=["func12", "func13", "func4"])
assert x == "func14"
```

This is to avoid the case where we have "func12_0_0_0_0" when multiple set of prefixes are called and they usually looks confusion.





-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1170277851
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Andrew Reusch <no...@github.com.INVALID>.
Merged #84 into main.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#event-7058767959
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Andrew Reusch <no...@github.com.INVALID>.
@tqchen i think one goal we should have with this RFC is to codify things like
> When a function does not have global_symbol attribute, it means the LinkageType is private

the purpose is not to codify it so that we can't change it, but just so that we are all on the same page here. perhaps we can incorporate definitions like this into the RFC text and into helper functions?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1180909320
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Andrew Reusch <no...@github.com.INVALID>.
i've put this RFC on the meeting agenda for this week's Community Meeting

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1180909624
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Andrew Reusch <no...@github.com.INVALID>.
thanks @gigiblender , the RFC has been accepted! please create a [tracking issue](github.com/apache/tvm/issues) and mention this PR in that issue to link it.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1194446926
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Florin Blanaru <no...@github.com.INVALID>.
> @gigiblender is this ready for a last review?

@areusch yes

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1193604132
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Florin Blanaru <no...@github.com.INVALID>.
> Thanks @gigiblender ! a few points for consideration:
> 
>     * Ideally NameSupply should interact well with our current linkage spec, e.g. if a function have an attribute `global_symbol`, it means the name is "final"(because external users are expecting to look it up with that name), likely we need to have global. Anything that does not have a `global_symbol` attr can subject to change.
> 
>     * Would be useful to initialize a NameSupply from an existing IRModule, so it does not need to be carried through out passes and is only needed for pass that regenerate global vars.
> 
> 
> In addition to that, it would be really nice to get some smart naming resolve mechanism.
> 
> For example, it would be really nice to have the following effect (by parsing the suffix integer separately from prefix string
> 
> ```python
> x = get_next_name("fun12", existing_names=["func12", "func13", "func4"])
> assert x == "func14"
> ```
> 
> This is to avoid the case where we have "func12_0_0_0_0" when multiple set of prefixes are called and they usually looks confusion.

Thank you for feedback @tqchen! I agree with your suggestions and amended the text with further clarifications. 

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1171207368
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Name mangling in IRModules (PR #84)

Posted by Andrew Reusch <no...@github.com.INVALID>.
cc @mbs-octoml @Mousius @manupa-arm @MichaelJKlaiber @grant-arm @kparzysz-quic 

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/84#issuecomment-1171465597
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>