You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tvm.apache.org by Krzysztof Parzyszek <no...@github.com.INVALID> on 2022/06/27 22:16:01 UTC

[apache/tvm-rfcs] [RFC] Encapsulate LLVM target for use with LLVM libraries (PR #83)

You can view, comment on, or merge this pull request online at:

  https://github.com/apache/tvm-rfcs/pull/83

-- Commit Summary --

  * [RFC] Encapsulate LLVM target for use with LLVM libraries

-- File Changes --

    A rfcs/0080-llvm-target.md (174)

-- Patch Links --

https://github.com/apache/tvm-rfcs/pull/83.patch
https://github.com/apache/tvm-rfcs/pull/83.diff

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83
You are receiving this because you are subscribed to this thread.

Message ID: &lt;apache/tvm-rfcs/pull/83@github.com&gt;

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Tianqi Chen <no...@github.com.INVALID>.
Logically, there are two kinds of operations involved:
- K0: operations that deserializes/deserializes the data structure  (llvm::Module)
- K1: operations that transforms the llvm::Module

We are certainly in agreement that all K1 should be enclosed with option save/recovery scope. Parse/Save seems to belong to K0. From a logic reasoning pov, operations in K0 should not be impacted by cli opt as a result my comment.

But  I think @kparzysz-quic what you mean is that cl opt might breach into K0 as well. If that is really the case, it would indeed be helpful to take the setting/recover at the load/store level. Would be great to reason and confirm a bit. Since the ability to de-couple K0 and K1 is nice from the overall code structuring pov



-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1180419819
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Tianqi Chen <no...@github.com.INVALID>.
Merged #83 into main.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#event-7003735230
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
Both `LoadIR` or `ParseIR` take an optional pre-existing `LLVMContext`, and return a pair `{llvm::Module, LLVMScope}`, where the scope object in the pair is newly constructed, and contains the given `LLVMContext` (or create a new one if none was given).  In the first call to `ParseIR`, the caller can take the second element of the pair and just use it as the scope.  In a subsequent call, the caller can simply discard the second element, which will then be automatically destroyed.
An example of calling `ParseIR` or `LoadIR` when a LLVM scope already exists is in [`HandleImport` in codegen_llvm.cc](https://github.com/apache/tvm/pull/11933/files#diff-f61b04b100f5145f2681340c81d3f2af221239594ed01e2e24896522329ce92cR275).

You bring up a good point about when the scope "takes effect", or binds to LLVM.  I propose that the scope becomes active when a new `LLVMContext` is created (i.e. the scope is really determined by `LLVMContext`, not by the `LLVMScope`).  We could also define "takes effect" to mean that locks (or failure-to-create object) could be acquired.  That would mean that `ParseIR` and `LoadIR` return an active scope.  Also, if they are called with a context argument (meaning that they are called from an active scope), they wouldn't need to create a new context, and so they wouldn't block (if we were to add locks).  In such case they would return another scope object with the same `LLVMContext`.  This shouldn't cause any problems even if both scope objects were used, but could make the code confusing to read.  If we follow the convention that the "extra" scope is to be ignored, this would avoid the confusion.  There may be details to be sorted out about releasing locks, etc, but those are implementation details.

_Side note: Ideally, prior to the activation of the scope, none of LLVM code or data types would be exposed to the user, but this RFC doesn't try to go that far.  The main concern here is a platform on which saving/restoring LLVM flags could be done._

I also suggest to create two classes: `LLVMScope` and `LLVMTarget`.  The `LLVMTarget` would contain all the LLVM target flags and options, while `LLVMScope` would deal with activation, `ParseIR` and `LoadIR`.  The `LLVMTarget` object could be passed around, copied, etc (without concerns about locking or activating anything), and its purpose would be to have all the target flags/options in one place, and it could only exist once scope has been activated.

Something like
```C++
class LLVMTarget {
public:
  LLVMTarget(const LLVMTarget&);  // copy ctor, etc.

private:
  friend class LLVMScope;
  LLVMTarget(ctx, values extracted from tvm::Target);

  std::shared_ptr<llvm::LLVMContext> ctx_;    // set via the private ctor, passed from LLVMContext
  llvm::TargetMachine *tm;
  ...
};

class LLVMScope {
public:
  LLVMScope(const Target& target) {
    // Parse the target into individual attributes, e.g. mtriple, etc. This would allow users
    // to create LLVMScope from Target and modify it before creating LLVMTarget.
    //
    // If the target has non-empty list of LLVM flags, it would be considered "modifying"
    // the global state, otherwise it would be "read-only".  This would allow creating
    // multiple "read-only" concurrent scopes, but only one "modifying" one.
  }

  void activate() {
     ctx_ = std::shared_ptr<llvm::LLVMContext>(CreateNewContext(is_read_only));
  }

  static std::pair<llvm::Module, LLVMScope> ParseIR(std::string ir) {   // same for LoadIR
    auto ctx = std::shared_ptr<llvm::LLVMContext>(CreateNewContext(false /*assume modifying*/));
    llvm::Module m = llvm::parse_string_etc(ctx, ...)
    LLVMScope scp(ctx, get_target_string(m));   // create an already active scope
    return {m, scp};
  }

  LLVMTarget getTargetInfo() {
    ICHECK(ctx_ != nullptr) << "must activate first";
    ...
  }

private:
  static llvm::LLVMContext* CreateNewContext(bool is_read_only) {
     // Create context with appropriate locking
  }
 
  LLVMScope(std::shared_ptr<llvm::LLVMContext> ctx, const Target& target);
  std::shared_ptr<llvm::LLVMContext> ctx_;
};
```


-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1179203419
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Encapsulate LLVM target for use with LLVM libraries (PR #83)

Posted by Tianqi Chen <no...@github.com.INVALID>.
cc @junrushao1994 

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1170296100
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
> > The reason is that ParseIR needs to make calls to LLVM functions to create the llvm::Module.
> 
> I get this part, my main question is whether we can initializeLLVM in the Target constructor but do option resetting in enter/exit, i could miss something here.

I see what you mean.  My concern is that since command line options can be used almost everywhere in LLVM (not just optimizations or code generation), we should try to do the saving as early as possible (and restoring as late as possible).  Ideally all the LLVM calls in the program would be enclosed in the save/restore functions (even the bitcode reader registers a couple of command line flags).

At the same time we could make a deliberate decision to only apply the save/restore to a part of LLVM functionality, i.e. allow some kinds of LLVM function calls to happen outside of the save/restore scope.  Let me know what you think.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1180408342
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Tianqi Chen <no...@github.com.INVALID>.
In that case, I think it would be nice to state we handle K1, and leave ONLY serialization/deserialization out. This way we can have With style API, which indicate that we are explicit entering the scope for any processing.

```c++
auto data = LLVMTarget::ParseIR(file_name);
With<LLVMTarget> scope(mod_data);

``

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1181985883
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Encapsulate LLVM target for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
> my one question, which i think is mainly open-ended now, is how we handle LLVM arch that have unique flags that, right now, would show up in TVM as target-specific llvm flags.

I'd have to see a practical example of that to get an idea of what's expected.  For Hexagon we'd want to automatically append certain flags (that enable auto-vectorization in LLVM, for example).  I had one idea for that, but it may need a revision.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1175448669
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
> Just to followup, why does ParseIR require activation of a scope, or is it possible that ParseIR returns a tuple of Target and Module, where activation is done separately.

The reason is that `ParseIR` needs to make calls to LLVM functions to create the `llvm::Module`.


-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1179274109
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
Should we still wait for review from Junru?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1185759039
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
To address the second question: we shouldn't allow nesting of LLVM scopes.  The reason is that if the target doesn't specify any command line flags, the assumption is that it will rely on LLVM's defaults.  If we were to nest scopes, and the outer scope did modify flags, then the inner scope (without flags) would not have a way of knowing that the global state has been altered.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1179206731
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Encapsulate LLVM target for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
@tqchen @junrushao1994 Do you have any additional comments?

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1176373905
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Encapsulate LLVM target for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
[Rendered](https://github.com/apache/tvm-rfcs/blob/aba8fa123cd0de1e3a8d752f0b55741aa674a1b0/rfcs/0080-llvm-target.md)

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1167976257
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
It does, yes: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L91-L99

The flags above come from the latest development branch, so in order to handle all LLVM versions we'd need to check all of them since 4.0 (which I'm ok doing), and we will need to keep this code up to date with the latest developments in LLVM.

On the other hand, these aren't options that are very likely to be used by TVM users, and the issue with command line flags only really applies to deserialization (since serialization would happen after the flags had been saved).  I don't think it's unreasonable to state that we only handle (K1), but we'd need to ensure that (K1) also included creation of any data structures that assist in modifying or in analysis of llvm::Module.


-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1180438182
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Krzysztof Parzyszek <no...@github.com.INVALID>.
I updated the text of the RFC, the rendered link above, and the [prototype draft PR](https://github.com/apache/tvm/pull/11933).

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1183772746
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Tianqi Chen <no...@github.com.INVALID>.
> The reason is that ParseIR needs to make calls to LLVM functions to create the llvm::Module.

I get this part, my main question is whether we can initializeLLVM in the Target constructor but do option resetting in enter/exit, i could miss something here.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1179837365
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Tianqi Chen <no...@github.com.INVALID>.
Thanks @kparzysz-quic . Sorry for the delayed reply since we are taking break here.

Overall I like the direction we are going. Just to figure out the spectrum of possible APIs

My main question is how are we going to interact with multiple ParseIR calls. Some example would be helpful. For example, is it OK for us to have nested LLVMScope

```c++
void Example() {
     LLVMScope scope1(target);
     {
          // what is the effect here, seems mod_data1 is immediate in scope
          auto mod_data1 = LLVMScope::ParseIR(name);
     }
}
```

I also wonder if there is a way to "defer" the scope initialization. e.g. can LLVMModule be created, stored in the LLVMScope, but the target does not take in-effect until we enter the scope.  We call InitializeLLVM in the constructor of LLVMTarget, but do other things like option setting in the enter stage.Something like

```c++
void Example() {
     LLVMTarget target1(target);
     auto mod_data = LLVMTarget::ParseIR(name);
     // enter target1 scope
     With<LLVMTarget> scope1(target1);
     {
           // entering target in mod_data
          With<LLVMTarget> scope2(mod_data);
     }
}
```
I think the main question is how coupled the operations related to LLVM are.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1178983089
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Tianqi Chen <no...@github.com.INVALID>.
Just to followup, why does ParseIR require activation of a scope, or is it possible that ParseIR returns a tuple of Target and Module, where activation is done separately.

I am asking this mainly to see if we can get an explicit With style API

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1179234699
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>

Re: [apache/tvm-rfcs] [RFC] Create LLVM scope class for use with LLVM libraries (PR #83)

Posted by Tianqi Chen <no...@github.com.INVALID>.
I think we can go ahead and merge

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/83#issuecomment-1185770803
You are receiving this because you are subscribed to this thread.

Message ID: <ap...@github.com>