Posted to dev@tvm.apache.org by Andrew Reusch via Apache TVM Discuss <no...@discuss.tvm.ai> on 2021/05/12 19:51:50 UTC

[Apache TVM Discuss] [Development/RFC] [RFC] [uTVM] Embedded C Runtime Interface


cc @MJKlaiber 

@Mousius thanks for splitting this off into another RFC. I agree implementing a low-overhead embedded interface is super important. A couple thoughts:

At a high level, it would be great to explicitly spell out the entire interface we expect to implement here. I think it might be useful to include an entire `main()` program (either here or perhaps linked as a branch if it's long) just to ensure we aren't leaving anything out.

### Runtime vs compile time knowledge

A key question we should tackle here is when model metadata should be available. Basically there are two scenarios:

S1. The user wants to use model metadata in the compilation flow.

S2. The user wants to write functions that make use of model metadata at runtime.

My opinion is we need to support both. So any metadata stored here, e.g. in a struct, should also be present in some JSON created as part of Model Library Format.
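As a sketch of what "both" could mean, the same metadata might be emitted twice: once as a generated C struct for runtime queries (S2), and once mirrored as JSON inside Model Library Format for compile-time tooling (S1). All names below are hypothetical:

```c
/* Hypothetical generated runtime metadata (S2). */
typedef struct {
  const char* model_name;
  int num_inputs;
  int num_outputs;
  int version_major;
} tvm_model_metadata_t;

static const tvm_model_metadata_t kModelMetadata = {
  .model_name = "my_model",
  .num_inputs = 2,
  .num_outputs = 1,
  .version_major = 1,
};

/* The compile-time mirror (S1) would live in a JSON file inside Model
 * Library Format, e.g.:
 *   {"model_name": "my_model", "num_inputs": 2, "num_outputs": 1}
 */
```

Keeping the two representations generated from one source of truth would avoid them drifting apart.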

### Model Input and Output Allocation

I think it'd be great to illustrate how we expect users to allocate model inputs and outputs. This is kind of there, but it would be great to propose the thing end-to-end. In particular, I'm curious how a user should size the tensors. One such possible sketch is to generate code like:
```
typedef struct {
    uint8_t input1[1 * 32 * 32 * 3];   // dimensions are examples
    int8_t input2[10 * 5 * 5 * 3];
} tvm_model_input_t;
```
This allows users with simple memory layout requirements to just declare the struct in the correct memory address space, and fill data as needed. It also serves as documentation-as-code of the required inputs and memory. We could move the buffer sizes to be constants, too. I want to ensure users retain control of all memory allocations, but we should design the API such that the typical case is very easy to use.
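Under those assumptions, usage might look like the following; the section placement, the `fill_inputs` helper, and the size constant are all illustrative:

```c
#include <stdint.h>
#include <string.h>

/* Repeats the sketched generated struct; dimensions are examples. */
typedef struct {
  uint8_t input1[1 * 32 * 32 * 3];
  int8_t  input2[10 * 5 * 5 * 3];
} tvm_model_input_t;

/* Buffer sizes exposed as constants, so callers can size their own
 * storage without instantiating the struct. */
#define TVM_MODEL_INPUT1_BYTES sizeof(((tvm_model_input_t*)0)->input1)

/* The user declares the struct in whatever section or address space
 * they control (a toolchain-specific section attribute could go here)
 * and fills the fields directly. */
static tvm_model_input_t g_inputs;

void fill_inputs(const uint8_t* camera_frame) {
  memcpy(g_inputs.input1, camera_frame, sizeof(g_inputs.input1));
}
```

Because both fields are single-byte types, `sizeof(tvm_model_input_t)` is exactly the sum of the buffer sizes, which makes total input memory easy to budget.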

### Custom-Workspace Compilation

I would take this a step further and ask if we can make the workspace size a `#define` constant such that the user could allocate the space at compile time. Or whether we expect this to live in the Model Library Format metadata as a means to access it at compile time. For example, instead of:

```
TVMSetWorkspaces(&context, malloc(TVMGetWorkspaceSize(model, 0)));
TVMExecute(&my_model, inputs, outputs, context);
```

I'd like people to be able to:
```
uint8_t g_workspace[TVM_MODEL_NAME_WORKSPACE_BYTES];

void main() {
  TVMSetWorkspaces(&context, g_workspace);
}
```

Finally, is it possible that whatever context is needed to identify the workspace could optionally live in flash? This has some benefits e.g. in simple deployment scenarios when the workspace is allocated as global memory. In this case, it's not possible to overwrite the context with invalid pointers, which is a class of bugs that can be hard to trace down on embedded platforms.
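A hypothetical sketch of that flash-resident layout: if the workspace pointers are known at link time, the whole context can be `const`-qualified so the toolchain places it in read-only memory. The macro and struct names below are assumptions:

```c
#include <stdint.h>

#define TVM_MODEL_NAME_WORKSPACE_BYTES 4096  /* assumed generated macro */

static uint8_t g_workspace[TVM_MODEL_NAME_WORKSPACE_BYTES];

/* const-qualified pointer table: workspace contents stay writable,
 * but the pointers themselves cannot be clobbered at runtime. */
typedef struct {
  void* const* workspace;
} tvm_context_t;

static void* const kWorkspacePtrs[] = { g_workspace };
static const tvm_context_t kContext = { .workspace = kWorkspacePtrs };
```

Since the initializers are link-time constants, no runtime `TVMSetWorkspaces` call is needed in this configuration at all.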

### Context

> Paired with the model descriptor, this provides any contextual information required to run the model, such as an application driven workspace configuration:
> 
> ```
> typedef struct {
> 	void** workspace; /** Pointers to different memory to use as a workspace */
> } TVMContext;
> ```

I'd like to avoid general-purpose structs if possible, at least at this phase of the implementation. While I think it's likely some top-level glue struct will eventually be a useful entry point for developers (and something is likely going to be needed as `resource_handle`), I think there are still quite a few things related to e.g. multi-core and accelerator dispatch yet to be decided. Rather than provide a sort of "kitchen sink" struct, I'd like to encourage us to define dedicated places for each orthogonal aspect of computing the model. I think it'd be great to make progress on the API in this RFC and tackle the accelerator dispatch question in a follow-on.

### Generated APIs vs function pointers

When considering how to write user-facing APIs, I think we have a couple of choices:

G1. Generate a function call table e.g. `TVMModel` and write wrapper functions around it.

G2. Generate a wrapper function with a standard interface (or perhaps a standard templated model interface).

Here, I'm not necessarily proposing to generate a wrapper function with model-specific signatures (though that has been proposed elsewhere). Instead, I am just wondering whether it's necessary to place the `entrypoint` function pointer in `TVMModel`. It seems like we may have some desire to generate model-specific C++ metadata outside of that generated by the AOT codegen, so I wonder if it's worth it to just build a small codegen dedicated to this user-facing API now. Doing this would also remove the need for "accessor" functions such as `TVMGetTVMVersionMajor`.
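To contrast G1 and G2, here is a hedged sketch; every name is illustrative, and the internal function is a stand-in for the generated AOT body:

```c
/* G1: descriptor struct carrying a function pointer the caller
 * dispatches through. */
typedef struct {
  int (*entrypoint)(const void* inputs, void* outputs);
} TVMModel;

/* G2: the codegen instead emits a plain wrapper with a fixed symbol,
 * calling the AOT entrypoint directly -- no pointer indirection, and
 * generated constants replace accessor functions such as
 * TVMGetTVMVersionMajor. */
#define TVM_VERSION_MAJOR 0  /* assumed generated constant */

static int tvmgen_my_model_run_internal(const void* in, void* out) {
  (void)in; (void)out;
  return 0;  /* stand-in for the generated AOT body */
}

int tvmgen_my_model_run(const void* inputs, void* outputs) {
  return tvmgen_my_model_run_internal(inputs, outputs);
}
```

Under G2 the linker can resolve everything statically, which tends to matter on embedded targets where indirect calls defeat inlining and cost flash for the table.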

### Accelerator binding

If possible, I'd like to defer this to a separate RFC. There are lots of questions to be answered here, and doing it properly would require reviewing a lifecycle diagram of the accelerator, so I think that discussion is better placed in its own RFC.





---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-utvm-embedded-c-runtime-interface/9951/3) to respond.
