You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/09/22 18:50:28 UTC

[GitHub] [tvm-rfcs] areusch commented on pull request #34: Wiring up the PrimFunc resource_handle

areusch commented on pull request #34:
URL: https://github.com/apache/tvm-rfcs/pull/34#issuecomment-925204958


   okay I've done a bit more research on this one. Here is the thing causing tension here:
   - at present, `tir.call_packed` codegens to `TVMFuncCall` and `TVMFuncCall` does not in fact accept `resource_handle` as an argument. it is not possible to supply `resource_handle` to `tir.call_packed`.
   - however, that is not actually the ultimate goal of this PR. the ultimate goal of this PR is to supply `resource_handle` to `tir.call_cpacked`, which doesn't use `TVMFuncCall` and instead invokes the `TVMBackendPackedCFunc` directly.
   - this basically works in a very narrow set of use cases which meet all of these conditions:
       a) when we don't need `TVMFuncCall` because we can get `TVMBackendPackedCFunc*` directly
       b) when we know that we have the `resource_handle` that PackedFunc expects.
       c) when the C interface API to AOT is used and therefore AOT Executor has a `void* context` to be supplied to device kernel launch PackedFunc
   
   This really has quite a narrow use case: in the C runtime with the AOT executor and when the C interface API is used. Other use cases disqualify this usage pattern:
   - If the PackedFunc API is used we currently have no good way to pass the proposed `TVM<Model>_Devices` struct into AOTExecutor. We could pass it as OpaqueHandle if we had to.
   - If the C++ runtime is used, in _most_ cases we cannot get the `TVMBackendPackedCFunc*` directly and must operate with the opaque `TVMFunctionHandle` and `TVMFuncCall`
   - It is possible that within a given compiled AOT module, we could employ `tir.call_cpacked` to refer to other DSO-exportable functions directly. However, this wouldn't necessarily extend to non-DSO-exportable functions, which in any case typically gain the `context` closure from `sptr_to_self` and/or a `this` capture. Also, doing so would hit the linker limitations discussed in https://discuss.tvm.apache.org/t/introduce-artifact-a-container-for-generated-code/11099/2.
   
   Taken together, this means we need the following restrictions on this usage pattern:
   1. `-runtime` must be `c` due to the final bullet point above.
   2. `-executor` must be `aot`. 
   3. `-unpacked-api` must be `0`. When `-unpacked-api` is `1`, we just append this to the extern call args, correct?
   4. `-interface-api` is likely `c`. As discussed above, this _could_ work with `packed`, but the awkward OpaqueHandle dtype must be used.
   
   Considering this altogether, this is essentially an attempt to simplify the multi-`Module` late-bound import system for use with resource-constrained embedded devices. The key piece of missing information needed without such a system is the `context` pointer. I see two routes forward here:
   1. We add `resource_handle` to `tir.call_cpacked` and resolve the incongruity between usage for a device kernel function and the current usage in the C Runtime as `Module*`.
   2. We just lower the extern schedule to a CallNode whose `args` includes an OpaqueHandle at a position which the driver is aware of. For instance, following the current suggestion, such OpaqueHandle would be the final arg. The question here is how to indicate to the executor that such argument should be filled from the `TVMDevices` struct. This could be done either as an attr of the CallNode or by introducing a new `dtype` (I sort of favor the former, as `void*` is exactly OpaqueHandle).
   
   Since ultimately the driver must be written knowing where the "context" is going to be (either as `this` member or as `resource_handle`), the driver must also be designed around this choice. If portability between C and C++ runtimes is desired, it will need to choose an approach that works with both. While I appreciate that `Module` is generally overly bloated for many embedded applications, we do rely on it for the purposes of autotvm and providing the TVM RPC server. Moving away from it seems like it would add significant cognitive overhead to the design of microTVM. Given we may need `resource_handle` to be the `Module*` should `TVMBackendGetFuncFromEnv` need to be implemented on microTVM (admittedly, this means that the Module import mechanism is in use which is even more bloated), it kind of seems like we should avoid trying to repurpose `resource_handle` here and just adopt route 2 above.
   
   Ultimately at tension here is `TVMBackendGetFuncFromEnv` and the need to provide a less bloated `context` for use with the TVM C runtime. I'm not sold on this opinion, just writing down my thoughts. Perhaps others could weigh in here. Supporting Module's import mechanism on the C runtime is indeed considerable extra complexity for questionable benefit. And, this RFC is part of a larger solution which supplants the need for Module import. Perhaps the outcome of the Article RFC (e.g. decision on what the long-term expectations of the compiler are on the module load process) also play in here as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org