Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/09/01 09:11:09 UTC
[GitHub] [tvm-rfcs] manupa-arm commented on a change in pull request #23: [RFC][TIR] TIR Pinned Memory Representation

manupa-arm commented on a change in pull request #23:
URL: https://github.com/apache/tvm-rfcs/pull/23#discussion_r699345104



##########
File path: rfcs/0023-associating-allocate-nodes-with-pinned-memories.md
##########
@@ -0,0 +1,158 @@
+
+- Feature Name: TIR Pinned Memory Representation
+- Start Date: 2021-06-01
+- RFC PR: https://github.com/apache/tvm-rfcs/pull/23
+- GitHub Issue: TBD
+
+# 1. Summary
+
+This RFC proposes how pinned memories could be associated with tir.allocate nodes (and allocate_const nodes) and used by passes in the lowering process.
+
+# 2. Motivation 
+
+Currently, TVM relies on dynamic (alloc and free style) allocations at runtime to manage the intermediary memory used by operators and the network. This is sometimes undesirable, especially in microTVM.
+
+The current design of the [Unified Static Memory Planner (USMP)](https://github.com/apache/tvm-rfcs/pull/9) gives the user the option to provide buffers in which to place workspace and constant tensors.
+
+```
+    # The size given for a buffer is more of a hint/guide provided to USMP
+    tvmc compile my_model.tflite \
+    --executor=aot \
+    --target=accel,c \
+    --with-workspace-buffer="name=dtcm;target=c;size=1000" \
+    --with-workspace-buffer="name=sram;target=c,accel" \
+    --with-parameter-buffer="name=itcm;target=c;size=5000" \
+    --with-parameter-buffer="name=flash;target=c,accel"
+```
+
+```
+    // The User Application 
+        extern  const TVMModel my_model;
+        __attribute__((section( "ITCM" )))  const uint8_t my_model_params_1[TVM_MY_MODEL_ITCM_PARAMETER_BUFFER_SIZE] = <param_1_data>;
+        __attribute__((section( "FLASH" ), aligned( 16 )))  const uint8_t my_model_params_2[TVM_MY_MODEL_FLASH_PARAMETER_BUFFER_SIZE] = <param_2_data>;
+        __attribute__((section( "DTCM" )))  static uint8_t workspace_buffer_1[TVM_MY_MODEL_DTCM_WORKSPACE_BUFFER_SIZE];
+        __attribute__((section( "SRAM" ), aligned( 16 )))  static uint8_t workspace_buffer_2[TVM_MY_MODEL_SRAM_WORKSPACE_BUFFER_SIZE];
+
+    int main(...) {
+         ...
+         TVMContext context;
+         TVMInputs_my_model_1 inputs = {input};
+         TVMOutputs_my_model_1 outputs = {output};
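+         TVMWorkspaces_my_model workspaces = {
+             .sram = &workspace_buffer_2,  // placed in the "SRAM" section above
+             .dtcm = &workspace_buffer_1,  // placed in the "DTCM" section above
+         };
+         TVMParameters_my_model parameters = {
+             .flash = &my_model_params_2,  // placed in the "FLASH" section above
+             .itcm = &my_model_params_1    // placed in the "ITCM" section above
+         };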
+         TVMSetWorkspaces(&context, &workspaces);
+         TVMSetParameters(&context, &parameters);
+         TVMExecute(&my_model, &inputs, &outputs, &context);
+    }
+```
+
+Therefore, we need a way to represent, close to the allocate nodes in TIR, the association with each of the memories that the user will pin the buffers to.
+
+# 3. Guide-level explanation
+
+This is not particularly a user-facing feature.
+
+However, the intention here is to associate the name of the memory (along with other information) closely with the allocate IR node. Therefore, at the end of compilation, the metadata module (+ header) will generate pointer structs to be passed in from the application layer.
+
+ In the case where the user does not wish to pass in workspace or constant buffers, the metadata module will generate a workspace buffer and a constant buffer, as explained in U1 of [USMP](https://github.com/apache/tvm-rfcs/pull/9).
+
+ # 4. Reference-level explanation
+
+ At the IR level, we'll need to associate each allocate node with one (or more) memories that it can end up in, because the scheduling might be satisfied with placing buffers in any memory from a given set. Therefore, the scheduler might want the memory planner to decide which memory to use, based on finding an allocation that fits.

Review comment:
       Hi @csullivan,
   
   Let's say there are multiple physical memories that have the same bandwidth and latency, and the schedulers are fine with placing the buffer in any of them as long as everything fits. In that scenario, the scheduler would want to keep more than one candidate for the memory planner to decide between.
   
   However, based on the comments, I feel we are going to use two fields for these two purposes. See my latest response below. 
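
To make the scenario concrete, here is a minimal sketch of how a planner might pick among several candidate memories. It is not TVM code: `Memory`, `Allocate`, `candidate_memories`, and `plan` are hypothetical names used only for illustration, the first-fit strategy is an assumption (USMP may well use something else), and buffer lifetimes and reuse are ignored. The `dtcm` size of 1000 comes from the tvmc example above; the `sram` size of 4000 is made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Memory:
    """A physical memory the user may pin buffers to (hypothetical model)."""
    name: str
    size_bytes: int
    used_bytes: int = 0

    def fits(self, nbytes: int) -> bool:
        return self.used_bytes + nbytes <= self.size_bytes


@dataclass
class Allocate:
    """Stand-in for an allocate node annotated with its candidate memories."""
    buffer_name: str
    nbytes: int
    candidate_memories: List[str]


def plan(allocs: List[Allocate], memories: Dict[str, Memory]) -> Dict[str, Optional[str]]:
    """First-fit: place each buffer in the first candidate memory with room.

    A buffer maps to None when none of its candidates can hold it.
    """
    placement: Dict[str, Optional[str]] = {}
    for alloc in allocs:
        placement[alloc.buffer_name] = None
        for name in alloc.candidate_memories:
            mem = memories[name]
            if mem.fits(alloc.nbytes):
                mem.used_bytes += alloc.nbytes
                placement[alloc.buffer_name] = name
                break
    return placement


# Assumed sizes: dtcm from the tvmc example, sram invented for this sketch.
memories = {"dtcm": Memory("dtcm", 1000), "sram": Memory("sram", 4000)}
allocs = [
    Allocate("conv_out", 800, ["dtcm", "sram"]),  # either memory is acceptable
    Allocate("pool_out", 600, ["dtcm", "sram"]),  # either memory is acceptable
    Allocate("fc_out", 200, ["dtcm"]),            # scheduler insists on dtcm
]
placement = plan(allocs, memories)
print(placement)
```

Because `pool_out` lists two candidates, the planner is free to move it to `sram` once `dtcm` cannot hold it, which in turn leaves room for `fc_out`, the one buffer that accepts only `dtcm`. With a single candidate per node the planner would have no such freedom.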




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
