Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/05/12 12:55:49 UTC

[GitHub] [tvm] apeskov commented on pull request #11228: [QNN] Enable constant folding for QNN operations.

apeskov commented on PR #11228:
URL: https://github.com/apache/tvm/pull/11228#issuecomment-1124959258

   @manupa-arm 
   
   > Do you have a good reason why we need to compound this behaviour?
   
   In short, that is because of BYOC. @masahi answered this quite well in the previous discussion.
   
   I will try to explain in a bit more detail.
   
   In my particular case I have to know whether a tensor is constant before applying the "partition_for_xxx" pass. Imagine a device that can process the conv2d primitive only when the weights are constant. "Constant" here means that the weight data are available at the device initialisation step, so the device can apply some HW-specific transformations and copy the data into the proper HW-specific memory. Moreover, we do not know the type of weight transformation during TVM compilation, because it depends on the particular type of HW and the device state.
   
   So we have to partition the graph taking these requirements into account. The patterns may look like this:
   ```
   pat_1 = is_op("qnn.conv2d")(wildcard(), is_constant())  # Good. Strong requirement: weights must be constant
   pat_2 = is_op("qnn.conv2d")(wildcard(), wildcard())     # Bad. No restriction. Matches anywhere, with or without a const
   ```
   
   Pattern 'pat_2' is not suitable for our case because it treats the second argument as a regular var regardless of whether it is constant. The weight tensor would then be passed to the BYOC function as a regular argument of the Run() method, not of Init(). So we would like to use 'pat_1', as in the sketch below.
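   
   For illustration, a minimal sketch of how 'pat_1' could feed the usual BYOC partitioning sequence. The target name "my_device" and the composite label are made up; the passes are the standard TVM ones:
   
   ```
   import tvm
   from tvm import relay
   from tvm.relay.dataflow_pattern import is_op, wildcard, is_constant
   
   def pattern_table():
       # 'pat_1' spelled out: in Relay, qnn.conv2d also takes the zero
       # points and scales, covered here with extra wildcards.
       pat_1 = is_op("qnn.conv2d")(wildcard(), is_constant(), wildcard(),
                                   wildcard(), wildcard(), wildcard())
       return [("my_device.qnn_conv2d", pat_1)]
   
   def partition_for_my_device(mod):
       # Standard BYOC flow: fuse matched patterns into composite
       # functions, annotate them for the target, then partition.
       seq = tvm.transform.Sequential([
           relay.transform.MergeComposite(pattern_table()),
           relay.transform.AnnotateTarget("my_device"),
           relay.transform.PartitionGraph(),
       ])
       return seq(mod)
   ```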
   
   To support 'pat_1' we have to fold all constant subgraphs (like 'qnn.quantize(const_weight_fp32)') into real constants before applying the partitioner pass; otherwise the pattern will be skipped. Applying the legalization pass before constant folding decomposes 'qnn.conv2d' as well, so pattern 'pat_1' will not match anyway. In short, legalization + constant folding before partitioning does not help.
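   
   To make the failure concrete, a small repro sketch (shapes and quantization parameters are arbitrary): with the current FoldConstant, the 'qnn.quantize' around a constant fp32 weight survives, so 'is_constant()' cannot match it.
   
   ```
   import numpy as np
   import tvm
   from tvm import relay
   
   # Constant fp32 weight wrapped in qnn.quantize, as frontends produce it.
   w_fp32 = relay.const(np.random.rand(8, 3, 3, 3).astype("float32"))
   w_int8 = relay.qnn.op.quantize(w_fp32, relay.const(0.05), relay.const(0),
                                  out_dtype="int8")
   
   mod = tvm.IRModule.from_expr(relay.Function([], w_int8))
   mod = relay.transform.FoldConstant()(mod)
   # Without this PR the folder treats QNN ops as opaque, so the printed
   # module still contains qnn.quantize instead of a folded int8 constant.
   print(mod)
   ```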
   
   The shortest way I found is to conditionally decompose QNN primitives only for constant subgraphs, which is equivalent to adding QNN primitives to the constant folding pass. And I think that is the right direction.
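   
   Assuming this is exposed as an option on the existing pass (the `fold_qnn` flag below is my assumption about the spelling), the pre-partitioning pipeline would reduce to:
   
   ```
   from tvm import relay
   
   # Hypothetical pipeline: fold QNN constant subgraphs, then partition.
   # The fold_qnn flag is an assumption about how the option is exposed.
   mod = relay.transform.FoldConstant(fold_qnn=True)(mod)
   mod = partition_for_my_device(mod)  # from the sketch above
   ```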
   
   An alternative is to introduce one more pattern helper like `is_constant_subgraph()` and implement lazy initialisation on the BYOC side, but that looks slightly unnatural.
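   
   For comparison, such a helper could be approximated with a free-variable check (a rough sketch; `is_constant_subgraph_expr` is a made-up name), but the folding itself would then have to happen lazily in the BYOC runtime's Init(), which is the burden mentioned above:
   
   ```
   from tvm import relay
   
   def is_constant_subgraph_expr(expr):
       # Made-up helper: a subgraph is "constant" when it has no free
       # variables, i.e. every leaf is a relay.Constant.
       return len(relay.analysis.free_vars(expr)) == 0
   ```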

