Posted to commits@pinot.apache.org by "walterddr (via GitHub)" <gi...@apache.org> on 2023/11/16 18:40:20 UTC

[I] [multistage][feature] RelDistribution-based optimization [pinot]

walterddr opened a new issue, #12015:
URL: https://github.com/apache/pinot/issues/12015

   Trackers
   ===
   - [ ] enable trait-based partition optimization for more accurate leaf-stage direct exchange/shuffle
   - [ ] extend current direct exchange/shuffle logic to the entire query plan 
   - [ ] revisit long-term distribution 
   
   Details
   ===
   
   Trait-based optimization
   ------
   Goals of the first enablement of trait-based optimization:
   
   1. ensure traits are propagated properly across the rel-tree
   2. ensure the leaf-stage optimization for direct exchange is correct --> previously it was applied blindly based on tableOptions
   3. ensure that no regression occurs for existing, correct direct exchange
   
   Extended partition optimization beyond leaf-stage
   ------
   In addition to handling the RelDistribution trait, we also need to handle physical information passed in via hints, specifically:
   
   - colocated join hint: ensure colocated join hints are enforced
       - when a colocated join hint exists and the tables are not colocated --> issue a warning or fail with an exception
       - when no colocated join hint is passed, colocation is best-effort, unless explicitly disabled via hint
   - partition key and function: data can be partitioned by the same key but not by the same function
       - ensure partition/colocation checks compare both the partition function and the partition key name
   
   - auto tableOptions
       - table partition key, size, and function can be read from tableConfigs
       - with RelDistribution and the partition/colocated hints, tableOptions are no longer needed inside the query
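
   The hint and partition checks listed above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Pinot's implementation: `ColocationCheck`, `PartitionInfo`, `ColocatedHint`, and both method names are hypothetical, and the function names in `main` are example strings only. The list leaves "warn or fail" open; the sketch picks "fail" for the enforced case.

   ```java
   /** Hypothetical sketch; these types are illustrative and not part of Pinot. */
   class ColocationCheck {

       /** Partition metadata as it might be read from a table config (assumed shape). */
       static final class PartitionInfo {
           final String column;      // partition key name
           final String function;    // partition function name
           final int numPartitions;  // partition size

           PartitionInfo(String column, String function, int numPartitions) {
               this.column = column;
               this.function = function;
               this.numPartitions = numPartitions;
           }
       }

       /** The three hint states described in the list above. */
       enum ColocatedHint { ENFORCED, UNSPECIFIED, DISABLED }

       /** Both sides must be partitioned on their join key, with the same function and size. */
       static boolean isCompatible(PartitionInfo left, String leftJoinKey,
                                   PartitionInfo right, String rightJoinKey) {
           return left.column.equals(leftJoinKey)
               && right.column.equals(rightJoinKey)
               && left.function.equalsIgnoreCase(right.function)
               && left.numPartitions == right.numPartitions;
       }

       /** Decides whether the join may skip the shuffle. */
       static boolean useColocatedJoin(ColocatedHint hint, boolean compatible) {
           switch (hint) {
               case DISABLED:
                   return false;
               case ENFORCED:
                   if (!compatible) {
                       // The issue allows "warn or fail"; this sketch chooses to fail.
                       throw new IllegalStateException(
                           "colocated join hint given, but tables are not colocated");
                   }
                   return true;
               default:
                   return compatible; // no hint: best effort
           }
       }

       public static void main(String[] args) {
           PartitionInfo orders = new PartitionInfo("userId", "Murmur", 8);
           PartitionInfo users = new PartitionInfo("id", "Modulo", 8);
           boolean ok = isCompatible(orders, "userId", users, "id");
           System.out.println(useColocatedJoin(ColocatedHint.UNSPECIFIED, ok));
       }
   }
   ```

   Comparing the function and partition count alongside the key name is what separates this from a key-only check: two tables partitioned on the same column with different functions would otherwise be treated as colocated.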
   
   Execute long-term optimization
   ------
   Several goals are listed below; they are open to further discussion.
   
   - stage parallelism 
       - direct exchanges are not currently tested against stage parallelism; they are all 1-to-1
       - should support per-partition fan-out
   - relDistribution trait insertion before exchange rules
       - the relDistribution trait can be derived for each node before exchange insertion, so an exchange is inserted only where the relDistribution produced by the input node differs from the relDistribution expected by the current node
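
   The "insert an exchange only where the traits differ" idea could look roughly like the sketch below. It is self-contained and deliberately does not use Calcite's `RelDistribution` API: the `ExchangeDecision` class, its simplified `Type` enum, and the `satisfies`/`needsExchange` methods are all hypothetical stand-ins, and the matching rule (exact type match, plus equal keys for hash distributions) is an assumption made for illustration.

   ```java
   import java.util.Arrays;
   import java.util.Collections;
   import java.util.List;

   /** Hypothetical sketch; a simplified stand-in for a distribution trait. */
   class ExchangeDecision {

       enum Type { SINGLETON, HASH, RANDOM, BROADCAST, ANY }

       static final class Distribution {
           final Type type;
           final List<Integer> keys; // column indexes, only meaningful for HASH

           Distribution(Type type, List<Integer> keys) {
               this.type = type;
               this.keys = keys;
           }

           static Distribution hash(Integer... keys) {
               return new Distribution(Type.HASH, Arrays.asList(keys));
           }

           static Distribution of(Type type) {
               return new Distribution(type, Collections.<Integer>emptyList());
           }
       }

       /** True when the input's actual distribution already meets the requirement. */
       static boolean satisfies(Distribution actual, Distribution required) {
           if (required.type == Type.ANY) {
               return true; // the node accepts any input distribution
           }
           if (actual.type != required.type) {
               return false;
           }
           // For hash distributions the keys must match as well (assumed rule).
           return actual.type != Type.HASH || actual.keys.equals(required.keys);
       }

       /** An exchange is needed only when actual and required traits differ. */
       static boolean needsExchange(Distribution actual, Distribution required) {
           return !satisfies(actual, required);
       }

       public static void main(String[] args) {
           Distribution scan = Distribution.hash(0);         // input partitioned on column 0
           Distribution joinRequires = Distribution.hash(0); // join keyed on column 0
           System.out.println(needsExchange(scan, joinRequires));
       }
   }
   ```

   With the traits computed up front, the planner decision reduces to a per-edge comparison, and matching edges are the ones that can use a direct exchange.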
   
   
   See #12012 for more discussion and the detailed implementation plan.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org