Posted to github@arrow.apache.org by "metesynnada (via GitHub)" <gi...@apache.org> on 2023/04/03 15:45:07 UTC

[GitHub] [arrow-datafusion] metesynnada commented on pull request #5754: Improving optimizer performance by eliminating unnecessary sort and distribution passes, add more SymmetricHashJoin improvements

metesynnada commented on PR #5754:
URL: https://github.com/apache/arrow-datafusion/pull/5754#issuecomment-1494562080

   > Can we have a different physical optimizers list for the plans with/without unbounded sources?
   
   I think this would cause problems when a single query uses both bounded and unbounded sources.
   
   > And I think the bound/unbounded source should be an attribute or method for Source Operators only.
   
   Having each ExecutionPlan determine whether its input is unbounded, just as it already reports order and distribution information, seems to be the best strategy for unifying unbounded and bounded execution. This approach keeps concerns separated and lets each operator make a local, best-effort decision.
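
   To make the per-operator idea concrete, here is a minimal sketch. Every name in it (`PlanNode`, `Source`, `Filter`, `Sort`, `HashJoin`, `is_unbounded`, `mixed_join_example`) is a hypothetical stand-in for illustration, not DataFusion's actual API, and the rules for each operator are simplified assumptions:

```rust
// Hypothetical, simplified plan node. Not DataFusion's real ExecutionPlan trait.
trait PlanNode {
    fn children(&self) -> Vec<&dyn PlanNode>;
    // Given whether each child is unbounded, report whether this node's output
    // is unbounded, or return an error if the node cannot run on such input.
    fn unbounded_output(&self, children: &[bool]) -> Result<bool, String>;
}

struct Source { unbounded: bool }
struct Filter { input: Box<dyn PlanNode> }
struct Sort { input: Box<dyn PlanNode> }
// Assumption for this sketch: the left side is fully buffered (build side),
// the right side is streamed (probe side).
struct HashJoin { left: Box<dyn PlanNode>, right: Box<dyn PlanNode> }

impl PlanNode for Source {
    fn children(&self) -> Vec<&dyn PlanNode> { vec![] }
    fn unbounded_output(&self, _children: &[bool]) -> Result<bool, String> {
        Ok(self.unbounded)
    }
}

impl PlanNode for Filter {
    fn children(&self) -> Vec<&dyn PlanNode> { vec![&*self.input] }
    // A filter streams rows through, so it inherits its input's boundedness.
    fn unbounded_output(&self, children: &[bool]) -> Result<bool, String> {
        Ok(children[0])
    }
}

impl PlanNode for Sort {
    fn children(&self) -> Vec<&dyn PlanNode> { vec![&*self.input] }
    // A full sort must see all rows before emitting any, so it rejects
    // unbounded input.
    fn unbounded_output(&self, children: &[bool]) -> Result<bool, String> {
        if children[0] {
            Err("Sort cannot run on an unbounded input".to_string())
        } else {
            Ok(false)
        }
    }
}

impl PlanNode for HashJoin {
    fn children(&self) -> Vec<&dyn PlanNode> { vec![&*self.left, &*self.right] }
    // The build side must be bounded; the output follows the probe side.
    fn unbounded_output(&self, children: &[bool]) -> Result<bool, String> {
        if children[0] {
            Err("HashJoin build side cannot be unbounded".to_string())
        } else {
            Ok(children[1])
        }
    }
}

// Bottom-up walk: each node decides locally from its children's answers.
fn is_unbounded(node: &dyn PlanNode) -> Result<bool, String> {
    let flags = node
        .children()
        .into_iter()
        .map(|child| is_unbounded(child))
        .collect::<Result<Vec<bool>, String>>()?;
    node.unbounded_output(&flags)
}

// One query mixing a bounded build side with an unbounded, filtered probe side.
fn mixed_join_example() -> Result<bool, String> {
    let plan = HashJoin {
        left: Box::new(Source { unbounded: false }),
        right: Box::new(Filter { input: Box::new(Source { unbounded: true }) }),
    };
    is_unbounded(&plan)
}
```

   In this sketch no global switch is needed: the mixed query is accepted because each operator answers only for itself, while a `Sort` placed directly over an unbounded `Source` is rejected at that node.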
   
   Attempting to solve this problem globally may lead to inflexible design patterns and technical debt. With the per-operator approach, by contrast, we can already optimize and execute complex queries that combine unbounded and bounded sources, which makes it a robust solution.
   
   
   

