Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/17 01:40:14 UTC

[GitHub] [arrow-datafusion] xudong963 edited a comment on issue #1972: DataFusion Optimizer framework discussion

xudong963 edited a comment on issue #1972:
URL: https://github.com/apache/arrow-datafusion/issues/1972#issuecomment-1069816879


   > I'm planning to add the `volcano/cascades optimizer` framework for datafusion. After I get more familiar with datafusion, I will add more detail and RFC for this part.
   
   Cascades is an optimization framework that uses a cost-based approach to explore the possible execution plans
   in the search space. So I have some thoughts:
   1. It requires a solid cost-based optimization with cost estimates that are as accurate as possible.
   2. It is likely to be misguided, causing the search to converge locally and fail to produce an optimal solution.
   3. It will have a relatively large impact on the current codebase. (Of course, if it can be verified that there is a clear benefit in most scenarios, I support introducing it.)
   4. I prefer to continue experimenting with egg; @pjmore has done a lot of work in https://github.com/apache/arrow-datafusion/pull/1485. FYI: https://github.com/apache/arrow-datafusion/issues/440

