You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/10 14:49:31 UTC
[GitHub] [arrow-datafusion] alamb commented on issue #1972: DataFusion Optimizer framework discussion
alamb commented on issue #1972:
URL: https://github.com/apache/arrow-datafusion/issues/1972#issuecomment-1064143265
Thank you for bringing this up @mingmwang. My view is that the current optimizer framework could definitely use improving 👍
> t is more a visitor model (like the old Presto optimizer rules).
I think I would say the current optimizer rules "aspire" to be a visitor model; Right now the interface for an optimizer looks like this:
```rust
pub trait OptimizerRule {
/// Rewrite `plan` to an optimized form
fn optimize(
&self,
plan: &LogicalPlan,
execution_props: &ExecutionProps,
) -> Result<LogicalPlan>;
/// A human readable name for this optimizer rule
fn name(&self) -> &str;
}
```
which as you say requires each optimizer rule to handle the recursion itself. There are helpers in https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/optimizer/utils.rs such as `optimize_children` and `from_plan` but I would say they are not particularly ergonomic
So what I had hoped to do at some point was to make an actual visitor / mutator pattern for rewriting `LogicalPlan`. We have done this for `Expr` rewriting [expr_rewriter.rs](https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/logical_plan/expr_rewriter.rs) and I think it works very well to separate out the structure traversal from the actual changes (see, for example, the code for simplifying expressions here: [simplify_expressions.rs](https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/optimizer/simplify_expressions.rs)
cc @Dandandan and @realno who may also have idea on this
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org