You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/18 15:08:52 UTC

[GitHub] [arrow-datafusion] andygrove opened a new pull request, #3880: Multiple optimizer passes

andygrove opened a new pull request, #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880

   # Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes https://github.com/apache/arrow-datafusion/issues/3879
   
    # Rationale for this change
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   We miss optimizations if we only run the rules once
   
   # What changes are included in this PR?
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   # Are there any user-facing changes?
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#discussion_r998392490


##########
benchmarks/src/bin/tpch.rs:
##########
@@ -665,6 +665,7 @@ mod tests {
     }
 
     #[tokio::test]
+    #[ignore] // https://github.com/apache/arrow-datafusion/issues/3881

Review Comment:
   I plan on fixing this before marking this PR as ready for review



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#discussion_r998391914


##########
benchmarks/expected-plans/q19.txt:
##########
@@ -3,7 +3,7 @@ Projection: SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) AS re
     Projection: lineitem.l_extendedprice, lineitem.l_discount
       Filter: part.p_brand = Utf8("Brand#12") AND part.p_container IN ([Utf8("SM CASE"), Utf8("SM BOX"), Utf8("SM PACK"), Utf8("SM PKG")]) AND lineitem.l_quantity >= Decimal128(Some(100),15,2) AND lineitem.l_quantity <= Decimal128(Some(1100),15,2) AND part.p_size <= Int32(5) OR part.p_brand = Utf8("Brand#23") AND part.p_container IN ([Utf8("MED BAG"), Utf8("MED BOX"), Utf8("MED PKG"), Utf8("MED PACK")]) AND lineitem.l_quantity >= Decimal128(Some(1000),15,2) AND lineitem.l_quantity <= Decimal128(Some(2000),15,2) AND part.p_size <= Int32(10) OR part.p_brand = Utf8("Brand#34") AND part.p_container IN ([Utf8("LG CASE"), Utf8("LG BOX"), Utf8("LG PACK"), Utf8("LG PKG")]) AND lineitem.l_quantity >= Decimal128(Some(2000),15,2) AND lineitem.l_quantity <= Decimal128(Some(3000),15,2) AND part.p_size <= Int32(15)
         Inner Join: lineitem.l_partkey = part.p_partkey
-          Filter: lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") AND lineitem.l_quantity >= Decimal128(Some(100),15,2) AND lineitem.l_quantity <= Decimal128(Some(1100),15,2) OR lineitem.l_quantity >= Decimal128(Some(1000),15,2) AND lineitem.l_quantity <= Decimal128(Some(2000),15,2) OR lineitem.l_quantity >= Decimal128(Some(2000),15,2) AND lineitem.l_quantity <= Decimal128(Some(3000),15,2)
+          Filter: lineitem.l_quantity >= Decimal128(Some(100),15,2) AND lineitem.l_quantity <= Decimal128(Some(1100),15,2) OR lineitem.l_quantity >= Decimal128(Some(1000),15,2) AND lineitem.l_quantity <= Decimal128(Some(2000),15,2) OR lineitem.l_quantity >= Decimal128(Some(2000),15,2) AND lineitem.l_quantity <= Decimal128(Some(3000),15,2) AND lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")

Review Comment:
   The order changed here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] isidentical commented on a diff in pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
isidentical commented on code in PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#discussion_r998491734


##########
datafusion/optimizer/src/optimizer.rs:
##########
@@ -189,38 +198,55 @@ impl Optimizer {
     where
         F: FnMut(&LogicalPlan, &dyn OptimizerRule),
     {
+        let mut plan_str = format!("{}", plan.display_indent());
         let mut new_plan = plan.clone();
-        log_plan("Optimizer input", plan);
+        let mut i = 0;
+        while i < optimizer_config.max_passes {
+            log_plan(&format!("Optimizer input (pass {})", i), &new_plan);
 
-        for rule in &self.rules {
-            let result = rule.optimize(&new_plan, optimizer_config);
-            match result {
-                Ok(plan) => {
-                    new_plan = plan;
-                    observer(&new_plan, rule.as_ref());
-                    log_plan(rule.name(), &new_plan);
-                }
-                Err(ref e) => {
-                    if optimizer_config.skip_failing_rules {
-                        // Note to future readers: if you see this warning it signals a
-                        // bug in the DataFusion optimizer. Please consider filing a ticket
-                        // https://github.com/apache/arrow-datafusion
-                        warn!(
+            for rule in &self.rules {
+                let result = rule.optimize(&new_plan, optimizer_config);
+                match result {
+                    Ok(plan) => {
+                        new_plan = plan;
+                        observer(&new_plan, rule.as_ref());
+                        log_plan(rule.name(), &new_plan);
+                    }
+                    Err(ref e) => {
+                        if optimizer_config.skip_failing_rules {
+                            // Note to future readers: if you see this warning it signals a
+                            // bug in the DataFusion optimizer. Please consider filing a ticket
+                            // https://github.com/apache/arrow-datafusion
+                            warn!(
                             "Skipping optimizer rule '{}' due to unexpected error: {}",
                             rule.name(),
                             e
                         );
-                    } else {
-                        return Err(DataFusionError::Internal(format!(
-                            "Optimizer rule '{}' failed due to unexpected error: {}",
-                            rule.name(),
-                            e
-                        )));
+                        } else {
+                            return Err(DataFusionError::Internal(format!(
+                                "Optimizer rule '{}' failed due to unexpected error: {}",
+                                rule.name(),
+                                e
+                            )));
+                        }
                     }
                 }
             }
+            log_plan(&format!("Optimized plan (pass {})", i), &new_plan);
+
+            // TODO this is an expensive way to see if the optimizer did anything and
+            // it would be better to change the OptimizerRule trait to return an Option
+            // instead
+            let new_plan_str = format!("{}", new_plan.display_indent());
+            if plan_str == new_plan_str {
+                // plan did not change, so no need to continue trying to optimize
+                debug!("optimizer pass {} did not make changes", i);
+                break;
+            }

Review Comment:
   +1! This is probably a follow-up item; but instead of having the optimizer decide on this (by returning an option), it might also make sense to compute a unique plan id (bottom up) so that we can also use this to detect optimization cycles. 
   
   A very basic example is (assuming each letter is a unique plan id) `A -> B -> C -> A -> B -> [max passes times more]`, where even though the previous plan is different from the current one we would still need to exit the loop. Having a unique id would mean we can just store a set somewhere and check against `if known_plans.contains(new_plan.id)` and it would break the loop.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#discussion_r998392185


##########
benchmarks/expected-plans/q19.txt:
##########
@@ -3,7 +3,7 @@ Projection: SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) AS re
     Projection: lineitem.l_extendedprice, lineitem.l_discount
       Filter: part.p_brand = Utf8("Brand#12") AND part.p_container IN ([Utf8("SM CASE"), Utf8("SM BOX"), Utf8("SM PACK"), Utf8("SM PKG")]) AND lineitem.l_quantity >= Decimal128(Some(100),15,2) AND lineitem.l_quantity <= Decimal128(Some(1100),15,2) AND part.p_size <= Int32(5) OR part.p_brand = Utf8("Brand#23") AND part.p_container IN ([Utf8("MED BAG"), Utf8("MED BOX"), Utf8("MED PKG"), Utf8("MED PACK")]) AND lineitem.l_quantity >= Decimal128(Some(1000),15,2) AND lineitem.l_quantity <= Decimal128(Some(2000),15,2) AND part.p_size <= Int32(10) OR part.p_brand = Utf8("Brand#34") AND part.p_container IN ([Utf8("LG CASE"), Utf8("LG BOX"), Utf8("LG PACK"), Utf8("LG PKG")]) AND lineitem.l_quantity >= Decimal128(Some(2000),15,2) AND lineitem.l_quantity <= Decimal128(Some(3000),15,2) AND part.p_size <= Int32(15)
         Inner Join: lineitem.l_partkey = part.p_partkey
-          Filter: lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") AND lineitem.l_quantity >= Decimal128(Some(100),15,2) AND lineitem.l_quantity <= Decimal128(Some(1100),15,2) OR lineitem.l_quantity >= Decimal128(Some(1000),15,2) AND lineitem.l_quantity <= Decimal128(Some(2000),15,2) OR lineitem.l_quantity >= Decimal128(Some(2000),15,2) AND lineitem.l_quantity <= Decimal128(Some(3000),15,2)
+          Filter: lineitem.l_quantity >= Decimal128(Some(100),15,2) AND lineitem.l_quantity <= Decimal128(Some(1100),15,2) OR lineitem.l_quantity >= Decimal128(Some(1000),15,2) AND lineitem.l_quantity <= Decimal128(Some(2000),15,2) OR lineitem.l_quantity >= Decimal128(Some(2000),15,2) AND lineitem.l_quantity <= Decimal128(Some(3000),15,2) AND lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")
             TableScan: lineitem projection=[l_partkey, l_quantity, l_extendedprice, l_discount, l_shipinstruct, l_shipmode]
-          Filter: part.p_size >= Int32(1) AND part.p_brand = Utf8("Brand#12") AND part.p_container IN ([Utf8("SM CASE"), Utf8("SM BOX"), Utf8("SM PACK"), Utf8("SM PKG")]) AND part.p_size <= Int32(5) OR part.p_brand = Utf8("Brand#23") AND part.p_container IN ([Utf8("MED BAG"), Utf8("MED BOX"), Utf8("MED PKG"), Utf8("MED PACK")]) AND part.p_size <= Int32(10) OR part.p_brand = Utf8("Brand#34") AND part.p_container IN ([Utf8("LG CASE"), Utf8("LG BOX"), Utf8("LG PACK"), Utf8("LG PKG")]) AND part.p_size <= Int32(15)
-            TableScan: part projection=[p_partkey, p_brand, p_size, p_container]
+          Filter: part.p_brand = Utf8("Brand#12") AND part.p_container IN ([Utf8("SM CASE"), Utf8("SM BOX"), Utf8("SM PACK"), Utf8("SM PKG")]) AND part.p_size <= Int32(5) OR part.p_brand = Utf8("Brand#23") AND part.p_container IN ([Utf8("MED BAG"), Utf8("MED BOX"), Utf8("MED PKG"), Utf8("MED PACK")]) AND part.p_size <= Int32(10) OR part.p_brand = Utf8("Brand#34") AND part.p_container IN ([Utf8("LG CASE"), Utf8("LG BOX"), Utf8("LG PACK"), Utf8("LG PKG")]) AND part.p_size <= Int32(15) AND part.p_size >= Int32(1)

Review Comment:
   The order changed here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#discussion_r998448651


##########
datafusion/optimizer/src/projection_push_down.rs:
##########
@@ -509,7 +509,25 @@ fn optimize_plan(
 
             from_plan(plan, &expr, &new_inputs)
         }
-    }
+    };
+
+    // when this rule is applied multiple times it will insert duplicate nested projections,
+    // so we catch this here
+    let with_dupe_projection_removed = match new_plan? {
+        LogicalPlan::Projection(p) => match p.input.as_ref() {
+            LogicalPlan::Projection(p2) if projection_equal(&p, p2) => {
+                LogicalPlan::Projection(p2.clone())
+            }
+            _ => LogicalPlan::Projection(p),
+        },
+        other => other,
+    };

Review Comment:
   This is the fix for https://github.com/apache/arrow-datafusion/issues/3881



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove merged pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove merged PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#discussion_r998369098


##########
datafusion/core/tests/sql/subqueries.rs:
##########
@@ -394,9 +394,9 @@ order by cntrycode;"#;
               Projection: AVG(customer.c_acctbal) AS __value, alias=__sq_1
                 Aggregate: groupBy=[[]], aggr=[[AVG(customer.c_acctbal)]]
                   Filter: customer.c_acctbal > Decimal128(Some(0),15,2) AND substr(customer.c_phone, Int64(1), Int64(2)) IN ([Utf8("13"), Utf8("31"), Utf8("23"), Utf8("29"), Utf8("30"), Utf8("18"), Utf8("17")])
-                    TableScan: customer projection=[c_phone, c_acctbal], partial_filters=[CAST(customer.c_acctbal AS Decimal128(30, 15)) > Decimal128(Some(0),30,15), substr(customer.c_phone, Int64(1), Int64(2)) IN ([Utf8("13"), Utf8("31"), Utf8("23"), Utf8("29"), Utf8("30"), Utf8("18"), Utf8("17")])]"#
+                    TableScan: customer projection=[c_phone, c_acctbal], partial_filters=[CAST(customer.c_acctbal AS Decimal128(30, 15)) > Decimal128(Some(0),30,15), substr(customer.c_phone, Int64(1), Int64(2)) IN ([Utf8("13"), Utf8("31"), Utf8("23"), Utf8("29"), Utf8("30"), Utf8("18"), Utf8("17")]), customer.c_acctbal > Decimal128(Some(0),15,2)]"#

Review Comment:
   Example of an optimization that we previously missed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#discussion_r998837473


##########
datafusion/optimizer/src/optimizer.rs:
##########
@@ -189,38 +198,55 @@ impl Optimizer {
     where
         F: FnMut(&LogicalPlan, &dyn OptimizerRule),
     {
+        let mut plan_str = format!("{}", plan.display_indent());
         let mut new_plan = plan.clone();
-        log_plan("Optimizer input", plan);
+        let mut i = 0;
+        while i < optimizer_config.max_passes {
+            log_plan(&format!("Optimizer input (pass {})", i), &new_plan);
 
-        for rule in &self.rules {
-            let result = rule.optimize(&new_plan, optimizer_config);
-            match result {
-                Ok(plan) => {
-                    new_plan = plan;
-                    observer(&new_plan, rule.as_ref());
-                    log_plan(rule.name(), &new_plan);
-                }
-                Err(ref e) => {
-                    if optimizer_config.skip_failing_rules {
-                        // Note to future readers: if you see this warning it signals a
-                        // bug in the DataFusion optimizer. Please consider filing a ticket
-                        // https://github.com/apache/arrow-datafusion
-                        warn!(
+            for rule in &self.rules {
+                let result = rule.optimize(&new_plan, optimizer_config);
+                match result {
+                    Ok(plan) => {
+                        new_plan = plan;
+                        observer(&new_plan, rule.as_ref());
+                        log_plan(rule.name(), &new_plan);
+                    }
+                    Err(ref e) => {
+                        if optimizer_config.skip_failing_rules {
+                            // Note to future readers: if you see this warning it signals a
+                            // bug in the DataFusion optimizer. Please consider filing a ticket
+                            // https://github.com/apache/arrow-datafusion
+                            warn!(
                             "Skipping optimizer rule '{}' due to unexpected error: {}",
                             rule.name(),
                             e
                         );
-                    } else {
-                        return Err(DataFusionError::Internal(format!(
-                            "Optimizer rule '{}' failed due to unexpected error: {}",
-                            rule.name(),
-                            e
-                        )));
+                        } else {
+                            return Err(DataFusionError::Internal(format!(
+                                "Optimizer rule '{}' failed due to unexpected error: {}",
+                                rule.name(),
+                                e
+                            )));
+                        }
                     }
                 }
             }
+            log_plan(&format!("Optimized plan (pass {})", i), &new_plan);
+
+            // TODO this is an expensive way to see if the optimizer did anything and
+            // it would be better to change the OptimizerRule trait to return an Option
+            // instead
+            let new_plan_str = format!("{}", new_plan.display_indent());
+            if plan_str == new_plan_str {
+                // plan did not change, so no need to continue trying to optimize
+                debug!("optimizer pass {} did not make changes", i);
+                break;
+            }

Review Comment:
   That's a great idea. Thanks @isidentical. I will write up an issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] Dandandan commented on pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
Dandandan commented on PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#issuecomment-1282740124

   It probably makes sense to have some information now how long the optimization takes & maybe provide a conservative number for the maximum optimization passes as @alamb hinted in an earlier PR
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove commented on code in PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#discussion_r999482681


##########
datafusion/optimizer/src/optimizer.rs:
##########
@@ -189,38 +198,55 @@ impl Optimizer {
     where
         F: FnMut(&LogicalPlan, &dyn OptimizerRule),
     {
+        let mut plan_str = format!("{}", plan.display_indent());
         let mut new_plan = plan.clone();
-        log_plan("Optimizer input", plan);
+        let mut i = 0;
+        while i < optimizer_config.max_passes {
+            log_plan(&format!("Optimizer input (pass {})", i), &new_plan);
 
-        for rule in &self.rules {
-            let result = rule.optimize(&new_plan, optimizer_config);
-            match result {
-                Ok(plan) => {
-                    new_plan = plan;
-                    observer(&new_plan, rule.as_ref());
-                    log_plan(rule.name(), &new_plan);
-                }
-                Err(ref e) => {
-                    if optimizer_config.skip_failing_rules {
-                        // Note to future readers: if you see this warning it signals a
-                        // bug in the DataFusion optimizer. Please consider filing a ticket
-                        // https://github.com/apache/arrow-datafusion
-                        warn!(
+            for rule in &self.rules {
+                let result = rule.optimize(&new_plan, optimizer_config);
+                match result {
+                    Ok(plan) => {
+                        new_plan = plan;
+                        observer(&new_plan, rule.as_ref());
+                        log_plan(rule.name(), &new_plan);
+                    }
+                    Err(ref e) => {
+                        if optimizer_config.skip_failing_rules {
+                            // Note to future readers: if you see this warning it signals a
+                            // bug in the DataFusion optimizer. Please consider filing a ticket
+                            // https://github.com/apache/arrow-datafusion
+                            warn!(
                             "Skipping optimizer rule '{}' due to unexpected error: {}",
                             rule.name(),
                             e
                         );
-                    } else {
-                        return Err(DataFusionError::Internal(format!(
-                            "Optimizer rule '{}' failed due to unexpected error: {}",
-                            rule.name(),
-                            e
-                        )));
+                        } else {
+                            return Err(DataFusionError::Internal(format!(
+                                "Optimizer rule '{}' failed due to unexpected error: {}",
+                                rule.name(),
+                                e
+                            )));
+                        }
                     }
                 }
             }
+            log_plan(&format!("Optimized plan (pass {})", i), &new_plan);
+
+            // TODO this is an expensive way to see if the optimizer did anything and
+            // it would be better to change the OptimizerRule trait to return an Option
+            // instead
+            let new_plan_str = format!("{}", new_plan.display_indent());
+            if plan_str == new_plan_str {
+                // plan did not change, so no need to continue trying to optimize
+                debug!("optimizer pass {} did not make changes", i);
+                break;
+            }

Review Comment:
   I filed https://github.com/apache/arrow-datafusion/issues/3892 to track this



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] andygrove commented on pull request #3880: Multiple optimizer passes

Posted by GitBox <gi...@apache.org>.
andygrove commented on PR #3880:
URL: https://github.com/apache/arrow-datafusion/pull/3880#issuecomment-1284040664

   > It probably makes sense to have some information now how long the optimization takes & maybe provide a conservative number for the maximum optimization passes as @alamb hinted in an earlier PR
   
   I reduced the default `max_passes` to 3 and added debug logging for optimizer elapsed time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org