Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/24 17:12:56 UTC

[GitHub] [arrow] Dandandan commented on a change in pull request #9007: ARROW-11029: [Rust] [DataFusion] Improve join order optimization [WIP]

Dandandan commented on a change in pull request #9007:
URL: https://github.com/apache/arrow/pull/9007#discussion_r548630284



##########
File path: rust/datafusion/src/optimizer/hash_build_probe_order.rs
##########
@@ -55,7 +53,25 @@ fn get_num_rows(logical_plan: &LogicalPlan) -> Option<usize> {
             let num_rows_input = get_num_rows(input);
             num_rows_input.map(|rows| std::cmp::min(*limit, rows))
         }
-        _ => None,
+        LogicalPlan::Aggregate { .. } => {
+            // we cannot yet predict how many rows will be produced by an aggregate because
+            // we do not know the cardinality of the grouping keys
+            None
+        }
+        LogicalPlan::Filter { .. } => {
+            // we cannot yet predict how many rows will be produced by a filter because
+            // we don't know how selective it is (how many rows it will filter out)
+            None
+        }
+        _ => {

Review comment:
       Looks good. I wonder whether it would be better to match explicitly on every `LogicalPlan` variant instead of falling back to `_`, so that adding a new variant forces us to think about the implications here.
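
       To illustrate the suggestion, here is a minimal sketch. It assumes a simplified, hypothetical `Plan` enum with made-up variants and fields, not the real `LogicalPlan` definition. Because the match has no `_` arm, adding a variant to `Plan` makes `get_num_rows` fail to compile until the new case is handled.

```rust
/// Hypothetical, simplified stand-in for DataFusion's `LogicalPlan`;
/// the variants and fields here are assumptions for illustration only.
#[allow(dead_code)]
#[derive(Debug)]
enum Plan {
    Scan { num_rows: Option<usize> },
    Limit { n: usize, input: Box<Plan> },
    Filter { input: Box<Plan> },
    Aggregate { input: Box<Plan> },
}

/// Estimate the number of rows a plan produces, if it can be known.
/// Every variant is listed, so the compiler rejects this function
/// as non-exhaustive whenever a new variant is added to `Plan`.
fn get_num_rows(plan: &Plan) -> Option<usize> {
    match plan {
        Plan::Scan { num_rows } => *num_rows,
        Plan::Limit { n, input } => {
            // A limit can only reduce the row count of its input.
            get_num_rows(input).map(|rows| std::cmp::min(*n, rows))
        }
        // Filter selectivity and grouping-key cardinality are unknown.
        Plan::Filter { .. } | Plan::Aggregate { .. } => None,
    }
}
```

       The trade-off is a slightly longer match in exchange for a compile-time reminder; arms that share a result can still be grouped with `|`, as above.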



