You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/21 20:22:36 UTC

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #764: #723 limit pruning rule to simple expression

alamb commented on a change in pull request #764:
URL: https://github.com/apache/arrow-datafusion/pull/764#discussion_r674312486



##########
File path: datafusion/src/physical_optimizer/pruning.rs
##########
@@ -399,40 +406,63 @@ impl<'a> PruningExpressionBuilder<'a> {
         Ok(Self {
             column,
             column_expr,
+            op: correct_operator,
             scalar_expr,
             field,
             required_columns,
-            reverse_operator,
         })
     }
 
-    fn correct_operator(&self, op: Operator) -> Operator {
-        if !self.reverse_operator {
-            return op;
-        }
-
-        match op {
-            Operator::Lt => Operator::Gt,
-            Operator::Gt => Operator::Lt,
-            Operator::LtEq => Operator::GtEq,
-            Operator::GtEq => Operator::LtEq,
-            _ => op,
-        }
+    fn op(&self) -> Operator {
+        self.op
     }
 
     fn scalar_expr(&self) -> &Expr {
-        self.scalar_expr
+        &self.scalar_expr
     }
 
     fn min_column_expr(&mut self) -> Result<Expr> {
         self.required_columns
-            .min_column_expr(&self.column, self.column_expr, self.field)
+            .min_column_expr(&self.column, &self.column_expr, self.field)
     }
 
     fn max_column_expr(&mut self) -> Result<Expr> {
         self.required_columns
-            .max_column_expr(&self.column, self.column_expr, self.field)
+            .max_column_expr(&self.column, &self.column_expr, self.field)
+    }
+}
+
+fn rewrite_expr_to_prunable(
+    column_expr: &Expr,
+    op: Operator,
+    scalar_expr: &Expr,
+) -> Result<(Expr, Operator, Expr)> {

Review comment:
       Thank you @lvheyang  -- given the current implementation gives incorrect results, I suggest we take a conservative here. A wise mentor of mine once told me "I can make it go as fast as you want if you don't constrain me to be correct"
   
   Thus, what I recommend is to update this PR so that it works for expr types we know work (e.g. just the ones you list above is more than adequate).
   
   Then we can expand the list of supported Expr types as future PRs. 
   
   If you try and special case as many Expr types as possible in this PR I think it is going to be quite challenging. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org