You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/04 07:44:05 UTC

[GitHub] [arrow-datafusion] mingmwang commented on a diff in pull request #3912: Expression boundary analysis framework

mingmwang commented on code in PR #3912:
URL: https://github.com/apache/arrow-datafusion/pull/3912#discussion_r1013706030


##########
datafusion/physical-expr/src/expressions/binary.rs:
##########
@@ -665,87 +656,81 @@ impl PhysicalExpr for BinaryExpr {
             children[1].clone(),
         )))
     }
-}
-
-impl PartialEq<dyn Any> for BinaryExpr {
-    fn eq(&self, other: &dyn Any) -> bool {
-        down_cast_any_ref(other)
-            .downcast_ref::<Self>()
-            .map(|x| self.left.eq(&x.left) && self.op == x.op && self.right.eq(&x.right))
-            .unwrap_or(false)
-    }
-}
-
-struct BinaryExprStats {
-    op: Operator,
-    left: Arc<dyn PhysicalExpr>,
-    right: Arc<dyn PhysicalExpr>,
-}
 
-impl PhysicalExprStats for BinaryExprStats {
-    fn boundaries(&self, columns: &[ColumnStatistics]) -> Option<ExprBoundaries> {
+    /// Return the boundaries of this binary expression's result. If the expression itself
+    /// is a comparison which changes the boundaries of one of its inputs (a = 20 would pin
+    /// a to 20), then it might update the input's boundaries directly on the context.
+    fn analyze(&self, context: &mut AnalysisContext) -> Option<ExprBoundaries> {

Review Comment:
   > Since AnalysisContext is in terms of the "current schema" it will likely almost always end up being modified if the execution plan modifies the schema (which most do).
   > 
   > I find that the mix of "input" and "output" of a `mut &AnalysisContext` to be confusing to explain.
   > 
   > Since an `AnalysisContext` has `ExprBoundaries` in it anyways, I recommend the signature be
   > 
   > ```rust
   > fn analyze(&self) -> AnalysisContext {
   > ...
   > }
   > ```
   > 
   > And then pass each input can produce the context / boundary analysis for itself and the parent operators can combine them
   > 
   > Given that the inputs will produced an owned AnalysisContext it will be clear what is input and what is output and I think it can be made quite fast in Rust
   
   I would prefer this change also. It makes the signature more clear. It is similar to the `EquivalenceProperties` in the ExecutionPlan, mixing "input" and "output" is very confusing. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org