You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/03/18 16:23:07 UTC

[GitHub] [arrow-datafusion] alamb opened a new pull request, #5638: Minor: reduce cloneing in `infer_placeholder_types`

alamb opened a new pull request, #5638:
URL: https://github.com/apache/arrow-datafusion/pull/5638

   # Which issue does this PR close?
   
   related to faster planning: https://github.com/apache/arrow-datafusion/issues/5637
   related to like inference: https://github.com/apache/arrow-datafusion/issues/5617
   
   # Rationale for this change
   
   This function has a bunch of unecessary `clone()` which likely means running parameter type inference can't be done in allout-auto-activation
   
   # What changes are included in this PR?
   
   Avoids a bunch of expr copies when inferring placeholder types, using less code
   
   
   # Are these changes tested?
   Existing tests cover this functionality
   
   # Are there any user-facing changes?
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] avantgardnerio commented on a diff in pull request #5638: Minor: reduce cloneing in `infer_placeholder_types`

Posted by "avantgardnerio (via GitHub)" <gi...@apache.org>.
avantgardnerio commented on code in PR #5638:
URL: https://github.com/apache/arrow-datafusion/pull/5638#discussion_r1141396158


##########
datafusion/common/src/error.rs:
##########
@@ -417,6 +414,11 @@ impl DataFusionError {
         // return last checkpoint (which may be the original error)
         last_datafusion_error
     }
+
+    /// wraps self in Self::Context with a description
+    pub fn context(self, description: impl Into<String>) -> Self {

Review Comment:
   Yay, thank you! I'm happy to see this being used and improved.



##########
datafusion/sql/src/expr/mod.rs:
##########
@@ -499,36 +499,33 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
     }
 }
 
-/// Find all `PlaceHolder` tokens in a logical plan, and try to infer their type from context
-fn infer_placeholder_types(expr: Expr, schema: DFSchema) -> Result<Expr> {
-    rewrite_expr(expr, |expr| {
-        let expr = match expr {
-            Expr::BinaryExpr(BinaryExpr { left, op, right }) => {
-                let left = (*left).clone();
-                let right = (*right).clone();
-                let lt = left.get_type(&schema);
-                let rt = right.get_type(&schema);
-                let left = match (&left, rt) {
-                    (Expr::Placeholder { id, data_type }, Ok(dt)) => Expr::Placeholder {
-                        id: id.clone(),
-                        data_type: Some(data_type.clone().unwrap_or(dt)),
-                    },
-                    _ => left.clone(),
-                };
-                let right = match (&right, lt) {
-                    (Expr::Placeholder { id, data_type }, Ok(dt)) => Expr::Placeholder {
-                        id: id.clone(),
-                        data_type: Some(data_type.clone().unwrap_or(dt)),
-                    },
-                    _ => right.clone(),
-                };
-                Expr::BinaryExpr(BinaryExpr {
-                    left: Box::new(left),
-                    op,
-                    right: Box::new(right),
-                })
+// modifies expr if it is a placeholder with datatype of right
+fn rewrite_placeholder(expr: &mut Expr, other: &Expr, schema: &DFSchema) -> Result<()> {
+    if let Expr::Placeholder { id: _, data_type } = expr {
+        if data_type.is_none() {
+            let other_dt = other.get_type(schema);
+            match other_dt {
+                Err(e) => {
+                    return Err(e.context(format!(
+                        "Can not find type of {other} needed to infer type of {expr}"
+                    )))?;
+                }
+                Ok(dt) => {
+                    *data_type = Some(dt);
+                }
             }
-            _ => expr.clone(),
+        };
+    }
+    Ok(())
+}
+
+/// Find all [`Expr::PlaceHolder`] tokens in a logical plan, and try to infer their type from context
+fn infer_placeholder_types(expr: Expr, schema: &DFSchema) -> Result<Expr> {
+    rewrite_expr(expr, |mut expr| {
+        // Default to assuming the arguments are the same type
+        if let Expr::BinaryExpr(BinaryExpr { left, op: _, right }) = &mut expr {
+            rewrite_placeholder(left.as_mut(), right.as_ref(), schema)?;
+            rewrite_placeholder(right.as_mut(), left.as_ref(), schema)?;

Review Comment:
   Nice! :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] comphead commented on a diff in pull request #5638: Minor: reduce cloneing in `infer_placeholder_types`

Posted by "comphead (via GitHub)" <gi...@apache.org>.
comphead commented on code in PR #5638:
URL: https://github.com/apache/arrow-datafusion/pull/5638#discussion_r1141068734


##########
datafusion/sql/src/expr/mod.rs:
##########
@@ -499,36 +499,33 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
     }
 }
 
-/// Find all `PlaceHolder` tokens in a logical plan, and try to infer their type from context
-fn infer_placeholder_types(expr: Expr, schema: DFSchema) -> Result<Expr> {
-    rewrite_expr(expr, |expr| {
-        let expr = match expr {
-            Expr::BinaryExpr(BinaryExpr { left, op, right }) => {
-                let left = (*left).clone();
-                let right = (*right).clone();
-                let lt = left.get_type(&schema);
-                let rt = right.get_type(&schema);
-                let left = match (&left, rt) {
-                    (Expr::Placeholder { id, data_type }, Ok(dt)) => Expr::Placeholder {
-                        id: id.clone(),
-                        data_type: Some(data_type.clone().unwrap_or(dt)),
-                    },
-                    _ => left.clone(),
-                };
-                let right = match (&right, lt) {
-                    (Expr::Placeholder { id, data_type }, Ok(dt)) => Expr::Placeholder {
-                        id: id.clone(),
-                        data_type: Some(data_type.clone().unwrap_or(dt)),
-                    },
-                    _ => right.clone(),
-                };
-                Expr::BinaryExpr(BinaryExpr {
-                    left: Box::new(left),
-                    op,
-                    right: Box::new(right),
-                })
+// modifies expr if it is a placeholder with datatype of right
+fn rewrite_placeholder(expr: &mut Expr, other: &Expr, schema: &DFSchema) -> Result<()> {
+    if let Expr::Placeholder { id: _, data_type } = expr {
+        if data_type.is_none() {
+            let other_dt = other.get_type(schema);
+            match other_dt {
+                Err(e) => {
+                    return Err(e.context(format!(
+                        "Can not find type of {other} needed to infer type of {expr}"
+                    )))?;
+                }
+                Ok(dt) => {
+                    *data_type = Some(dt);
+                }
             }
-            _ => expr.clone(),
+        };
+    }
+    Ok(())
+}
+
+/// Find all [`Expr::PlaceHolder`] tokens in a logical plan, and try to infer their type from context
+fn infer_placeholder_types(expr: Expr, schema: &DFSchema) -> Result<Expr> {
+    rewrite_expr(expr, |mut expr| {
+        // Default to assuming the arguments are the same type
+        if let Expr::BinaryExpr(BinaryExpr { left, op: _, right }) = &mut expr {
+            rewrite_placeholder(left.as_mut(), right.as_ref(), schema)?;
+            rewrite_placeholder(right.as_mut(), left.as_ref(), schema)?;

Review Comment:
   Much more concise!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5638: Minor: reduce cloneing in `infer_placeholder_types`

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5638:
URL: https://github.com/apache/arrow-datafusion/pull/5638#discussion_r1141047895


##########
datafusion/sql/src/expr/mod.rs:
##########
@@ -78,7 +78,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
         let mut expr = self.sql_expr_to_logical_expr(sql, schema, planner_context)?;
         expr = self.rewrite_partial_qualifier(expr, schema);
         self.validate_schema_satisfies_exprs(schema, &[expr.clone()])?;
-        let expr = infer_placeholder_types(expr, schema.clone())?;

Review Comment:
   This cloned the schema as well as cloning all BinaryExprs



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #5638: Minor: reduce cloning in `infer_placeholder_types`

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on code in PR #5638:
URL: https://github.com/apache/arrow-datafusion/pull/5638#discussion_r1141975083


##########
datafusion/common/src/error.rs:
##########
@@ -417,6 +414,11 @@ impl DataFusionError {
         // return last checkpoint (which may be the original error)
         last_datafusion_error
     }
+
+    /// wraps self in Self::Context with a description
+    pub fn context(self, description: impl Into<String>) -> Self {

Review Comment:
   Yeah, it is really nice -- thanks for adding it!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] alamb merged pull request #5638: Minor: reduce cloning in `infer_placeholder_types`

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb merged PR #5638:
URL: https://github.com/apache/arrow-datafusion/pull/5638


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org