You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "jackwener (via GitHub)" <gi...@apache.org> on 2023/04/05 14:57:33 UTC

[GitHub] [arrow-datafusion] jackwener opened a new pull request, #5883: fix: type coercion for expr/subquery in InSubquery

jackwener opened a new pull request, #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883

   # Which issue does this PR close?
   
   Closes #.
   
   # Rationale for this change
   
   # What changes are included in this PR?
   
   # Are these changes tested?
   
    type coercion should do for expr/subquery.
   
   # Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #5883: fix: type coercion for expr/subquery in InSubquery

Posted by "liukun4515 (via GitHub)" <gi...@apache.org>.
liukun4515 commented on code in PR #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883#discussion_r1160386833


##########
datafusion/expr/src/expr_schema.rs:
##########
@@ -326,7 +317,12 @@ impl ExprSchemable for Expr {
     }
 }
 
-fn cast_subquery(subquery: Subquery, cast_to_type: &DataType) -> Result<Subquery> {
+/// cast subquery in InSubquery/ScalarSubquery to a given type.

Review Comment:
   I find this method is used in `scalar subquer`, but the comment is for Insubquery/scalarquery.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #5883: fix: type coercion for expr/subquery in InSubquery

Posted by "liukun4515 (via GitHub)" <gi...@apache.org>.
liukun4515 commented on code in PR #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883#discussion_r1160391649


##########
datafusion/optimizer/src/analyzer/type_coercion.rs:
##########
@@ -749,16 +754,30 @@ mod test {
     };
     use crate::test::assert_analyzed_plan_eq;
 
+    fn empty() -> Arc<LogicalPlan> {

Review Comment:
   👍



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #5883: fix: type coercion for expr/subquery in InSubquery

Posted by "liukun4515 (via GitHub)" <gi...@apache.org>.
liukun4515 commented on code in PR #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883#discussion_r1160387347


##########
datafusion/optimizer/src/analyzer/type_coercion.rs:
##########
@@ -147,20 +148,24 @@ impl TreeNodeRewriter for TypeCoercionRewriter {
                 subquery,
                 negated,
             } => {
-                let expr_type = expr.get_type(&self.schema)?;
                 let new_plan = analyze_internal(&self.schema, &subquery.subquery)?;
+                let expr_type = expr.get_type(&self.schema)?;
                 let subquery_type = new_plan.schema().field(0).data_type();
-                let expr = if &expr_type == subquery_type {
-                    expr
-                } else {
-                    Box::new(expr.cast_to(subquery_type, &self.schema)?)
+                let result_type = comparison_coercion(&expr_type, subquery_type).ok_or(DataFusionError::Plan(

Review Comment:
   maybe the common_type or coercion_type is better.
   the result type of `in subquery` or `comparation` is `boolean`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] jackwener merged pull request #5883: fix: type coercion for expr/subquery in InSubquery

Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener merged PR #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #5883: fix: type coercion for expr/subquery in InSubquery

Posted by "liukun4515 (via GitHub)" <gi...@apache.org>.
liukun4515 commented on code in PR #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883#discussion_r1160397695


##########
datafusion/optimizer/src/analyzer/type_coercion.rs:
##########
@@ -1428,33 +1404,47 @@ mod test {
     }
 
     #[test]
-    fn in_subquery() -> Result<()> {
-        let empty_inside = Arc::new(LogicalPlan::EmptyRelation(EmptyRelation {
-            produce_one_row: false,
-            schema: Arc::new(DFSchema::new_with_metadata(
-                vec![DFField::new_unqualified("a_int32", DataType::Int32, true)],
-                std::collections::HashMap::new(),
-            )?),
-        }));
-        let empty_outside = Arc::new(LogicalPlan::EmptyRelation(EmptyRelation {
-            produce_one_row: false,
-            schema: Arc::new(DFSchema::new_with_metadata(
-                vec![DFField::new_unqualified("a_int64", DataType::Int64, true)],
-                std::collections::HashMap::new(),
-            )?),
-        }));
+    fn in_subquery_cast_subquery() -> Result<()> {
+        let empty_int32 = empty_with_type(DataType::Int32);
+        let empty_int64 = empty_with_type(DataType::Int64);
+
+        let in_subquery_expr = Expr::InSubquery {
+            expr: Box::new(col("a")),
+            subquery: Subquery {
+                subquery: empty_int32,
+                outer_ref_columns: vec![],
+            },
+            negated: false,
+        };
+        let plan = LogicalPlan::Filter(Filter::try_new(in_subquery_expr, empty_int64)?);
+        // add cast for subquery
+        let expected = "\
+        Filter: a IN (<subquery>)\
+        \n  Subquery:\
+        \n    Projection: CAST(a AS Int64)\
+        \n      EmptyRelation\
+        \n  EmptyRelation";
+        assert_analyzed_plan_eq(Arc::new(TypeCoercion::new()), &plan, expected)?;
+        Ok(())
+    }
+
+    #[test]
+    fn in_subquery_cast_expr() -> Result<()> {

Review Comment:
   could you please add or replace the cast case for decimal data type?
   the coercion type or the common type is not same with `expr` and `subquery`, this case will both cast `expr` and `subquer` together.



##########
datafusion/optimizer/src/analyzer/type_coercion.rs:
##########
@@ -1428,33 +1404,47 @@ mod test {
     }
 
     #[test]
-    fn in_subquery() -> Result<()> {
-        let empty_inside = Arc::new(LogicalPlan::EmptyRelation(EmptyRelation {
-            produce_one_row: false,
-            schema: Arc::new(DFSchema::new_with_metadata(
-                vec![DFField::new_unqualified("a_int32", DataType::Int32, true)],
-                std::collections::HashMap::new(),
-            )?),
-        }));
-        let empty_outside = Arc::new(LogicalPlan::EmptyRelation(EmptyRelation {
-            produce_one_row: false,
-            schema: Arc::new(DFSchema::new_with_metadata(
-                vec![DFField::new_unqualified("a_int64", DataType::Int64, true)],
-                std::collections::HashMap::new(),
-            )?),
-        }));
+    fn in_subquery_cast_subquery() -> Result<()> {
+        let empty_int32 = empty_with_type(DataType::Int32);
+        let empty_int64 = empty_with_type(DataType::Int64);
+
+        let in_subquery_expr = Expr::InSubquery {
+            expr: Box::new(col("a")),
+            subquery: Subquery {
+                subquery: empty_int32,
+                outer_ref_columns: vec![],
+            },
+            negated: false,
+        };
+        let plan = LogicalPlan::Filter(Filter::try_new(in_subquery_expr, empty_int64)?);
+        // add cast for subquery
+        let expected = "\
+        Filter: a IN (<subquery>)\
+        \n  Subquery:\
+        \n    Projection: CAST(a AS Int64)\
+        \n      EmptyRelation\
+        \n  EmptyRelation";
+        assert_analyzed_plan_eq(Arc::new(TypeCoercion::new()), &plan, expected)?;
+        Ok(())
+    }
+
+    #[test]
+    fn in_subquery_cast_expr() -> Result<()> {

Review Comment:
   could you please add or replace the cast case for decimal data type?
   the coercion type or the common type is not same with `expr` and `subquery`, this case will both cast `expr` and `subquery` together.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] jackwener commented on pull request #5883: fix: type coercion for expr/subquery in InSubquery

Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener commented on PR #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883#issuecomment-1499919708

   thanks @liukun4515 , has resolved them


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #5883: fix: type coercion for expr/subquery in InSubquery

Posted by "liukun4515 (via GitHub)" <gi...@apache.org>.
liukun4515 commented on code in PR #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883#discussion_r1160391099


##########
datafusion/optimizer/src/analyzer/type_coercion.rs:
##########
@@ -147,20 +148,24 @@ impl TreeNodeRewriter for TypeCoercionRewriter {
                 subquery,
                 negated,
             } => {
-                let expr_type = expr.get_type(&self.schema)?;
                 let new_plan = analyze_internal(&self.schema, &subquery.subquery)?;
+                let expr_type = expr.get_type(&self.schema)?;
                 let subquery_type = new_plan.schema().field(0).data_type();
-                let expr = if &expr_type == subquery_type {
-                    expr
-                } else {
-                    Box::new(expr.cast_to(subquery_type, &self.schema)?)
+                let result_type = comparison_coercion(&expr_type, subquery_type).ok_or(DataFusionError::Plan(
+                    format!(
+                        "expr type {expr_type:?} can't cast to {subquery_type:?} in InSubquery"
+                    ),
+                ))?;
+                let new_subquery = Subquery {
+                    subquery: Arc::new(analyze_internal(
+                        &self.schema,
+                        &subquery.subquery,

Review Comment:
   can we use the `new_plan` which is generated by `analyze_internal`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #5883: fix: type coercion for expr/subquery in InSubquery

Posted by "liukun4515 (via GitHub)" <gi...@apache.org>.
liukun4515 commented on code in PR #5883:
URL: https://github.com/apache/arrow-datafusion/pull/5883#discussion_r1160386833


##########
datafusion/expr/src/expr_schema.rs:
##########
@@ -326,7 +317,12 @@ impl ExprSchemable for Expr {
     }
 }
 
-fn cast_subquery(subquery: Subquery, cast_to_type: &DataType) -> Result<Subquery> {
+/// cast subquery in InSubquery/ScalarSubquery to a given type.

Review Comment:
   I find this method is used in `scalar subquer`, but the comment is for Insubquery/scalarquery.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org