You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "jackwener (via GitHub)" <gi...@apache.org> on 2023/04/23 15:57:35 UTC

[GitHub] [arrow-datafusion] jackwener opened a new pull request, #6102: feat: avoid cast expr in Between as much as possible.

jackwener opened a new pull request, #6102:
URL: https://github.com/apache/arrow-datafusion/pull/6102

   # Which issue does this PR close?
   
   
   part of #5928.
   
   # Rationale for this change
   
   Part of job about type_coercion.
   
   # What changes are included in this PR?
   
   avoid cast expr in Between as much as possible.
   
   # Are these changes tested?
   
   <!--
   We typically require tests for all PRs in order to:
   1. Prevent the code from being accidentally broken by subsequent changes
   2. Serve as another way to document the expected behavior of the code
   
   If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
   -->
   
   # Are there any user-facing changes?
   
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] jackwener closed pull request #6102: enhance: avoid cast expr in Between as much as possible.

Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener closed pull request #6102: enhance: avoid cast expr in Between as much as possible.
URL: https://github.com/apache/arrow-datafusion/pull/6102


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] mingmwang commented on a diff in pull request #6102: enhance: avoid cast expr in Between as much as possible.

Posted by "mingmwang (via GitHub)" <gi...@apache.org>.
mingmwang commented on code in PR #6102:
URL: https://github.com/apache/arrow-datafusion/pull/6102#discussion_r1174890770


##########
datafusion/optimizer/src/analyzer/type_coercion.rs:
##########
@@ -312,49 +313,39 @@ impl TreeNodeRewriter for TypeCoercionRewriter {
             }) => {
                 let expr_type = expr.get_type(&self.schema)?;
                 let low_type = low.get_type(&self.schema)?;
-                let low_coerced_type = comparison_coercion(&expr_type, &low_type)
-                    .ok_or_else(|| {
-                        DataFusionError::Internal(format!(
-                            "Failed to coerce types {expr_type} and {low_type} in BETWEEN expression"
-                        ))
-                    })?;
                 let high_type = high.get_type(&self.schema)?;
-                let high_coerced_type = comparison_coercion(&expr_type, &low_type)
-                    .ok_or_else(|| {
-                        DataFusionError::Internal(format!(
-                            "Failed to coerce types {expr_type} and {high_type} in BETWEEN expression"
-                        ))
-                    })?;
-                let coercion_type =
-                    comparison_coercion(&low_coerced_type, &high_coerced_type)
-                        .ok_or_else(|| {
-                            DataFusionError::Internal(format!(
-                                "Failed to coerce types {expr_type} and {high_type} in BETWEEN expression"
-                            ))
-                        })?;
-                let expr = Expr::Between(Between::new(
-                    Box::new(expr.cast_to(&coercion_type, &self.schema)?),
-                    negated,
-                    Box::new(low.cast_to(&coercion_type, &self.schema)?),
-                    Box::new(high.cast_to(&coercion_type, &self.schema)?),
-                ));
-                Ok(expr)
+                let between_common_type = comparison_coercion(&low_type, &high_type).ok_or(DataFusionError::Internal(format!("Failed to coerce types {low_type} and {high_type} in BETWEEN expression")))?;
+                // Because low/high is Literal, cast them as much as possible to avoid cast expr.

Review Comment:
   Why the high and low are always `Literal`?  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] mingmwang commented on pull request #6102: enhance: avoid cast expr in Between as much as possible.

Posted by "mingmwang (via GitHub)" <gi...@apache.org>.
mingmwang commented on PR #6102:
URL: https://github.com/apache/arrow-datafusion/pull/6102#issuecomment-1519562955

   @jackwener 
   Could you please help to check the behavior of PG?
   
   In Spark 3.x, there is a configuration to control the String/Date type cast conversions.
   
   ```scala
     val LEGACY_CAST_DATETIME_TO_STRING =
       buildConf("spark.sql.legacy.typeCoercion.datetimeToString.enabled")
         .internal()
         .doc("If it is set to true, date/timestamp will cast to string in binary comparisons " +
           s"with String when ${ANSI_ENABLED.key} is false.")
         .version("3.0.0")
         .booleanConf
         .createWithDefault(false)
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] jackwener commented on pull request #6102: enhance: avoid cast expr in Between as much as possible.

Posted by "jackwener (via GitHub)" <gi...@apache.org>.
jackwener commented on PR #6102:
URL: https://github.com/apache/arrow-datafusion/pull/6102#issuecomment-1519943673

   @mingmwang thanks, this PR is wrong. I will fix it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org