You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "wangyum (via GitHub)" <gi...@apache.org> on 2023/04/11 14:50:21 UTC

[GitHub] [spark] wangyum opened a new pull request, #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type

wangyum opened a new pull request, #40743:
URL: https://github.com/apache/spark/pull/40743

   ### What changes were proposed in this pull request?
   
   This PR excludes `EqualNullSafe` when unwrapping date type to timestamp type. This is because the unwrapped expression contains a UDF, which cannot pushed down to the data source:
   ```
   EqualNullSafe(Cast(ts, DateType), date) -> If(IsNull(ts), FalseLiteral, And(GreaterThanOrEqual(ts, Cast(date, TimestampType)), LessThan(ts, Cast(date + 1, TimestampType))))
   ```
   
   ### Why are the changes needed?
   
   Fix bug.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Update existing unit test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wangyum commented on a diff in pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on code in PR #40743:
URL: https://github.com/apache/spark/pull/40743#discussion_r1164229328


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparisonSuite.scala:
##########
@@ -392,9 +392,8 @@ class UnwrapCastInBinaryComparisonSuite extends PlanTest with ExpressionEvalHelp
       (f5 >= castTimestamp(dateLit) && f5 < castTimestamp(dateAddOne)) ||
         (f6 >= castTimestampNTZ(dateLit) && f6 < castTimestampNTZ(dateAddOne)))
     assertEquivalent(
-      castDate(f5) <=> dateLit || castDate(f6) === dateLit,
-      (f5 >= castTimestamp(dateLit) && f5 < castTimestamp(dateAddOne)) ||
-        (f6 >= castTimestampNTZ(dateLit) && f6 < castTimestampNTZ(dateAddOne)))
+      castDate(f5) <=> dateLit || castDate(f6) <=> dateLit,
+      (f5 >= castTimestamp(dateLit) && f5 < castTimestamp(dateAddOne)) || castDate(f6) <=> dateLit)

Review Comment:
   timestamp is non nullable and timestampNTZ is null able.
   `castDate(f5) <=> dateLit` test non nullable and `castDate(f6) <=> dateLit` test nullable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #40743:
URL: https://github.com/apache/spark/pull/40743#issuecomment-1504456010

   @wangyum I think we should mention two things:
   1. the current rewrite is incorrect for EqualNullSafe
   2. even if we fix the rewrite, it will contain `If` and can't be pushed down to the data source.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #40743:
URL: https://github.com/apache/spark/pull/40743#issuecomment-1505211706

   shall we still rewrite `EqualNullSafe` if column is not nullable?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40743:
URL: https://github.com/apache/spark/pull/40743#issuecomment-1504447656

   qq:
   
   > because the unwrapped expression contains a UDF
   
   Where is the UDF, and how it's related to UDF?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wangyum closed pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Only rewrite `EqualNullSafe` when the left side is non-nullable

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum closed pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Only rewrite `EqualNullSafe` when the left side is non-nullable
URL: https://github.com/apache/spark/pull/40743


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wangyum commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Only rewrite `EqualNullSafe` when the left side is non-nullable

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on PR #40743:
URL: https://github.com/apache/spark/pull/40743#issuecomment-1509442721

   Merged to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wangyum commented on pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on PR #40743:
URL: https://github.com/apache/spark/pull/40743#issuecomment-1504453830

   @HyukjinKwon The `EqualNullSafe(Cast(ts, DateType), date)` shoud rewrite to `If(IsNull(ts), FalseLiteral, And(GreaterThanOrEqual(ts, Cast(date, TimestampType)), LessThan(ts, Cast(date + 1, TimestampType))))`.  It contains `If`.  So there is no benefit if rewrite `EqualNullSafe(Cast(ts, DateType), date)`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #40743: [SPARK-42597][SQL][FOLLOW-UP] Exclude `EqualNullSafe` when unwrapping date type to timestamp type

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #40743:
URL: https://github.com/apache/spark/pull/40743#discussion_r1164214803


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparisonSuite.scala:
##########
@@ -392,9 +392,8 @@ class UnwrapCastInBinaryComparisonSuite extends PlanTest with ExpressionEvalHelp
       (f5 >= castTimestamp(dateLit) && f5 < castTimestamp(dateAddOne)) ||
         (f6 >= castTimestampNTZ(dateLit) && f6 < castTimestampNTZ(dateAddOne)))
     assertEquivalent(
-      castDate(f5) <=> dateLit || castDate(f6) === dateLit,
-      (f5 >= castTimestamp(dateLit) && f5 < castTimestamp(dateAddOne)) ||
-        (f6 >= castTimestampNTZ(dateLit) && f6 < castTimestampNTZ(dateAddOne)))
+      castDate(f5) <=> dateLit || castDate(f6) <=> dateLit,
+      (f5 >= castTimestamp(dateLit) && f5 < castTimestamp(dateAddOne)) || castDate(f6) <=> dateLit)

Review Comment:
   shall we test nullable column as well?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org