You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/15 03:58:51 UTC

[GitHub] [spark] WangGuangxin opened a new pull request, #36873: [SPARK-24994][SQL][FOLLOW-UP] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

WangGuangxin opened a new pull request, #36873:
URL: https://github.com/apache/spark/pull/36873

   ### What changes were proposed in this pull request?
   Cast from Integer to Float or from Long to Double/Float may loss precision if the length of Integer/Long beyonds the **significant digits** of a Double(which is 15 or 16 digits) or Float(which is 7 or 8 digits).
   
   For example, ```select *, cast(a as int) from (select cast(33554435 as foat) a )``` gives `33554436` instead of `33554435`.
   
   When it comes the optimization rule `UnwrapCastInBinaryComparison`, it may result in incorrect (confused) result .
   We can reproduce it with following script.
   ```
   spark.range(10).map(i => 64707595868612313L).createOrReplaceTempView("tbl")
   val df = sql("select * from tbl where cast(value as double) = cast('64707595868612313' as double)")
   df.explain(true)
   df.show()
   ```
   
   With we disable this optimization rule , it returns 10 records.
   But if we enable this optimization rule, it returns empty, since the sql is optimized to
   ```
   select * from tbl where value = 64707595868612312L
   ``` 
   
   ### Why are the changes needed?
   Fix the behavior that may confuse users (or maybe a bug?)
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Add a new UT


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on PR #36873:
URL: https://github.com/apache/spark/pull/36873#issuecomment-1157137986

   thanks, merging to master/3.3/3.2/3.1!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on pull request #36873: [SPARK-24994][SQL][FOLLOW-UP] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #36873:
URL: https://github.com/apache/spark/pull/36873#issuecomment-1155959494

   you should file a new ticket since SPARK-24994 is already released ..


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float
URL: https://github.com/apache/spark/pull/36873


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #36873:
URL: https://github.com/apache/spark/pull/36873#issuecomment-1156803849

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36873:
URL: https://github.com/apache/spark/pull/36873#discussion_r897659097


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala:
##########
@@ -358,8 +358,16 @@ object UnwrapCastInBinaryComparison extends Rule[LogicalPlan] {
     toType.sameType(literalType) &&
       !fromExp.foldable &&
       toType.isInstanceOf[NumericType] &&
-      ((fromExp.dataType.isInstanceOf[NumericType] && Cast.canUpCast(fromExp.dataType, toType)) ||
-        fromExp.dataType.isInstanceOf[BooleanType])
+      canUnwrapCast(fromExp.dataType, toType)
+  }
+
+  private def canUnwrapCast(from: DataType, to: DataType): Boolean = (from, to) match {
+    case (BooleanType, _) => true
+    case (IntegerType, FloatType) => false

Review Comment:
   can we add some code comments to explain the reason?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] WangGuangxin commented on pull request #36873: [SPARK-39476][SQL] Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float

Posted by GitBox <gi...@apache.org>.
WangGuangxin commented on PR #36873:
URL: https://github.com/apache/spark/pull/36873#issuecomment-1155994453

   > you should file a new ticket since [SPARK-24994](https://issues.apache.org/jira/browse/SPARK-24994) is already released ..
   
   Thanks for remind, updated. @ulysses-you  @sunchao @dongjoon-hyun @cloud-fan Please help review this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org