Posted to issues@spark.apache.org by "Bruce Robbins (JIRA)" <ji...@apache.org> on 2018/09/17 20:19:00 UTC

[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null

    [ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618104#comment-16618104 ] 

Bruce Robbins commented on SPARK-22036:
---------------------------------------

[~mgaido] In this change, you modified how precision and scale are determined when literals are promoted to decimal. For example, before the change, an integer literal was hardcoded to DecimalType(10, 0); after the change, its precision is based on the number of digits in the literal.

However, that new behavior for literals is not toggled by {{spark.sql.decimalOperations.allowPrecisionLoss}} like the other changes in behavior introduced by the PR.

As a result, there are cases where we see truncation and rounding in 2.3/2.4 that we don't see in 2.2, and this change in behavior is not controllable via the configuration setting. For example:

In 2.2:
{noformat}
scala> sql("select 26393499451/(1e6 * 1000) as c1").printSchema
root
 |-- c1: decimal(27,13) (nullable = true) <== 13 decimal digits
scala> sql("select 26393499451/(1e6 * 1000) as c1").show
+----------------+
|              c1|
+----------------+
|26.3934994510000|
+----------------+
{noformat}
In 2.3 and up:
{noformat}
scala> sql("set spark.sql.decimalOperations.allowPrecisionLoss").show
+--------------------+-----+
|                 key|value|
+--------------------+-----+
|spark.sql.decimal...| true|
+--------------------+-----+
scala> sql("select 26393499451/(1e6 * 1000) as c1").printSchema
root
 |-- c1: decimal(12,7) (nullable = true)
scala> sql("select 26393499451/(1e6 * 1000) as c1").show
+----------+
|        c1|
+----------+
|26.3934995| <== result is truncated and rounded up.
+----------+
scala> sql("set spark.sql.decimalOperations.allowPrecisionLoss=false").show
+--------------------+-----+
|                 key|value|
+--------------------+-----+
|spark.sql.decimal...|false|
+--------------------+-----+
scala> sql("select 26393499451/(1e6 * 1000) as c1").printSchema
root
 |-- c1: decimal(12,7) (nullable = true)
scala> sql("select 26393499451/(1e6 * 1000) as c1").show
+----------+
|        c1|
+----------+
|26.3934995| <== result is still truncated and rounded up.
+----------+
scala> 
{noformat}
I can force it to behave the old way, at least for this case, by explicitly casting the literal:
{noformat}
scala> sql("select 26393499451/(1e6 * cast(1000 as decimal(10, 0))) as c1").show
+----------------+
|              c1|
+----------------+
|26.3934994510000|
+----------------+
{noformat}
Do you think it makes sense for {{spark.sql.decimalOperations.allowPrecisionLoss}} to also toggle how literal promotion happens (the old way vs. the new way)?
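For reference, the precision/scale arithmetic behind the schemas above can be sketched roughly as follows. This is my own illustrative sketch, not Spark's actual API; the object and method names are mine, and only the constants (MAX_PRECISION = 38, minimum adjusted scale = 6) and the Hive-style division rule are taken from Spark's documented behavior:

```scala
// Illustrative sketch (not Spark code) of the decimal type arithmetic
// discussed above. Constants match Spark's MAX_PRECISION = 38 and
// MINIMUM_ADJUSTED_SCALE = 6; everything else is named for clarity here.
object DecimalRulesSketch {
  val MaxPrecision = 38
  val MinAdjustedScale = 6

  // 2.3+ literal promotion: precision follows the literal's digit count,
  // e.g. 1000 -> (4, 0), where 2.2 used the fixed (10, 0) for an Int.
  def literalType(v: Long): (Int, Int) =
    (math.max(math.abs(v).toString.length, 1), 0)

  // Hive-style division rule: p1 - s1 + s2 integral digits plus a scale of
  // max(6, s1 + p2 + 1) fractional digits.
  def divide(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
    val scale = math.max(MinAdjustedScale, s1 + p2 + 1)
    (p1 - s1 + s2 + scale, scale)
  }

  // With allowPrecisionLoss=true, an over-wide result is squeezed into 38
  // digits, sacrificing scale first but keeping at least 6 fractional digits.
  def adjust(precision: Int, scale: Int): (Int, Int) =
    if (precision <= MaxPrecision) (precision, scale)
    else {
      val intDigits = precision - scale
      val adjustedScale =
        math.max(MaxPrecision - intDigits, math.min(scale, MinAdjustedScale))
      (MaxPrecision, adjustedScale)
    }
}
```

Because the literal's digit count now feeds into the division rule's inputs, a smaller literal type can shrink the result's scale regardless of the config flag, which is the effect shown in the schemas above.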

> BigDecimal multiplication sometimes returns null
> ------------------------------------------------
>
>                 Key: SPARK-22036
>                 URL: https://issues.apache.org/jira/browse/SPARK-22036
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Olivier Blanvillain
>            Assignee: Marco Gaido
>            Priority: Major
>             Fix For: 2.3.0
>
>
> The multiplication of two BigDecimal numbers sometimes returns null. Here is a minimal reproduction:
> {code:java}
> object Main extends App {
>   import org.apache.spark.SparkConf
>   import org.apache.spark.sql.SparkSession
>
>   val conf = new SparkConf().setMaster("local[*]").setAppName("REPL").set("spark.ui.enabled", "false")
>   val spark = SparkSession.builder().config(conf).getOrCreate()
>   import spark.implicits._ // must come after spark is defined
>   implicit val sqlContext = spark.sqlContext
>
>   case class X2(a: BigDecimal, b: BigDecimal)
>   val ds = sqlContext.createDataset(List(X2(BigDecimal(-0.1267333984375), BigDecimal(-1000.1))))
>   val result = ds.select(ds("a") * ds("b")).collect.head
>   println(result) // [null]
> }
> {code}
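For anyone hitting the null on 2.2: it falls out of the pre-fix multiply rule. A rough sketch of my own (not Spark code), assuming the usual mapping of Scala BigDecimal columns to decimal(38, 18) and the 2.2 rule bounded(p1 + p2 + 1, s1 + s2), which capped precision and scale at 38 independently:

```scala
// Rough illustration (not Spark code) of why the repro above returned null
// in Spark 2.2. BigDecimal columns default to decimal(38, 18), and the 2.2
// multiply rule capped precision and scale at 38 independently.
object MultiplyOverflowSketch {
  val MaxPrecision = 38

  def multiply22(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) =
    (math.min(p1 + p2 + 1, MaxPrecision), math.min(s1 + s2, MaxPrecision))

  def main(args: Array[String]): Unit = {
    val (p, s) = multiply22(38, 18, 38, 18) // capped to (38, 36)
    val integralDigits = p - s              // only 2 integral digits remain
    // -0.1267333984375 * -1000.1 is about 126.75, which needs 3 integral
    // digits, so the value could not be represented and 2.2 produced null.
    println(s"result type: decimal($p,$s), integral digits = $integralDigits")
  }
}
```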



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
