Posted to issues@spark.apache.org by "Liang-Chi Hsieh (JIRA)" <ji...@apache.org> on 2016/03/03 08:36:18 UTC

[jira] [Commented] (SPARK-13612) Multiplication of BigDecimal columns not working as expected

    [ https://issues.apache.org/jira/browse/SPARK-13612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177428#comment-15177428 ] 

Liang-Chi Hsieh commented on SPARK-13612:
-----------------------------------------

Because the internal type for BigDecimal defaults to Decimal(38, 18) (you can verify this by printing the schema of x and y), the result of x("a") * y("b") has scale 18 + 18 = 36, and the required precision, 38 + 38 + 1 = 77, exceeds the maximum of 38. The multiplication is detected as overflowing, so you get a null value back.

You can cast the decimal columns to a suitable precision and scale, e.g.:

{code}

val newX = x.withColumn("a", x("a").cast(DecimalType(10, 1)))
val newY = y.withColumn("b", y("b").cast(DecimalType(10, 1)))

newX.join(newY, newX("id") === newY("id")).withColumn("z", newX("a") * newY("b")).show

+---+----+---+----+------+
| id|   a| id|   b|     z|
+---+----+---+----+------+
|  1|10.0|  1|10.0|100.00|
+---+----+---+----+------+

{code}
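The overflow reasoning above can be sketched outside Spark. This is a minimal illustration of the decimal result-type rule Spark SQL applies to multiplication (result precision = p1 + p2 + 1, result scale = s1 + s2, with precision capped at 38); DecimalMultiplySketch and its methods are hypothetical helpers written for this example, not Spark APIs:

```scala
// Sketch of the result-type rule Spark SQL uses for decimal multiplication:
// result precision = p1 + p2 + 1, result scale = s1 + s2.
// With two Decimal(38, 18) inputs the required precision is 77, which is
// capped at the 38-digit maximum, leaving only 38 - 36 = 2 integer digits,
// so a value like 100 overflows and Spark returns null.
object DecimalMultiplySketch {
  val MaxPrecision = 38

  // Hypothetical helper mirroring the rule; returns (precision, scale).
  def multiplyResultType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
    val precision = math.min(p1 + p2 + 1, MaxPrecision)
    val scale = s1 + s2
    (precision, scale)
  }

  // Can a value with `intDigits` digits before the decimal point fit
  // into a Decimal(precision, scale)?
  def fits(intDigits: Int, precision: Int, scale: Int): Boolean =
    intDigits <= precision - scale

  def main(args: Array[String]): Unit = {
    // Default Decimal(38, 18) inputs: result is Decimal(38, 36).
    val (p1, s1) = multiplyResultType(38, 18, 38, 18)
    println(s"default: Decimal($p1, $s1), 100 fits: ${fits(3, p1, s1)}")

    // After casting both sides to Decimal(10, 1): result is Decimal(21, 2),
    // which comfortably holds 100.00.
    val (p2, s2) = multiplyResultType(10, 1, 10, 1)
    println(s"after cast: Decimal($p2, $s2), 100 fits: ${fits(3, p2, s2)}")
  }
}
```

This also explains why the result above prints as 100.00: the Decimal(10, 1) inputs produce a result scale of 1 + 1 = 2.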


> Multiplication of BigDecimal columns not working as expected
> ------------------------------------------------------------
>
>                 Key: SPARK-13612
>                 URL: https://issues.apache.org/jira/browse/SPARK-13612
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Varadharajan
>
> Please consider the below snippet:
> {code}
> case class AM(id: Int, a: BigDecimal)
> case class AX(id: Int, b: BigDecimal)
> val x = sc.parallelize(List(AM(1, 10))).toDF
> val y = sc.parallelize(List(AX(1, 10))).toDF
> x.join(y, x("id") === y("id")).withColumn("z", x("a") * y("b")).show
> {code}
> output:
> {code}
> | id|                   a| id|                   b|   z|
> |  1|10.00000000000000...|  1|10.00000000000000...|null|
> {code}
> Here the multiplication of the columns ("z") returns null instead of 100.
> For now we are using the workaround below, but this definitely looks like a serious issue.
> {code}
> x.join(y, x("id") === y("id")).withColumn("z", x("a") / (expr("1") / y("b"))).show
> {code}
> {code}
> | id|                   a| id|                   b|                   z|
> |  1|10.00000000000000...|  1|10.00000000000000...|100.0000000000000...|
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
