Posted to user@spark.apache.org by Meeraj Kunnumpurath <me...@servicesymphony.com> on 2016/10/09 01:50:39 UTC

Scientific Notation and Precision Error

Hello,

I have a dataset in which some of the rows for a numeric column have values
represented in scientific notation. When I enable schema inference, any
operation involving those rows fails with a precision error while setting the
decimal value. An example of a literal that causes the issue is 1.225e+006.
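For illustration (my own analysis, not confirmed against Spark internals), the
mismatch is visible with plain java.math.BigDecimal: in non-scientific form
1.225e+006 is 1225000, which needs seven significant digits, one more than a
decimal type capped at precision 6 can hold:

```scala
// The problematic literal, parsed with java.math.BigDecimal.
val bd = new java.math.BigDecimal("1.225e+006")

// In plain notation the value is 1225000: seven significant digits.
println(bd.toPlainString)          // 1225000
println(bd.setScale(0).precision)  // 7
```

This matches the "Decimal precision 7 exceeds max precision 6" message below,
assuming the inference pass derived a 6-digit decimal type from the other rows.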

scala> val df = spark.read.option("header", "true").option("inferSchema", "true").csv("sales_data.csv")
df: org.apache.spark.sql.DataFrame = [id: bigint, date: string ... 19 more fields]

scala> df.select(sum("price")).show

16/10/09 05:46:01 ERROR Executor: Exception in task 0.0 in stage 63.0 (TID 68)
java.lang.IllegalArgumentException: requirement failed: Decimal precision 7 exceeds max precision 6
        at scala.Predef$.require(Predef.scala:224)
        at org.apache.spark.sql.types.Decimal.set(Decimal.scala:112)
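One workaround I am considering (a sketch only; the field names other than
id, date, and price are placeholders for the remaining columns) is to skip
inference and supply an explicit schema that reads the column as DoubleType,
which parses scientific notation without a fixed precision cap:

```scala
import org.apache.spark.sql.functions.sum
import org.apache.spark.sql.types._

// Explicit schema instead of inferSchema; only the first few of the
// 21 columns are shown, the rest would be listed the same way.
val schema = StructType(Seq(
  StructField("id", LongType),
  StructField("date", StringType),
  StructField("price", DoubleType)  // scientific notation parses as a double
))

val df = spark.read
  .option("header", "true")
  .schema(schema)
  .csv("sales_data.csv")

df.select(sum("price")).show()
```

This assumes a `spark` session is already in scope, as in the spark-shell
transcript above.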
Many thanks

-- 
Meeraj Kunnumpurath
Director and Executive Principal
Service Symphony Ltd
00 44 7702 693597
00 971 50 409 0169
meeraj@servicesymphony.com <me...@servicesymphony.com>