Posted to user@spark.apache.org by Arnav kumar <ak...@gmail.com> on 2018/01/31 02:48:22 UTC

Issue with Cast in Spark Sql

Hi Experts

I am trying to convert a string holding a decimal value to a decimal in Spark SQL
and load it into Hive / SQL Server.

In Hive, instead of being converted to decimal, all my values are coming
through as null.

In SQL Server, instead of decimals, the values are coming through without
their precision.

Can you please let me know if this is some kind of limitation?

Here is my code


//select the required columns from actual data frame
val query ="""select eventId,
cast(eventData.latitude as Decimal(10,10)) as Latitude,
cast(eventData.longitude as Decimal(10,10)) as Longitude from event"""

//creating event data frame
val eventTableDF = sparkSession.sql(query)
//printing the schema for debugging purpose
eventTableDF.printSchema()

root
 |-- eventId: string (nullable = true)
 |-- Latitude: decimal(10,10) (nullable = true)
 |-- Longitude: decimal(10,10) (nullable = true)



import sparkSession.implicits._

//write the data frame into the target table
eventTableDF.write.mode(org.apache.spark.sql.SaveMode.Append).insertInto(eventTable)





With Best Regards
Arnav Kumar

Re: Issue with Cast in Spark Sql

Posted by naresh Goud <na...@gmail.com>.
Spark/Hive converts a value to null if the decimal type you specify cannot
hold it, i.e. the value needs more digits than the declared precision allows;
the cast returns null on overflow instead of raising an error. The example
below gives you the details.
Note: you need to trim the string before casting it to decimal.
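
To see why the trim matters, here is a minimal plain-JVM sketch (it uses
java.math.BigDecimal rather than Spark itself; Spark's cast behaves similarly
but maps the parse failure to null instead of throwing):

```scala
import java.math.{BigDecimal => JBigDecimal}
import scala.util.Try

// An untrimmed string fails to parse as a decimal;
// Spark's cast turns such failures into null.
val padded  = " 1.00"
val parsed  = Try(new JBigDecimal(padded)).toOption      // None
val trimmed = Try(new JBigDecimal(padded.trim)).toOption // Some(1.00)

println(parsed)
println(trimmed)
```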

Table data with col1 and col2 columns


 val r = sqlContext.sql("select col2  from nd2629.test")
+-------------+
|         col2|
+-------------+
|         1.00|
|          2.0|
|      123.798|
| 123456.67777|
+-------------+



val r = sqlContext.sql("select CAST(TRIM(col2) as decimal(10,4)) from nd2629.test")

+-----------+
|        _c0|
+-----------+
|     1.0000|
|     2.0000|
|   123.7980|
|123456.6778|
+-----------+



 val r = sqlContext.sql("select CAST(TRIM(col2) as decimal(10,5)) from nd2629.test")
+---------+
|      _c0|
+---------+
|  1.00000|
|  2.00000|
|123.79800|
|     null|
+---------+


the rule is: precision minus scale must be at least the number of digits
before the decimal point.

in the above case 123456.67777 has 6 integer digits, so with precision 10 the
scale can be at most 4: decimal(10,4) fits (rounding the fraction to .6778),
while decimal(10,5) would need 11 digits in total and overflows to null.

that is also what is happening with Decimal(10,10) in your query: it leaves
zero digits for the integer part, so any latitude or longitude of 1 or more
becomes null; something like Decimal(10,7) would leave room for the integer
part.
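
Put differently, decimal(p,s) reserves p − s digits for the integer part. A
small sketch of that check (fitsDecimal is a made-up helper name; Spark
performs an equivalent digit-count test internally and returns null on
overflow):

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// Does the string value fit into decimal(precision, scale)?
// Round to the target scale first, then count total digits.
def fitsDecimal(s: String, precision: Int, scale: Int): Boolean = {
  val rounded = new JBigDecimal(s.trim).setScale(scale, RoundingMode.HALF_UP)
  rounded.precision <= precision
}

println(fitsDecimal("123456.67777", 10, 4)) // true  -> stored as 123456.6778
println(fitsDecimal("123456.67777", 10, 5)) // false -> Spark yields null
println(fitsDecimal("45.5", 10, 10))        // false -> why Decimal(10,10) nulls latitudes
```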


Thank you,
Naresh



