Posted to issues@spark.apache.org by "Vinod KC (Jira)" <ji...@apache.org> on 2022/04/27 16:30:00 UTC
[jira] [Comment Edited] (SPARK-25177) When dataframe decimal type column having scale higher than 6, 0 values are shown in scientific notation
[ https://issues.apache.org/jira/browse/SPARK-25177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528877#comment-17528877 ]
Vinod KC edited comment on SPARK-25177 at 4/27/22 4:29 PM:
-----------------------------------------------------------
In case anyone is looking for a workaround to convert 0 in scientific notation to plain text, this code snippet may help.
{code:java}
import org.apache.spark.sql.functions.{col, udf}
import org.apache.spark.sql.types.Decimal

// For scale > 6, render via toPlainString so that zero is not shown as 0E-7
val handleBigDecZeroUDF = udf((decimalVal: Decimal) => {
  if (decimalVal.scale > 6) {
    decimalVal.toBigDecimal.bigDecimal.toPlainString()
  } else {
    decimalVal.toString()
  }
})

spark.sql("create table testBigDec (a decimal(10,7), b decimal(10,6), c decimal(10,8))")
spark.sql("insert into testBigDec values(0, 0, 0)")
spark.sql("insert into testBigDec values(1, 1, 1)")
val df = spark.table("testBigDec")
df.show(false) // zero in columns a and c is shown in scientific notation

// use custom UDF `handleBigDecZeroUDF` to convert zero into plain-text notation
df.select(handleBigDecZeroUDF(col("a")).as("a"), col("b"), handleBigDecZeroUDF(col("c")).as("c")).show(false)

// Result of df.show(false)
+---------+--------+----------+
|a        |b       |c         |
+---------+--------+----------+
|0E-7     |0.000000|0E-8      |
|1.0000000|1.000000|1.00000000|
+---------+--------+----------+

// Result using handleBigDecZeroUDF
+---------+--------+----------+
|a        |b       |c         |
+---------+--------+----------+
|0.0000000|0.000000|0.00000000|
|1.0000000|1.000000|1.00000000|
+---------+--------+----------+
{code}
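The `0E-7` output matches the documented behaviour of `java.math.BigDecimal.toString`, which switches to scientific notation when the adjusted exponent is below -6, while `toPlainString` never uses an exponent. The sketch below reproduces this without Spark. It assumes Spark's display ultimately goes through `BigDecimal.toString`, which the output above suggests but which is not confirmed here. `PlainDecimalDemo` and `toPlain` are hypothetical names used only for illustration.

```scala
import java.math.{BigDecimal => JBigDecimal}

object PlainDecimalDemo {
  // Hypothetical helper (not a Spark API): render a decimal without an
  // exponent, passing null through unchanged.
  def toPlain(value: JBigDecimal): String =
    if (value == null) null else value.toPlainString

  def main(args: Array[String]): Unit = {
    val zeroScale6 = new JBigDecimal("0").setScale(6)
    val zeroScale7 = new JBigDecimal("0").setScale(7)
    val oneScale7  = new JBigDecimal("1").setScale(7)
    val tinyScale7 = new JBigDecimal("0.0000001")

    println(zeroScale6.toString) // 0.000000  (adjusted exponent -6: plain)
    println(zeroScale7.toString) // 0E-7      (adjusted exponent -7: scientific)
    println(oneScale7.toString)  // 1.0000000 (adjusted exponent 0: plain)
    println(tinyScale7.toString) // 1E-7      (non-zero values are affected too)

    println(toPlain(zeroScale7)) // 0.0000000
    println(toPlain(tinyScale7)) // 0.0000001
  }
}
```

Small non-zero values such as `0.0000001` are rendered as `1E-7` as well, so a check on scale alone, as in the UDF above, also covers them. The null guard in `toPlain` is an addition for illustration; the UDF above does not include one.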
was (Author: vinodkc):
In case anyone is looking for a workaround to convert 0 in scientific notation to plain text, this code snippet may help.
{code:java}
import org.apache.spark.sql.types.Decimal
val handleBigDecZeroUDF = udf((decimalVal: Decimal) => {
  if (decimalVal.scale > 6) {
    decimalVal.toBigDecimal.bigDecimal.toPlainString()
  } else {
    decimalVal.toString()
  }
})
spark.sql("create table testBigDec (a decimal(10,7), b decimal(10,6), c decimal(10,8))")
spark.sql("insert into testBigDec values(0, 0,0)")
spark.sql("insert into testBigDec values(1, 1, 1)")
val df = spark.table("testBigDec")
df.show(false) // this will show scientific notation
// use custom UDF `handleBigDecZeroUDF` to convert zero into plainText notation
df.select(handleBigDecZeroUDF(col("a")).as("a"), md5(handleBigDecZeroUDF(col("a"))).as("a-md5"), col("b"), handleBigDecZeroUDF(col("c")).as("c")).show(false)
{code}
> When dataframe decimal type column having scale higher than 6, 0 values are shown in scientific notation
> --------------------------------------------------------------------------------------------------------
>
> Key: SPARK-25177
> URL: https://issues.apache.org/jira/browse/SPARK-25177
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.0
> Reporter: Vinod KC
> Priority: Minor
> Labels: bulk-closed
>
> If the scale of a decimal type is greater than 6, a 0 value is shown in scientific notation; as a result, saving the dataframe output to an external database fails because of the scientific notation on "0" values.
> E.g., in Spark:
> --------------
> spark.sql("create table test (a decimal(10,7), b decimal(10,6), c decimal(10,8))")
> spark.sql("insert into test values(0, 0,0)")
> spark.sql("insert into test values(1, 1, 1)")
> spark.table("test").show()
> | a         | b        | c          |
> | 0E-7      | 0.000000 | 0E-8       |
> | 1.0000000 | 1.000000 | 1.00000000 |
> Note: if scale > 6, zero is displayed in scientific notation (columns a and c).
>
> E.g., in PostgreSQL:
> --------------
> CREATE TABLE Testdec (a DECIMAL(10,7), b DECIMAL(10,6), c DECIMAL(10,8));
> INSERT INTO Testdec VALUES (0,0,0);
> INSERT INTO Testdec VALUES (1,1,1);
> select * from Testdec;
> Result:
>      a     |    b     |     c
> -----------+----------+------------
>  0.0000000 | 0.000000 | 0.00000000
>  1.0000000 | 1.000000 | 1.00000000
> We can make the Spark SQL result consistent with other databases such as PostgreSQL.
>