Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2018/08/07 05:20:00 UTC
[jira] [Created] (SPARK-25040) Empty string for double and float types should be nulls in JSON
Hyukjin Kwon created SPARK-25040:
------------------------------------
Summary: Empty string for double and float types should be nulls in JSON
Key: SPARK-25040
URL: https://issues.apache.org/jira/browse/SPARK-25040
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.2.0, 2.4.0
Reporter: Hyukjin Kwon
This appears to be a behaviour change between Spark 1.6 and 2.x in whether an empty string is treated as null for double and float columns. Given the following input:
{code}
{"a":"a1","int":1,"other":4.4}
{"a":"a2","int":"","other":""}
{code}
Code:
{code}
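import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val config = new SparkConf().setMaster("local[5]").setAppName("test")
val sc = SparkContext.getOrCreate(config)
val sql = new SQLContext(sc)
// Sanity4.json holds the two records shown above
val file_path = this.getClass.getClassLoader.getResource("Sanity4.json").getFile
// schema(null): no user-specified schema, so the schema is inferred
val df = sql.read.schema(null).json(file_path)
df.show(30)
df.printSchema()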
{code}
In Spark 1.6, the result is:
{code}
+---+----+-----+
|  a| int|other|
+---+----+-----+
| a1|   1|  4.4|
| a2|null| null|
+---+----+-----+
{code}
{code}
root
|-- a: string (nullable = true)
|-- int: long (nullable = true)
|-- other: double (nullable = true)
{code}
but in Spark 2.2, the whole second row is null, including the string column a:
{code}
+----+----+-----+
|   a| int|other|
+----+----+-----+
|  a1|   1|  4.4|
|null|null| null|
+----+----+-----+
{code}
{code}
root
|-- a: string (nullable = true)
|-- int: long (nullable = true)
|-- other: double (nullable = true)
{code}
Another easy reproducer:
{code}
spark.read.schema("a DOUBLE, b FLOAT")
  .option("mode", "FAILFAST")
  .json(Seq("""{"a":"", "b": ""}""", """{"a": 1.1, "b": 1.1}""").toDS)
  .show()  // an action is needed to trigger parsing
{code}
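To make the requested behaviour concrete, here is a minimal sketch in plain Scala of the rule "empty string means null" for double and float fields. It is only an illustration under stated assumptions: the object EmptyStringAsNull and the helpers parseDouble and parseFloat are hypothetical names, not part of Spark, and this is not how Spark's JSON parser is actually implemented.

```scala
// Hypothetical helper illustrating the behaviour requested in this issue.
// This is NOT Spark's JSON parser; all names here are made up for illustration.
object EmptyStringAsNull {
  // An empty string is treated as a missing value (None, i.e. null in a DataFrame);
  // any other string must parse as a double.
  def parseDouble(raw: String): Option[Double] =
    if (raw.isEmpty) None else Some(raw.toDouble)

  // Same rule for float columns.
  def parseFloat(raw: String): Option[Float] =
    if (raw.isEmpty) None else Some(raw.toFloat)

  def main(args: Array[String]): Unit = {
    println(parseDouble(""))     // None
    println(parseDouble("4.4"))  // Some(4.4)
    println(parseFloat(""))      // None
  }
}
```

With this rule, only the empty fields of the second record above would become null, and the string column a would keep its value, which matches the Spark 1.6 output.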