Posted to issues@spark.apache.org by "Jonathan (Jira)" <ji...@apache.org> on 2019/10/26 05:15:00 UTC

[jira] [Created] (SPARK-29610) Keys with Null values are discarded when using to_json function

Jonathan created SPARK-29610:
--------------------------------

             Summary: Keys with Null values are discarded when using to_json function
                 Key: SPARK-29610
                 URL: https://issues.apache.org/jira/browse/SPARK-29610
             Project: Spark
          Issue Type: Bug
          Components: Build
    Affects Versions: 2.4.4
            Reporter: Jonathan


When to_json is called on a struct, any field whose value is null is dropped from the resulting JSON string.
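{code:python}
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

# In the pyspark shell `spark` already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

# The second row has a null in the `data` column.
rows = [("a", "foo"), ("b", None)]
df = spark.createDataFrame(rows, ["id", "data"])
(
  df.select(F.struct("*").alias("payload"))
    .withColumn("payload", F.to_json(F.col("payload")))
    .select("payload")
    .show()
){code}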
This produces the following output:
{noformat}
+--------------------+
|             payload|
+--------------------+
|{"id":"a","data":...|
|          {"id":"b"}|
+--------------------+{noformat}
The `data` key in the second row is silently dropped instead of being emitted as "data":null.
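
To make the difference concrete, here is a minimal pure-Python sketch (no Spark involved) that contrasts the two serializations of the second row. The helper `drop_null_keys` is a hypothetical name that only mimics the reported behaviour; it is not Spark's implementation. Python's standard `json` module stands in for a serializer that keeps null-valued keys.

```python
import json


def drop_null_keys(row):
    """Hypothetical helper: mimic the reported behaviour by omitting None-valued keys."""
    return {key: value for key, value in row.items() if value is not None}


# Second row from the report, as a plain dict.
row = {"id": "b", "data": None}

# Compact separators match the spacing in the report's output.
observed = json.dumps(drop_null_keys(row), separators=(",", ":"))
expected = json.dumps(row, separators=(",", ":"))

print(observed)  # {"id":"b"}
print(expected)  # {"id":"b","data":null}
```

A consumer that parses the first form cannot distinguish "field was null" from "field does not exist in the schema", which is why the dropped key matters.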


