Posted to issues@spark.apache.org by "Jonathan Simozar (JIRA)" <ji...@apache.org> on 2016/09/27 22:12:20 UTC

[jira] [Created] (SPARK-17695) Deserialization error when using DataFrameReader.json on JSON line that contains an empty JSON object

Jonathan Simozar created SPARK-17695:
----------------------------------------

             Summary: Deserialization error when using DataFrameReader.json on JSON line that contains an empty JSON object
                 Key: SPARK-17695
                 URL: https://issues.apache.org/jira/browse/SPARK-17695
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
         Environment: Scala 2.11.7
            Reporter: Jonathan Simozar


When the {{DataFrameReader}} method {{json}} reads the JSON line
{noformat}{"field1":{},"field2":"a"}{noformat}
{{field1}}, whose value is an empty JSON object, is dropped during deserialization.


This can be reproduced with the example below.
{code:java}
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// create spark context
val sc: SparkContext = new SparkContext("local[*]", "My App")
// create spark session
val sparkSession: SparkSession = SparkSession.builder().config(sc.getConf).getOrCreate()
// create an RDD holding a single JSON line
val strings = sc.parallelize(Seq(
  """{"field1":{},"field2":"a"}"""
))
// read the RDD as a DataFrame, serialize each row back to a JSON string, and print the lines to stdout
sparkSession.read.json(strings)
  .toJSON.collect().foreach(println)
{code}
*stdout*
{noformat}
{"field2":"a"}
{noformat}
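
One way to narrow down where {{field1}} is lost is to inspect the schema that {{json}} infers, before anything is serialized back with {{toJSON}}. The snippet below is only a sketch: it uses the same input line as the reproduction, the name {{df}} is introduced purely for illustration, and no output is shown because none was captured for this report.
{code:java}
import org.apache.spark.sql.SparkSession

// reuse an existing session if one is running, otherwise start a local one
val sparkSession: SparkSession = SparkSession.builder()
  .master("local[*]")
  .appName("My App")
  .getOrCreate()

// same single JSON line as in the reproduction
val strings = sparkSession.sparkContext.parallelize(Seq(
  """{"field1":{},"field2":"a"}"""
))

// df is a hypothetical name used only in this sketch
val df = sparkSession.read.json(strings)

// print the inferred schema as a tree
df.printSchema()
// list the top-level column names that survived inference
println(df.columns.mkString(", "))
{code}
If {{field1}} is missing from the printed schema, the field is dropped while the schema is inferred rather than when the rows are written back out as JSON.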


