Posted to issues@spark.apache.org by "Jonathan Simozar (JIRA)" <ji...@apache.org> on 2016/09/27 22:12:20 UTC
[jira] [Created] (SPARK-17695) Deserialization error when using
DataFrameReader.json on JSON line that contains an empty JSON object
Jonathan Simozar created SPARK-17695:
----------------------------------------
Summary: Deserialization error when using DataFrameReader.json on JSON line that contains an empty JSON object
Key: SPARK-17695
URL: https://issues.apache.org/jira/browse/SPARK-17695
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.0.0
Environment: Scala 2.11.7
Reporter: Jonathan Simozar
When the {{DataFrameReader}} method {{json}} reads the JSON line
{noformat}{"field1":{},"field2":"a"}{noformat}
{{field1}}, whose value is an empty JSON object, is dropped during deserialization.
The example below reproduces the behavior.
{code:java}
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// create spark context
val sc: SparkContext = new SparkContext("local[*]", "My App")
// create spark session
val sparkSession: SparkSession = SparkSession.builder().config(sc.getConf).getOrCreate()
// create rdd containing a single JSON line
val strings = sc.parallelize(Seq(
  """{"field1":{},"field2":"a"}"""
))
// read the JSON into a Dataset[Row], convert it back to JSON strings, and print them to stdout
sparkSession.read.json(strings)
  .toJSON.collect().foreach(println)
{code}
*stdout*
{noformat}
{"field2":"a"}
{noformat}
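To narrow down where the field is lost, it may help to inspect the inferred schema before {{toJSON}} is called. The following is only a sketch and not part of the original reproduction; the app name and the names {{spark}} and {{df}} are placeholders chosen for illustration. If {{field1}} is absent from the printed schema, the field is being dropped during schema inference rather than by {{toJSON}}.
{code:java}
import org.apache.spark.sql.SparkSession

// build a local session (the app name is an arbitrary placeholder)
val spark: SparkSession = SparkSession.builder()
  .master("local[*]")
  .appName("SPARK-17695 schema check")
  .getOrCreate()

// same single-line input as in the reproduction above
val strings = spark.sparkContext.parallelize(Seq(
  """{"field1":{},"field2":"a"}"""
))

// let Spark infer the schema from the JSON line
val df = spark.read.json(strings)

// print the schema that Spark inferred for the input
df.printSchema()

// print the top-level column names so the presence of field1 can be checked
println(df.columns.mkString(", "))
{code}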