Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:23:39 UTC

[jira] [Updated] (SPARK-15835) The read path of json doesn't support write path when schema contains Options

     [ https://issues.apache.org/jira/browse/SPARK-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-15835:
---------------------------------
    Labels: bulk-closed  (was: )

> The read path of json doesn't support write path when schema contains Options
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-15835
>                 URL: https://issues.apache.org/jira/browse/SPARK-15835
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Burak Yavuz
>            Priority: Major
>              Labels: bulk-closed
>
> My schema contains optional fields. When these fields are written as JSON and every record has None for them, the writer omits the field entirely. On read, the field is therefore missing from the inferred columns, and converting to the Dataset throws an exception.
> Either the writer should emit such fields as `null`, or the Dataset should not require a field to exist in the DataFrame when that field is an Option (which may be the better solution).
> {code}
> case class Bug(field1: String, field2: Option[String])
> Seq(Bug("abc", None)).toDS.write.json("/tmp/sqlBug")
> spark.read.json("/tmp/sqlBug").as[Bug]
> {code}
> Stack trace:
> {code}
> org.apache.spark.sql.AnalysisException: cannot resolve '`field2`' given input columns: [field1]
> 	at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
> 	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:62)
> 	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:59)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:287)
> 	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:287)
> 	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:68)
> {code}
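
For anyone hitting this, a possible workaround (a minimal sketch, not a fix for the issue itself) is to skip schema inference on read and pass the schema derived from the case class. The object name OptionalJsonFieldSketch, the local[1] master, and the overwrite mode are assumptions added for illustration; the case class and path come from the report above.

```scala
import org.apache.spark.sql.{Encoders, SparkSession}

// Same shape as the case class in the report.
case class Bug(field1: String, field2: Option[String])

// Hypothetical wrapper object, only here to make the sketch runnable.
object OptionalJsonFieldSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SPARK-15835-sketch")
      .master("local[1]") // assumption: local run for illustration
      .getOrCreate()
    import spark.implicits._

    val path = "/tmp/sqlBug" // path used in the report

    // field2 is None in every record, so the written JSON has no field2 key.
    Seq(Bug("abc", None)).toDS.write.mode("overwrite").json(path)

    // Derive the schema from the case class instead of inferring it from the
    // files; a column absent from the JSON is then read back as null.
    val schema = Encoders.product[Bug].schema
    val bugs = spark.read.schema(schema).json(path).as[Bug].collect()

    // The null column maps back to None in the Option field.
    assert(bugs.sameElements(Array(Bug("abc", None))))

    spark.stop()
  }
}
```

With the explicit schema, `field2` exists as a nullable column even though no file contains it, so `.as[Bug]` can resolve it. On newer Spark versions the JSON writer option `ignoreNullFields` (set to `false`) may be another route, since it addresses the write side instead; check that your version supports it before relying on it.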



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
