You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/11/04 18:42:58 UTC
[jira] [Assigned] (SPARK-18260) from_json can throw a better
exception when it can't find the column or be nullSafe
[ https://issues.apache.org/jira/browse/SPARK-18260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-18260:
------------------------------------
Assignee: (was: Apache Spark)
> from_json can throw a better exception when it can't find the column or be nullSafe
> -----------------------------------------------------------------------------------
>
> Key: SPARK-18260
> URL: https://issues.apache.org/jira/browse/SPARK-18260
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Burak Yavuz
> Priority: Blocker
>
> I got this exception:
> {code}
> SparkException: Job aborted due to stage failure: Task 0 in stage 13028.0 failed 4 times, most recent failure: Lost task 0.3 in stage 13028.0 (TID 74170, 10.0.138.84, executor 2): java.lang.NullPointerException
> at org.apache.spark.sql.catalyst.expressions.JsonToStruct.eval(jsonExpressions.scala:490)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown Source)
> at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:71)
> at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$$anonfun$create$2.apply(GeneratePredicate.scala:71)
> at org.apache.spark.sql.execution.FilterExec$$anonfun$17$$anonfun$apply$2.apply(basicPhysicalOperators.scala:211)
> at org.apache.spark.sql.execution.FilterExec$$anonfun$17$$anonfun$apply$2.apply(basicPhysicalOperators.scala:210)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
> at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:804)
> at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:804)
> {code}
> This was because the column that I called `from_json` on didn't exist for all of my rows. Either from_json should be null safe, or it should fail with a better error message
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org