Posted to dev@spark.apache.org by robert <zh...@yahoo.com> on 2016/11/11 19:53:50 UTC
spark sql query of nested json lists data
I am new to Spark SQL development. I have a JSON file with nested arrays,
and I can extract and query these arrays. However, when I add an ORDER BY
clause, I get an exception. Here are the steps:
1) val a = sparkSession.sql("""SELECT Tables.TableName, Tables.TableType,
Tables.TableExecOrder, Tables.Columns FROM tblConfig LATERAL VIEW
explode(TargetTable.Tables[0]) s AS Tables""")
a.show(5)
Output:
+---------+---------+--------------+--------------------+
|TableName|TableType|TableExecOrder| Columns|
+---------+---------+--------------+--------------------+
| TB0| Final| 0|[[name,INT], [nam...|
| TB1| temp| 2|[[name,INT], [nam...|
| TB2| temp| 1|[[name,INT], [nam...|
+---------+---------+--------------+--------------------+
2) a.createOrReplaceTempView("aa")
sparkSession.sql("""SELECT TableName, TableExecOrder, Columns FROM aa
ORDER BY TableExecOrder""").show(5)
Output: exception
16/11/11 11:17:00 ERROR TaskResultGetter: Exception while getting task result
com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException
Serialization trace:
underlying (org.apache.spark.util.BoundedPriorityQueue)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:25)
at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:19)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:312)
at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:87)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:66)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1793)
at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.compare(GenerateOrdering.scala:157)
at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.compare(GenerateOrdering.scala:148)
at scala.math.Ordering$$anon$4.compare(Ordering.scala:111)
at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:669)
at java.util.PriorityQueue.siftUp(PriorityQueue.java:645)
at java.util.PriorityQueue.offer(PriorityQueue.java:344)
at java.util.PriorityQueue.add(PriorityQueue.java:321)
at com.twitter.chill.java.PriorityQueueSerializer.read(PriorityQueueSerializer.java:78)
at com.twitter.chill.java.PriorityQueueSerializer.read(PriorityQueueSerializer.java:31)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
... 15 more
How can I fix this?