Posted to issues@spark.apache.org by "JESSE CHEN (JIRA)" <ji...@apache.org> on 2016/04/21 22:21:13 UTC
[jira] [Commented] (SPARK-14096) SPARK-SQL CLI returns NPE
[ https://issues.apache.org/jira/browse/SPARK-14096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252592#comment-15252592 ]
JESSE CHEN commented on SPARK-14096:
------------------------------------
Duplicate of SPARK-14521.
> SPARK-SQL CLI returns NPE
> -------------------------
>
> Key: SPARK-14096
> URL: https://issues.apache.org/jira/browse/SPARK-14096
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: JESSE CHEN
>
> Running TPC-DS query 06 in the spark-sql shell failed with the following error in the middle of a stage, while another query (38) succeeded:
> NPE:
> {noformat}
> 16/03/22 15:12:56 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 10.0, whose tasks have all completed, from pool
> 16/03/22 15:12:56 INFO scheduler.TaskSetManager: Finished task 65.0 in stage 10.0 (TID 622) in 171 ms on localhost (30/200)
> 16/03/22 15:12:56 ERROR scheduler.TaskResultGetter: Exception while getting task result
> com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException
> Serialization trace:
> underlying (org.apache.spark.util.BoundedPriorityQueue)
> at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:626)
> at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
> at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:25)
> at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:19)
> at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
> at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:312)
> at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:87)
> at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:66)
> at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
> at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:57)
> at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1790)
> at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:56)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.compare(GenerateOrdering.scala:157)
> at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.compare(GenerateOrdering.scala:148)
> at scala.math.Ordering$$anon$4.compare(Ordering.scala:111)
> at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:669)
> at java.util.PriorityQueue.siftUp(PriorityQueue.java:645)
> at java.util.PriorityQueue.offer(PriorityQueue.java:344)
> at java.util.PriorityQueue.add(PriorityQueue.java:321)
> at com.twitter.chill.java.PriorityQueueSerializer.read(PriorityQueueSerializer.java:78)
> at com.twitter.chill.java.PriorityQueueSerializer.read(PriorityQueueSerializer.java:31)
> at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
> at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
> ... 15 more
> 16/03/22 15:12:56 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 10.0, whose tasks have all completed, from pool
> 16/03/22 15:12:56 INFO scheduler.TaskSetManager: Finished task 66.0 in stage 10.0 (TID 623) in 171 ms on localhost (31/200)
> 16/03/22 15:12:56 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 10.0, whose tasks have all completed, from pool
> {noformat}
> query 06 (caused the above NPE):
> {noformat}
> select a.ca_state state, count(*) cnt
> from customer_address a
> join customer c on a.ca_address_sk = c.c_current_addr_sk
> join store_sales s on c.c_customer_sk = s.ss_customer_sk
> join date_dim d on s.ss_sold_date_sk = d.d_date_sk
> join item i on s.ss_item_sk = i.i_item_sk
> join (select distinct d_month_seq
> from date_dim
> where d_year = 2001
> and d_moy = 1 ) tmp1 ON d.d_month_seq = tmp1.d_month_seq
> join
> (select j.i_category, avg(j.i_current_price) as avg_i_current_price
> from item j group by j.i_category) tmp2 on tmp2.i_category = i.i_category
> where
> i.i_current_price > 1.2 * tmp2.avg_i_current_price
> group by a.ca_state
> having count(*) >= 10
> order by cnt
> limit 100;
> {noformat}
> query 38 (succeeded):
> {noformat}
> select count(*) from (
> select distinct c_last_name, c_first_name, d_date
> from store_sales, date_dim, customer
> where store_sales.ss_sold_date_sk = date_dim.d_date_sk
> and store_sales.ss_customer_sk = customer.c_customer_sk
> and d_month_seq between 1200 and 1200 + 11
> intersect
> select distinct c_last_name, c_first_name, d_date
> from catalog_sales, date_dim, customer
> where catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
> and catalog_sales.cs_bill_customer_sk = customer.c_customer_sk
> and d_month_seq between 1200 and 1200 + 11
> intersect
> select distinct c_last_name, c_first_name, d_date
> from web_sales, date_dim, customer
> where web_sales.ws_sold_date_sk = date_dim.d_date_sk
> and web_sales.ws_bill_customer_sk = customer.c_customer_sk
> and d_month_seq between 1200 and 1200 + 11
> ) hot_cust
> limit 100;
> {noformat}
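The trace above shows the NullPointerException being thrown from a comparator while a priority queue is rebuilt during deserialization (PriorityQueueSerializer.read -> PriorityQueue.add -> LazilyGeneratedOrdering.compare). The following is a minimal sketch of that general failure mode, not Spark's code and not a confirmed root cause: it assumes a comparator that keeps its real ordering in a transient field, and it uses plain Java serialization in place of Kryo. The names TransientOrderingSketch, LazyOrdering, reinitOnRead, roundTrip and rebuild are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;
import java.util.PriorityQueue;

public class TransientOrderingSketch {

    /** Hypothetical stand-in for an ordering whose real comparator is not serialized. */
    static class LazyOrdering implements Comparator<Integer>, Serializable {
        private static final long serialVersionUID = 1L;

        // transient: skipped on write, and field initializers do not run on read,
        // so this is null after deserialization unless readObject rebuilds it.
        private transient Comparator<Integer> generated = Comparator.naturalOrder();
        private final boolean reinitOnRead;

        LazyOrdering(boolean reinitOnRead) {
            this.reinitOnRead = reinitOnRead;
        }

        @Override
        public int compare(Integer a, Integer b) {
            return generated.compare(a, b); // throws NPE if generated was never rebuilt
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            if (reinitOnRead) {
                generated = Comparator.naturalOrder();
            }
        }
    }

    /** Serializes a value to bytes and reads it back. */
    static <T> T roundTrip(T value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    /** Restores the comparator first, then re-adds each element, mirroring the order seen in the trace. */
    static PriorityQueue<Integer> rebuild(LazyOrdering ordering, int... elements) throws Exception {
        LazyOrdering restored = roundTrip(ordering);
        PriorityQueue<Integer> queue =
            new PriorityQueue<>(Math.max(1, elements.length), restored);
        for (int e : elements) {
            queue.add(e); // from the second element on, add() calls compare()
        }
        return queue;
    }

    public static void main(String[] args) throws Exception {
        try {
            rebuild(new LazyOrdering(false), 3, 1, 2);
            System.out.println("no exception");
        } catch (NullPointerException npe) {
            System.out.println("NPE while re-adding elements to the queue");
        }
        // With the transient field rebuilt on read, the queue is restored normally.
        System.out.println("head = " + rebuild(new LazyOrdering(true), 3, 1, 2).peek());
    }
}
```

In this sketch the failure only appears once the queue holds more than one element, since the first add() never calls the comparator. That is one possible reason a query with an ORDER BY ... LIMIT clause could fail where a query without a sort succeeds, but the report itself does not establish this.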