Posted to user@spark.apache.org by Jeff Jones <jj...@adaptivebiotech.com> on 2016/05/20 15:15:07 UTC

StackOverflowError in Spark SQL

I’m running Spark 1.6.0 in a standalone cluster. Periodically I’ve seen StackOverflowErrors when running queries; an example is below.
In the past I’ve been able to avoid such situations by ensuring we don’t have too many arguments in ‘in’ clauses or too many unioned queries, both of which seem to trigger issues like this.
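
For reference, here’s roughly the rewrite I use for the ‘in’ clause case: chunk the big id list into smaller isin predicates and OR them together, so no single expression list gets huge. A sketch only; the column name and chunk size are placeholders:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Sketch: filter on a large id list without one enormous `in (...)`.
// "id" and the default chunk size are made up for illustration.
def filterByIds(df: DataFrame, ids: Seq[String], chunkSize: Int = 1000): DataFrame = {
  val predicate = ids
    .grouped(chunkSize)                       // split the ids into fixed-size chunks
    .map(chunk => col("id").isin(chunk: _*))  // one modest isin per chunk
    .reduce(_ || _)                           // OR the chunk predicates together
  df.filter(predicate)
}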

Why doesn’t Spark protect itself from such issues? I’d much rather get a Spark exception that can be caught and handled than a StackOverflowError, which causes the JVM to exit.
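
For what it’s worth, two mitigations that seem to help here: raising the driver’s thread stack size (e.g. spark-submit --driver-java-options "-Xss4m"; the 4m is just an example value), and building large unions as a balanced tree rather than a left-leaning chain, so the plan depth grows with log2(n) instead of n. A rough sketch of the latter (unionAll, since this is 1.6):

import org.apache.spark.sql.DataFrame

// Sketch: combine many DataFrames with a balanced tree of unionAll calls
// so the logical plan is O(log n) deep instead of O(n).
def balancedUnion(dfs: Seq[DataFrame]): DataFrame = {
  require(dfs.nonEmpty, "need at least one DataFrame")
  if (dfs.length == 1) dfs.head
  else {
    val (left, right) = dfs.splitAt(dfs.length / 2)
    balancedUnion(left).unionAll(balancedUnion(right))
  }
}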

I can provide the full stack upon request.

Thanks,
Jeff


2016-05-20 05:32:23,044 - [ERROR] - from akka.actor.ActorSystemImpl in play-akka.actor.default-dispatcher-8
Uncaught error from thread [play-akka.actor.default-dispatcher-84] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled
java.lang.StackOverflowError: null
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
<clipped…>
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.plans.logical.SetOperation.output(basicOperators.scala:96) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.optimizer.SetOperationPushDown$.org$apache$spark$sql$catalyst$optimizer$SetOperationPushDown$$buildRewrites(Optimizer.scala:110) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.optimizer.SetOperationPushDown$$anonfun$apply$2.applyOrElse(Optimizer.scala:149) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.optimizer.SetOperationPushDown$$anonfun$apply$2.applyOrElse(Optimizer.scala:145) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:243) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:243) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:53) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:242) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:248) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:248) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:265) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:370) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.Iterator$class.foreach(Iterator.scala:742) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1194) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:308) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.AbstractIterator.to(Iterator.scala:1194) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:300) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1194) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:287) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1194) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:305) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:248) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$2.apply(TreeNode.scala:250) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$2.apply(TreeNode.scala:250) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:265) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:370) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.Iterator$class.foreach(Iterator.scala:742) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1194) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:308) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.AbstractIterator.to(Iterator.scala:1194) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:300) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1194) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:287) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1194) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:305) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:250) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:248) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:248) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:265) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:370) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
        at scala.collection.Iterator$class.foreach(Iterator.scala:742) ~[spark-assembly-1.6.0-hadoop2.2.0.jar:1.6.0]
<and much more…>


