Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2016/10/07 22:28:20 UTC

[jira] [Updated] (SPARK-7603) Crash of thrift server when doing SQL without "limit"

     [ https://issues.apache.org/jira/browse/SPARK-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-7603:
---------------------------
    Component/s:     (was: SQL)
                 Web UI

> Crash of thrift server when doing SQL without "limit"
> -----------------------------------------------------
>
>                 Key: SPARK-7603
>                 URL: https://issues.apache.org/jira/browse/SPARK-7603
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.3.1
>         Environment: Hortonworks Sandbox 2.1  with Spark 1.3.1
>            Reporter: Ihor Bobak
>
> I have two tables in Hive: one with 120 thousand records, the other about five times smaller.
> I'm running a standalone cluster on a single VM, and I start the Thrift server with:
> ./start-thriftserver.sh --conf spark.executor.memory=2048m --conf spark.driver.memory=1024m
> My spark-defaults.conf contains:
> spark.master                     spark://sandbox.hortonworks.com:7077
> spark.eventLog.enabled           true
> spark.eventLog.dir               hdfs://sandbox.hortonworks.com:8020/user/pdi/spark/logs
> So, when I run this SQL:
> select <some fields from header>, <some fields from details>
> from
> 	vw_salesorderdetail as d
> 	left join vw_salesorderheader as h on h.SalesOrderID = d.SalesOrderID
> limit 2000000000;
> everything is fine, even though the limit is absurdly large (again: the result set returned is just 120,000 records).
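> For the record, I submit the statement over JDBC; a minimal client-side sketch, assuming the default thrift port 10000 and my sandbox user name:
>
> beeline -u jdbc:hive2://sandbox.hortonworks.com:10000 -n pdi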
> But if I run the same query without the limit clause, execution hangs - see here: http://postimg.org/image/fujdjd16f/42945a78/
> and the Thrift server log fills with exceptions - here they are:
> 15/05/13 17:59:27 INFO TaskSetManager: Starting task 158.0 in stage 48.0 (TID 953, sandbox.hortonworks.com, PROCESS_LOCAL, 1473 bytes)
> 15/05/13 18:00:01 INFO TaskSetManager: Finished task 150.0 in stage 48.0 (TID 945) in 36166 ms on sandbox.hortonworks.com (152/200)
> 15/05/13 18:00:02 ERROR Utils: Uncaught exception in thread Spark Context Cleaner
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:147)
> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
> 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
> 	at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:143)
> 	at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
> Exception in thread "Spark Context Cleaner" 15/05/13 18:00:02 ERROR Utils: Uncaught exception in thread task-result-getter-1
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at java.lang.String.<init>(String.java:315)
> 	at com.esotericsoftware.kryo.io.Input.readAscii(Input.java:562)
> 	at com.esotericsoftware.kryo.io.Input.readString(Input.java:436)
> 	at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:157)
> 	at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:146)
> 	at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:706)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
> 	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
> 	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
> 	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
> 	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
> 	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
> 	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
> 	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
> 	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
> 	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
> 	at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:173)
> 	at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:79)
> 	at org.apache.spark.scheduler.TaskSetManager.handleSuccessfulTask(TaskSetManager.scala:621)
> 	at org.apache.spark.scheduler.TaskSchedulerImpl.handleSuccessfulTask(TaskSchedulerImpl.scala:379)
> 	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:82)
> 	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
> 	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
> 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
> 	at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:50)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> Exception in thread "task-result-getter-1" 15/05/13 18:00:04 INFO TaskSetManager: Starting task 159.0 in stage 48.0 (TID 954, sandbox.hortonworks.com, PROCESS_LOCAL, 1473 bytes)
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:147)
> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:144)
> 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
> 	at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:143)
> 	at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at java.lang.String.<init>(String.java:315)
> 	at com.esotericsoftware.kryo.io.Input.readAscii(Input.java:562)
> 	at com.esotericsoftware.kryo.io.Input.readString(Input.java:436)
> 	at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:157)
> 	at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:146)
> 	at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:706)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
> 	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
> 	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
> 	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
> 	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
> 	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:651)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
> 	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
> 	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:338)
> 	at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read(DefaultArraySerializers.java:293)
> 	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
> 	at org.apache.spark.serializer.KryoSerializerInstance.deserialize(KryoSerializer.scala:173)
> 	at org.apache.spark.scheduler.DirectTaskResult.value(TaskResult.scala:79)
> 	at org.apache.spark.scheduler.TaskSetManager.handleSuccessfulTask(TaskSetManager.scala:621)
> 	at org.apache.spark.scheduler.TaskSchedulerImpl.handleSuccessfulTask(TaskSchedulerImpl.scala:379)
> 	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:82)
> 	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
> 	at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:51)
> 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1618)
> 	at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:50)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 15/05/13 18:00:05 INFO TaskSetManager: Finished task 154.0 in stage 48.0 (TID 949) in 40665 ms on sandbox.hortonworks.com (153/200)
> 15/05/13 18:00:20 ERROR Utils: Uncaught exception in thread task-result-getter-3
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> Exception in thread "task-result-getter-3" java.lang.OutOfMemoryError: GC overhead limit exceeded
> 15/05/13 18:00:28 ERROR Utils: Uncaught exception in thread task-result-getter-2
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> Exception in thread "task-result-getter-2" java.lang.OutOfMemoryError: GC overhead limit exceeded
> 15/05/13 18:00:29 INFO TaskSetManager: Starting task 160.0 in stage 48.0 (TID 955, sandbox.hortonworks.com, PROCESS_LOCAL, 1473 bytes)
> 15/05/13 18:00:31 ERROR ActorSystemImpl: exception on LARS’ timer thread
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
> 	at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
> 	at java.lang.Thread.run(Thread.java:744)
> 15/05/13 18:00:31 INFO ActorSystemImpl: starting new LARS thread
> 15/05/13 18:00:31 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-6] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at java.lang.Class.getDeclaredMethods0(Native Method)
> 	at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
> 	at java.lang.Class.getDeclaredMethod(Class.java:2002)
> 	at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1431)
> 	at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
> 	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:494)
> 	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
> 	at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
> 	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
> 	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
> 	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> 	at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
> 	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
> 	at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136)
> 	at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
> 15/05/13 18:00:31 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-scheduler-1] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:409)
> 	at akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)
> 	at java.lang.Thread.run(Thread.java:744)
> 15/05/13 18:00:31 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-5] shutting down ActorSystem [sparkDriver]
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 	at java.lang.Class.getDeclaredMethods0(Native Method)
> 	at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
> 	at java.lang.Class.getDeclaredMethod(Class.java:2002)
> 	at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1431)
> 	at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
> 	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:494)
> 	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:468)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:468)
> 	at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:365)
> 	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:602)
> 	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
> 	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> 	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> 	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> 	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> 	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> 	at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
> 	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
> Feel free to contact me - I will send you the full logs.
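> P.S. The failure pattern above - driver-side "GC overhead limit exceeded" while task results are being fetched - looks like the driver collecting the entire result set for the client. A hedged sketch of settings to try, untested on my side (spark.driver.maxResultSize has existed since Spark 1.2; the incremental-collect flag is only in later releases, so treat it as an assumption for 1.3.1):
>
> # spark-defaults.conf sketch
> # cap collected task results well below the driver heap; the default cap
> # is 1g, i.e. as large as the whole 1024m heap used above, so the JVM
> # dies in GC before the cap ever triggers
> spark.driver.maxResultSize                  256m
> # and/or give the thrift server's driver more heap
> spark.driver.memory                         4g
> # later Spark versions only (assumption for 1.3.1): stream partitions
> # to the client instead of collect()-ing everything at once
> spark.sql.thriftServer.incrementalCollect   true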


