Posted to user@spark.apache.org by bi...@gmail.com on 2016/05/12 11:27:27 UTC

Spark SQL: Managed memory leak detected

Hi All:

I am hitting a "Managed memory leak detected" error while executing SQL via `HiveContext.sql` in a Spark job.

I found a similar JIRA (https://issues.apache.org/jira/browse/SPARK-11293), and I suspect this problem has the same root cause:
`UnsafeInMemorySorter` cannot release the memory it has acquired.
But I am not sure.

Spark configuration: 
spark.shuffle.memoryFraction 0.7
spark.storage.memoryFraction 0.1 (no caching is used)
spark.executor.memory 13G
spark.executor.instances 40
spark.executor.cores 1
spark.master yarn-cluster
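
For reference, the configuration above corresponds to a spark-submit invocation roughly like the following. The main class and JAR names are placeholders, not the actual job:

```shell
# Equivalent submit command for the configuration listed above.
# com.example.MyHiveJob and my-hive-job.jar are hypothetical names.
spark-submit \
  --master yarn-cluster \
  --class com.example.MyHiveJob \
  --num-executors 40 \
  --executor-cores 1 \
  --executor-memory 13G \
  --conf spark.shuffle.memoryFraction=0.7 \
  --conf spark.storage.memoryFraction=0.1 \
  my-hive-job.jar
```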

Each task processes only a small amount of data, about 1.2 MB.



Has anyone else run into this problem? How can it be solved? 

Exception stack trace:

===========
16/05/12 16:32:53 INFO executor.Executor: Running task 232.3 in stage 4.0 (TID 1631)
16/05/12 16:32:53 INFO storage.ShuffleBlockFetcherIterator: Getting 14 non-empty blocks out of 14 blocks
16/05/12 16:32:53 INFO storage.ShuffleBlockFetcherIterator: Started 9 remote fetches in 2 ms
16/05/12 16:32:53 INFO storage.ShuffleBlockFetcherIterator: Getting 14 non-empty blocks out of 14 blocks
16/05/12 16:32:53 INFO storage.ShuffleBlockFetcherIterator: Started 7 remote fetches in 2 ms
16/05/12 16:32:59 INFO executor.Executor: Finished task 377.0 in stage 4.0 (TID 1506). 2341 bytes result sent to driver
16/05/12 16:33:16 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1634
16/05/12 16:33:16 INFO executor.Executor: Running task 20.3 in stage 4.0 (TID 1634)
16/05/12 16:33:16 INFO shuffle.ShuffleMemoryManager: TID 1634 waiting for at least 1/2N of shuffle memory pool to be free
16/05/12 16:33:17 INFO shuffle.ShuffleMemoryManager: TID 1634 waiting for at least 1/2N of shuffle memory pool to be free
16/05/12 16:33:17 INFO aggregate.TungstenAggregationIterator: falling back to sort based aggregation.
16/05/12 16:33:18 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
16/05/12 16:33:18 INFO octo.DropCounter: shutdown executor with queue size 0
16/05/12 16:33:18 INFO octo.AsyncOctoCollector: shutdown executor with queue size 0
16/05/12 16:33:18 ERROR executor.Executor: Managed memory leak detected; size = 1409286144 bytes, TID = 1631
16/05/12 16:33:18 INFO storage.ShuffleBlockFetcherIterator: Getting 14 non-empty blocks out of 14 blocks
16/05/12 16:33:18 ERROR executor.Executor: Exception in task 232.3 in stage 4.0 (TID 1631)
java.lang.OutOfMemoryError: Java heap space
	at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.<init>(UnsafeInMemorySorter.java:86)
	at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:89)
	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.destructAndCreateExternalSorter(UnsafeFixedWidthAggregationMap.java:257)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.switchToSortBasedAggregation(TungstenAggregationIterator.scala:435)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:379)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.start(TungstenAggregationIterator.scala:622)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.org$apache$spark$sql$execution$aggregate$TungstenAggregate$$anonfun$$executePartition$1(TungstenAggregate.scala:110)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119)
	at org.apache.spark.rdd.MapPartitionsWithPreparationRDD.compute(MapPartitionsWithPreparationRDD.scala:64)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:88)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

---------------
16/05/12 18:55:27 WARN scheduler.TaskSetManager: Lost task 232.2 in stage 4.0 (TID 1631, ...): java.lang.OutOfMemoryError: Java heap space
	at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:686)
	at org.apache.spark.unsafe.map.BytesToBytesMap.growAndRehash(BytesToBytesMap.java:803)
	at org.apache.spark.unsafe.map.BytesToBytesMap$Location.putNewKey(BytesToBytesMap.java:651)
	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.getAggregationBufferFromUnsafeRow(UnsafeFixedWidthAggregationMap.java:138)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:375)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.start(TungstenAggregationIterator.scala:622)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.org$apache$spark$sql$execution$aggregate$TungstenAggregate$$anonfun$$executePartition$1(TungstenAggregate.scala:110)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:119)
	at org.apache.spark.rdd.MapPartitionsWithPreparationRDD.compute(MapPartitionsWithPreparationRDD.scala:64)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:88)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

===============

Looking forward to your reply.

Thank you.