Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/12/05 03:12:10 UTC
[jira] [Commented] (SPARK-12155) Execution OOM after a relative large dataset cached in the cluster.

    [ https://issues.apache.org/jira/browse/SPARK-12155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15042582#comment-15042582 ] 

Apache Spark commented on SPARK-12155:
--------------------------------------

User 'yhuai' has created a pull request for this issue:
https://github.com/apache/spark/pull/10153

> Execution OOM after a relative large dataset cached in the cluster.
> -------------------------------------------------------------------
>
>                 Key: SPARK-12155
>                 URL: https://issues.apache.org/jira/browse/SPARK-12155
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>            Reporter: Yin Huai
>            Priority: Blocker
>
> I have a cluster with roughly 80 GB of memory. I cached a 43 GB DataFrame. When I started to run the query, I got the following exception (I added extra logging to the code).
> {code}
> 15/12/05 00:33:43 INFO UnifiedMemoryManager: Creating UnifedMemoryManager for 4 cores with 16929521664 maxMemory, 8464760832 storageRegionSize.
> 15/12/05 01:20:50 INFO MemoryStore: Ensuring 1048576 bytes of free space for block rdd_94_37(free: 3253659951, max: 16798973952)
> 15/12/05 01:20:50 INFO MemoryStore: Ensuring 5142008 bytes of free space for block rdd_94_37(free: 3252611375, max: 16798973952)
> 15/12/05 01:20:50 INFO Executor: Finished task 36.0 in stage 4.0 (TID 109). 3028 bytes result sent to driver
> 15/12/05 01:20:50 INFO MemoryStore: Ensuring 98948238 bytes of free space for block rdd_94_37(free: 3314840375, max: 16866344960)
> 15/12/05 01:20:50 INFO MemoryStore: Ensuring 98675713 bytes of free space for block rdd_94_37(free: 3215892137, max: 16866344960)
> 15/12/05 01:20:50 INFO MemoryStore: Ensuring 197347565 bytes of free space for block rdd_94_37(free: 3117216424, max: 16866344960)
> 15/12/05 01:20:50 INFO MemoryStore: Ensuring 295995553 bytes of free space for block rdd_94_37(free: 2919868859, max: 16866344960)
> 15/12/05 01:20:51 INFO MemoryStore: Ensuring 394728479 bytes of free space for block rdd_94_37(free: 2687050010, max: 16929521664)
> 15/12/05 01:20:51 INFO Executor: Finished task 32.0 in stage 4.0 (TID 106). 3028 bytes result sent to driver
> 15/12/05 01:20:51 INFO MemoryStore: Ensuring 591258816 bytes of free space for block rdd_94_37(free: 2292321531, max: 16929521664)
> 15/12/05 01:20:51 INFO MemoryStore: Ensuring 901645182 bytes of free space for block rdd_94_37(free: 1701062715, max: 16929521664)
> 15/12/05 01:20:52 INFO MemoryStore: Ensuring 1302179076 bytes of free space for block rdd_94_37(free: 799417533, max: 16929521664)
> 15/12/05 01:20:52 INFO MemoryStore: Will not store rdd_94_37 as it would require dropping another block from the same RDD
> 15/12/05 01:20:52 WARN MemoryStore: Not enough space to cache rdd_94_37 in memory! (computed 2.4 GB so far)
> 15/12/05 01:20:52 INFO MemoryStore: Memory use = 12.6 GB (blocks) + 2.4 GB (scratch space shared across 13 tasks(s)) = 15.0 GB. Storage limit = 15.8 GB.
> 15/12/05 01:20:52 INFO BlockManager: Found block rdd_94_37 locally
> 15/12/05 01:20:52 INFO UnifiedMemoryManager: Try to acquire 262144 bytes memory. But, on-heap execution memory poll only has 0 bytes free memory.
> 15/12/05 01:20:52 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8464760832, storageMemoryPool.poolSize 16929521664, storageRegionSize 8464760832.
> 15/12/05 01:20:52 INFO UnifiedMemoryManager: Try to reclaim memory space from storage memory pool.
> 15/12/05 01:20:52 INFO StorageMemoryPool: Claiming 262144 bytes free memory space from StorageMemoryPool.
> 15/12/05 01:20:52 INFO UnifiedMemoryManager: Reclaimed 262144 bytes of memory from storage memory pool.Adding them back to onHeapExecutionMemoryPool.
> 15/12/05 01:20:52 INFO UnifiedMemoryManager: Try to acquire 67108864 bytes memory. But, on-heap execution memory poll only has 0 bytes free memory.
> 15/12/05 01:20:52 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8464498688, storageMemoryPool.poolSize 16929259520, storageRegionSize 8464760832.
> 15/12/05 01:20:52 INFO UnifiedMemoryManager: Try to reclaim memory space from storage memory pool.
> 15/12/05 01:20:52 INFO StorageMemoryPool: Claiming 67108864 bytes free memory space from StorageMemoryPool.
> 15/12/05 01:20:52 INFO UnifiedMemoryManager: Reclaimed 67108864 bytes of memory from storage memory pool.Adding them back to onHeapExecutionMemoryPool.
> 15/12/05 01:20:54 INFO Executor: Finished task 37.0 in stage 4.0 (TID 110). 3077 bytes result sent to driver
> 15/12/05 01:20:56 INFO CoarseGrainedExecutorBackend: Got assigned task 120
> 15/12/05 01:20:56 INFO Executor: Running task 1.0 in stage 5.0 (TID 120)
> 15/12/05 01:20:56 INFO CoarseGrainedExecutorBackend: Got assigned task 124
> 15/12/05 01:20:56 INFO CoarseGrainedExecutorBackend: Got assigned task 128
> 15/12/05 01:20:56 INFO CoarseGrainedExecutorBackend: Got assigned task 132
> 15/12/05 01:20:56 INFO Executor: Running task 9.0 in stage 5.0 (TID 128)
> 15/12/05 01:20:56 INFO Executor: Running task 13.0 in stage 5.0 (TID 132)
> 15/12/05 01:20:56 INFO Executor: Running task 5.0 in stage 5.0 (TID 124)
> 15/12/05 01:20:56 INFO MapOutputTrackerWorker: Updating epoch to 2 and clearing cache
> 15/12/05 01:20:56 INFO TorrentBroadcast: Started reading broadcast variable 6
> 15/12/05 01:20:56 INFO MemoryStore: Ensuring 9471 bytes of free space for block broadcast_6_piece0(free: 3384207663, max: 16929521664)
> 15/12/05 01:20:56 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 9.2 KB, free 12.6 GB)
> 15/12/05 01:20:56 INFO TorrentBroadcast: Reading broadcast variable 6 took 5 ms
> 15/12/05 01:20:56 INFO MemoryStore: Ensuring 1048576 bytes of free space for block broadcast_6(free: 3384198192, max: 16929521664)
> 15/12/05 01:20:56 INFO MemoryStore: Ensuring 22032 bytes of free space for block broadcast_6(free: 3384198192, max: 16929521664)
> 15/12/05 01:20:56 INFO MemoryStore: Block broadcast_6 stored as values in memory (estimated size 21.5 KB, free 12.6 GB)
> 15/12/05 01:20:56 INFO MapOutputTrackerWorker: Don't have map outputs for shuffle 1, fetching them
> 15/12/05 01:20:56 INFO MapOutputTrackerWorker: Don't have map outputs for shuffle 1, fetching them
> 15/12/05 01:20:56 INFO MapOutputTrackerWorker: Don't have map outputs for shuffle 1, fetching them
> 15/12/05 01:20:56 INFO MapOutputTrackerWorker: Don't have map outputs for shuffle 1, fetching them
> 15/12/05 01:20:56 INFO MapOutputTrackerWorker: Doing the fetch; tracker endpoint = NettyRpcEndpointRef(spark://MapOutputTracker@10.0.202.130:56969)
> 15/12/05 01:20:56 INFO MapOutputTrackerWorker: Got the output locations
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Getting 43 non-empty blocks out of 43 blocks
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Getting 43 non-empty blocks out of 43 blocks
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Getting 43 non-empty blocks out of 43 blocks
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Getting 43 non-empty blocks out of 43 blocks
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Started 3 remote fetches in 41 ms
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Started 3 remote fetches in 41 ms
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Started 3 remote fetches in 40 ms
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Started 3 remote fetches in 41 ms
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Try to acquire 67108864 bytes memory. But, on-heap execution memory poll only has 66846720 bytes free memory.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8397389824, storageMemoryPool.poolSize 16862150656, storageRegionSize 8464760832.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Try to reclaim memory space from storage memory pool.
> 15/12/05 01:20:56 INFO StorageMemoryPool: Claiming 262144 bytes free memory space from StorageMemoryPool.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Reclaimed 262144 bytes of memory from storage memory pool.Adding them back to onHeapExecutionMemoryPool.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Try to acquire 67108864 bytes memory. But, on-heap execution memory poll only has 33554432 bytes free memory.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8397127680, storageMemoryPool.poolSize 16861888512, storageRegionSize 8464760832.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Try to reclaim memory space from storage memory pool.
> 15/12/05 01:20:56 INFO StorageMemoryPool: Claiming 33554432 bytes free memory space from StorageMemoryPool.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Reclaimed 33554432 bytes of memory from storage memory pool.Adding them back to onHeapExecutionMemoryPool.
> 15/12/05 01:20:56 INFO GenerateMutableProjection: Code generated in 9.602791 ms
> 15/12/05 01:20:56 INFO GenerateMutableProjection: Code generated in 12.7135 ms
> 15/12/05 01:20:56 INFO Executor: Finished task 13.0 in stage 5.0 (TID 132). 2271 bytes result sent to driver
> 15/12/05 01:20:56 INFO Executor: Finished task 9.0 in stage 5.0 (TID 128). 2320 bytes result sent to driver
> 15/12/05 01:20:56 INFO CoarseGrainedExecutorBackend: Got assigned task 136
> 15/12/05 01:20:56 INFO CoarseGrainedExecutorBackend: Got assigned task 137
> 15/12/05 01:20:56 INFO Executor: Running task 17.0 in stage 5.0 (TID 136)
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Getting 43 non-empty blocks out of 43 blocks
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Started 3 remote fetches in 1 ms
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Try to acquire 67108864 bytes memory. But, on-heap execution memory poll only has 16515072 bytes free memory.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8363573248, storageMemoryPool.poolSize 16828334080, storageRegionSize 8464760832.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Try to reclaim memory space from storage memory pool.
> 15/12/05 01:20:56 INFO StorageMemoryPool: Claiming 50593792 bytes free memory space from StorageMemoryPool.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Reclaimed 50593792 bytes of memory from storage memory pool.Adding them back to onHeapExecutionMemoryPool.
> 15/12/05 01:20:56 INFO Executor: Running task 18.0 in stage 5.0 (TID 137)
> 15/12/05 01:20:56 INFO GenerateUnsafeProjection: Code generated in 30.25836 ms
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Getting 43 non-empty blocks out of 43 blocks
> 15/12/05 01:20:56 INFO ShuffleBlockFetcherIterator: Started 3 remote fetches in 2 ms
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Try to acquire 67108864 bytes memory. But, on-heap execution memory poll only has 16515072 bytes free memory.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8312979456, storageMemoryPool.poolSize 16777740288, storageRegionSize 8464760832.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Try to reclaim memory space from storage memory pool.
> 15/12/05 01:20:56 INFO StorageMemoryPool: Claiming 50593792 bytes free memory space from StorageMemoryPool.
> 15/12/05 01:20:56 INFO UnifiedMemoryManager: Reclaimed 50593792 bytes of memory from storage memory pool.Adding them back to onHeapExecutionMemoryPool.
> 15/12/05 01:20:56 INFO GenerateUnsafeRowJoiner: Code generated in 19.615021 ms
> 15/12/05 01:20:57 INFO GenerateUnsafeProjection: Code generated in 23.149594 ms
> 15/12/05 01:20:57 INFO TaskMemoryManager: Memory used in task 136
> 15/12/05 01:20:57 INFO TaskMemoryManager: Acquired by org.apache.spark.unsafe.map.BytesToBytesMap@5ac6b585: 48.3 MB
> 15/12/05 01:20:57 INFO TaskMemoryManager: 0 bytes of memory were used by task 136 but are not associated with specific consumers
> 15/12/05 01:20:57 INFO TaskMemoryManager: 185597952 bytes of memory are used for execution and 13545345504 bytes of memory are used for storage
> 15/12/05 01:20:57 INFO TaskMemoryManager: Memory used in task 124
> 15/12/05 01:20:57 INFO TaskMemoryManager: Acquired by org.apache.spark.unsafe.map.BytesToBytesMap@30015a6a: 48.3 MB
> 15/12/05 01:20:57 INFO TaskMemoryManager: 0 bytes of memory were used by task 124 but are not associated with specific consumers
> 15/12/05 01:20:57 INFO TaskMemoryManager: 185597952 bytes of memory are used for execution and 13545345504 bytes of memory are used for storage
> 15/12/05 01:20:57 INFO UnifiedMemoryManager: Try to acquire 67108864 bytes memory. But, on-heap execution memory poll only has 16515072 bytes free memory.
> 15/12/05 01:20:57 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8262385664, storageMemoryPool.poolSize 16727146496, storageRegionSize 8464760832.
> 15/12/05 01:20:57 INFO UnifiedMemoryManager: Try to reclaim memory space from storage memory pool.
> 15/12/05 01:20:57 INFO StorageMemoryPool: Claiming 50593792 bytes free memory space from StorageMemoryPool.
> 15/12/05 01:20:57 INFO UnifiedMemoryManager: Reclaimed 50593792 bytes of memory from storage memory pool.Adding them back to onHeapExecutionMemoryPool.
> 15/12/05 01:20:57 INFO TaskMemoryManager: Memory used in task 137
> 15/12/05 01:20:57 INFO TaskMemoryManager: Acquired by org.apache.spark.unsafe.map.BytesToBytesMap@a9691e0: 48.3 MB
> 15/12/05 01:20:57 WARN TaskMemoryManager: leak 48.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@5ac6b585
> 15/12/05 01:20:57 INFO TaskMemoryManager: 0 bytes of memory were used by task 137 but are not associated with specific consumers
> 15/12/05 01:20:57 INFO TaskMemoryManager: 215023616 bytes of memory are used for execution and 13545345504 bytes of memory are used for storage
> 15/12/05 01:20:57 WARN TaskMemoryManager: leak 48.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@a9691e0
> 15/12/05 01:20:57 ERROR Executor: Managed memory leak detected; size = 50593792 bytes, TID = 136
> 15/12/05 01:20:57 ERROR Executor: Managed memory leak detected; size = 50593792 bytes, TID = 137
> 15/12/05 01:20:57 WARN TaskMemoryManager: leak 48.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@30015a6a
> 15/12/05 01:20:57 ERROR Executor: Managed memory leak detected; size = 50593792 bytes, TID = 124
> 15/12/05 01:20:57 ERROR Executor: Exception in task 18.0 in stage 5.0 (TID 137)
> java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 0
> 	at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
> 	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 15/12/05 01:20:57 ERROR Executor: Exception in task 17.0 in stage 5.0 (TID 136)
> java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 0
> 	at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
> 	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 15/12/05 01:20:57 ERROR Executor: Exception in task 5.0 in stage 5.0 (TID 124)
> java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 0
> 	at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
> 	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 15/12/05 01:20:57 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-4,5,main]
> java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 0
> 	at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
> 	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 15/12/05 01:20:57 INFO DiskBlockManager: Shutdown hook called
> 15/12/05 01:20:57 INFO GenerateMutableProjection: Code generated in 21.666344 ms
> 15/12/05 01:20:57 DEBUG KeepAliveThread: KeepAliveThread received command: Shutdown
> 15/12/05 01:20:57 ERROR SparkUncaughtExceptionHandler: [Container in shutdown] Uncaught exception in thread Thread[Executor task launch worker-6,5,main]
> java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 0
> 	at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
> 	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 15/12/05 01:20:57 ERROR SparkUncaughtExceptionHandler: [Container in shutdown] Uncaught exception in thread Thread[Executor task launch worker-7,5,main]
> java.lang.OutOfMemoryError: Unable to acquire 262144 bytes of memory, got 0
> 	at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:91)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:735)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:197)
> 	at org.apache.spark.unsafe.map.BytesToBytesMap.<init>(BytesToBytesMap.java:212)
> 	at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.<init>(UnsafeFixedWidthAggregationMap.java:103)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:483)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95)
> 	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 15/12/05 01:20:57 INFO KeepAliveThread: KeepAlive thread has been shutdown successfully
> 15/12/05 01:20:57 WARN TaskMemoryManager: leak 28.1 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap@6feafdad
> 15/12/05 01:20:57 ERROR Executor: Managed memory leak detected; size = 29425664 bytes, TID = 120
> 15/12/05 01:20:57 ERROR Executor: Exception in task 1.0 in stage 5.0 (TID 120)
> java.io.FileNotFoundException: /local_disk/spark-1ebb23ad-e3a1-4af2-b3d0-58a70ceed7ec/executor-ca2c389d-8b67-487f-b175-b867282bf0a3/blockmgr-deda3833-d86c-4850-aa4f-64c26ebfbc4f/08/temp_shuffle_8b5df98d-701c-4ef3-98cc-9e4731fe4a68 (No such file or directory)
> 	at java.io.FileOutputStream.open0(Native Method)
> 	at java.io.FileOutputStream.open(FileOutputStream.java:270)
> 	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
> 	at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:88)
> 	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:140)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:89)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 15/12/05 01:20:57 INFO ShutdownHookManager: Shutdown hook called
> {code}
> The query plan looked like this:
> {code}
> TungstenAggregate4
> +- TungstenExchange2
>    +- TungstenAggregate3
>       +- TungstenAggregate2
>          +- TungstenExchange1
>             +- TungstenAggregate1
>                +- Project 
>                   +- InMemoryColumnarTableScan
> {code}
> The OOM happened in the stage containing TungstenAggregate2 and TungstenAggregate3.
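
For anyone reading the log above, here is a small Python sketch that cross-checks two of its figures. It is not part of Spark. The regex and the helper names (`reclaimable_records`, `unaccounted_bytes`) are made up for illustration, and the two sample lines are copied from the log. It only shows what the logged numbers are consistent with: each `memoryReclaimableFromStorage` value equals `storageMemoryPool.poolSize` minus `storageRegionSize`, and the execution and storage usage in the TaskMemoryManager dump add up to about 3.2 GB less than `maxMemory`, even though the failing tasks got 0 bytes.

```python
import re

# Two UnifiedMemoryManager lines copied verbatim from the log in this report.
LOG = """\
15/12/05 01:20:52 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8464760832, storageMemoryPool.poolSize 16929521664, storageRegionSize 8464760832.
15/12/05 01:20:57 INFO UnifiedMemoryManager: memoryReclaimableFromStorage 8262385664, storageMemoryPool.poolSize 16727146496, storageRegionSize 8464760832.
"""

# Hypothetical pattern matching the extra log lines the reporter added.
PATTERN = re.compile(
    r"memoryReclaimableFromStorage (\d+), "
    r"storageMemoryPool\.poolSize (\d+), "
    r"storageRegionSize (\d+)"
)


def reclaimable_records(log_text):
    """Return (reclaimable, pool_size, region_size) tuples found in the log."""
    return [tuple(int(g) for g in m.groups()) for m in PATTERN.finditer(log_text)]


def unaccounted_bytes(max_memory, execution_used, storage_used):
    """Bytes of max_memory that neither execution nor storage reports as used."""
    return max_memory - execution_used - storage_used


records = reclaimable_records(LOG)
for reclaimable, pool_size, region_size in records:
    # Compare the logged reclaimable figure with poolSize - storageRegionSize.
    print(reclaimable, reclaimable == pool_size - region_size)

# maxMemory from the startup line; usage from the TaskMemoryManager dump.
print(unaccounted_bytes(16929521664, 185597952, 13545345504))
```

The same check can be run over a full executor log by passing its text to `reclaimable_records`.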


