Posted to user@spark.apache.org by unk1102 <um...@gmail.com> on 2015/08/18 17:57:25 UTC

Spark executor lost because of GC overhead limit exceeded even though using 20 executors using 25GB each

Hi, this GC overhead limit error is driving me crazy. I have 20 executors
with 25 GB each; I don't understand how it can throw a GC overhead error,
and my datasets aren't that big either. Once this GC error occurs in one
executor, that executor is lost, and then other executors are slowly lost
too (IOException, RPC client disassociated, shuffle not found, etc.).
Please help me solve this; I am going mad as I am new to Spark. Thanks in
advance.

WARN scheduler.TaskSetManager: Lost task 7.0 in stage 363.0 (TID 3373,
myhost.com): java.lang.OutOfMemoryError: GC overhead limit exceeded
            at
org.apache.spark.sql.types.UTF8String.toString(UTF8String.scala:150)
            at
org.apache.spark.sql.catalyst.expressions.GenericRow.getString(rows.scala:120)
            at
org.apache.spark.sql.columnar.STRING$.actualSize(ColumnType.scala:312)
            at
org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.gatherCompressibilityStats(compressionSchemes.scala:224)
            at
org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.gatherCompressibilityStats(CompressibleColumnBuilder.scala:72)
            at
org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.appendFrom(CompressibleColumnBuilder.scala:80)
            at
org.apache.spark.sql.columnar.NativeColumnBuilder.appendFrom(ColumnBuilder.scala:87)
            at
org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:148)
            at
org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:124)
            at
org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:277)
            at
org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
            at
org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
            at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
            at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
            at
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
            at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
            at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
            at
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
            at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
            at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
            at
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
            at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
            at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
            at
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
            at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
            at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
            at
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
            at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
            at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
            at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
            at org.apache.spark.scheduler.Task.run(Task.scala:70)
            at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
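The trace shows the OutOfMemoryError while Spark SQL builds its in-memory columnar cache (InMemoryColumnarTableScan, via CacheManager.putInBlockManager) inside a shuffle stage, so the cache itself is a likely source of the memory pressure. A tuning sketch along those lines (the flag values below are illustrative for Spark 1.4-era configuration keys, not taken from this thread):

```shell
# Illustrative spark-submit tuning for OOMs hit while caching columnar data.
# Values are examples only; adjust for your cluster.
spark-submit \
  --master yarn-client \
  --num-executors 20 \
  --executor-memory 25g \
  --conf spark.storage.memoryFraction=0.3 \
  --conf spark.sql.inMemoryColumnarStorage.compressed=true \
  --conf spark.sql.inMemoryColumnarStorage.batchSize=1000 \
  your-app.jar
```

Lowering spark.storage.memoryFraction (default 0.6 in Spark 1.4) shrinks the cache region so cached columnar batches evict instead of starving the rest of the heap, and a smaller columnar batch size reduces the per-batch allocation spike visible in the trace.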



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-executor-lost-because-of-GC-overhead-limit-exceeded-even-though-using-20-executors-using-25GB-h-tp24322.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Spark executor lost because of GC overhead limit exceeded even though using 20 executors using 25GB each

Posted by Umesh Kacha <um...@gmail.com>.
Hi Ted, thanks for the response. I am using Spark 1.4.1 with Java 1.7. I
can't share the complete code because of company policy; I have attached a
mobile snapshot of the modified code instead, sorry about that. Basically,
in the driver program I get Hive partitions using hiveContext.sql("show
partitions bla bla"), and then for each partition column I create one
ExecutorService worker. I have around 2000 partition columns to process,
and for each I process the data and insert it into a Hive table using
hiveContext.sql("insert into ..."). I then submit this job via
spark-submit in yarn-client mode.
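The driver pattern described above can be sketched as follows (an illustrative reconstruction, not the poster's actual code: the class and method names are hypothetical, and the hiveContext.sql call is replaced by a placeholder). With one worker per partition column, ~2000 Spark jobs can run concurrently; a bounded pool caps how many cached/shuffled datasets are live at once, which directly limits executor-side memory pressure.

```java
import java.util.*;
import java.util.concurrent.*;

public class BoundedPartitionDriver {
    // Process each Hive partition on a bounded pool instead of spawning
    // one worker per partition: ~2000 concurrent "insert into" jobs
    // multiply the cached and shuffled data held in executor memory.
    static int processAll(List<String> partitions, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<Future<Integer>> results = new ArrayList<>();
        for (String p : partitions) {
            results.add(pool.submit(() -> {
                // placeholder for:
                // hiveContext.sql("insert into table ... partition (" + p + ") ...")
                return 1;
            }));
        }
        int done = 0;
        for (Future<Integer> f : results) done += f.get(); // block until each finishes
        pool.shutdown();
        return done;
    }

    public static void main(String[] args) throws Exception {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < 2000; i++) parts.add("dt=" + i);
        System.out.println(processAll(parts, 8)); // prints 2000
    }
}
```

With a fixed pool of, say, 8 threads, at most 8 partitions are inserted concurrently instead of all 2000, so only a handful of jobs compete for executor heap at any moment.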




On Tue, Aug 18, 2015 at 10:53 PM, Ted Yu <yu...@gmail.com> wrote:

> Do you mind providing a bit more information?
>
> release of Spark
>
> code snippet of your app
>
> version of Java
>
> Thanks
>

Re: Spark executor lost because of GC overhead limit exceeded even though using 20 executors using 25GB each

Posted by Ted Yu <yu...@gmail.com>.
Do you mind providing a bit more information?

release of Spark

code snippet of your app

version of Java

Thanks

On Tue, Aug 18, 2015 at 8:57 AM, unk1102 <um...@gmail.com> wrote:

> Hi this GC overhead limit error is making me crazy. I have 20 executors
> using
> 25 GB each I dont understand at all how can it throw GC overhead I also
> dont
> that that big datasets. Once this GC error occurs in executor it will get
> lost and slowly other executors getting lost because of IOException, Rpc
> client disassociated, shuffle not found etc Please help me solve this I am
> getting mad as I am new to Spark. Thanks in advance.