Posted to user@spark.apache.org by Sameer Tilak <ss...@live.com> on 2014/07/31 18:58:13 UTC

java.lang.OutOfMemoryError: Java heap space

Hi everyone,
I have the following configuration. I am currently running my app in local mode.
  val conf = new SparkConf().setMaster("local[2]").setAppName("ApproxStrMatch").set("spark.executor.memory", "3g").set("spark.storage.memoryFraction", "0.1")
I am getting the following error. I tried setting spark.executor.memory and the storage memory fraction, but the UI does not show the increase and I still get these errors. I am loading a TSV file from HDFS (around 5 GB). Does this mean I should update these settings and add more memory, or is it something else? The Spark master has 24 GB of physical memory and the workers have 16 GB, but we are running other services (CDH 5.1) on these nodes as well.
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 6 ms
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 1 ms
14/07/31 09:48:17 ERROR Executor: Exception in task ID 5
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2271)
        at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
        at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
14/07/31 09:48:17 ERROR ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-3,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2271)
        at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
        at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
14/07/31 09:48:17 WARN TaskSetManager: Lost TID 5 (task 1.0:0)
14/07/31 09:48:17 WARN TaskSetManager: Loss was due to java.lang.OutOfMemoryError
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2271)
        at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
        at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
14/07/31 09:48:17 ERROR TaskSetManager: Task 1.0:0 failed 1 times; aborting job
14/07/31 09:48:17 INFO TaskSchedulerImpl: Cancelling stage 1
14/07/31 09:48:17 INFO DAGScheduler: Failed to run collect at ComputeScores.scala:76
14/07/31 09:48:17 INFO Executor: Executor is trying to kill task 6
14/07/31 09:48:17 INFO TaskSchedulerImpl: Stage 1 was cancelled
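
Since the UI does not show the increase, one way to check whether those settings were actually picked up is to dump the effective configuration from the running context. This is only a minimal sketch, assuming Spark 1.x and a SparkContext built from the conf above; the printed properties should match what the Environment tab of the UI reports.

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setMaster("local[2]")
    .setAppName("ApproxStrMatch")
    .set("spark.executor.memory", "3g")
    .set("spark.storage.memoryFraction", "0.1")
  val sc = new SparkContext(conf)

  // Print every property the running context actually sees; a setting that
  // does not appear here (or in the UI's Environment tab) was not applied.
  println(sc.getConf.toDebugString)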

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Haiyang Fu <ha...@gmail.com>.
http://spark.apache.org/docs/latest/tuning.html#level-of-parallelism
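
For this job, raising the level of parallelism as that section suggests might look roughly like the snippet below. It is only a sketch: the HDFS path, the partition count of 64, and the tab-splitting step are illustrative assumptions, and sc is the SparkContext built from the conf shown in the quoted post.

  // Request more input partitions so each task handles a smaller slice of
  // the ~5 GB TSV file instead of a handful of very large partitions.
  val lines = sc.textFile("hdfs:///path/to/input.tsv", 64)
  val rows  = lines.map(_.split("\t"))

  // An existing RDD can also be spread over more partitions before an
  // expensive or shuffle-heavy stage:
  val widened = rows.repartition(64)

  // The default partition count used when none is given explicitly can be
  // raised as well, e.g. conf.set("spark.default.parallelism", "16").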



On Fri, Aug 1, 2014 at 1:29 PM, Haiyang Fu <ha...@gmail.com> wrote:

> Hi,
> here are two tips for you,
> 1. increase the parallelism level
> 2. increase the driver memory
>
>
> On Fri, Aug 1, 2014 at 12:58 AM, Sameer Tilak <ss...@live.com> wrote:
>
>> Hi everyone,
>> I have the following configuration. I am currently running my app in
>> local mode.
>>
>>   val conf = new
>> SparkConf().setMaster("local[2]").setAppName("ApproxStrMatch").set("spark.executor.memory",
>> "3g").set("spark.storage.memoryFraction", "0.1")
>>
>> I am getting the following error. I tried setting up spark.executor.memory
>> and memory fraction setting, however my UI does not show the increase and I
>> still get these errors. I am loading a TSV file from HDFS (around 5 GB).
>> Does this mean, I should update these settings and add more memory or is it
>> somethign else? Spark master has 24 GB physical memory and workers have 16
>> GB, but we are running other services (CDH 5.1) on these nodes as well.
>>
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> Started 0 remote fetches in 6 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> Started 0 remote fetches in 6 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> maxBytesInFlight: 50331648, targetRequestSize: 10066329
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> maxBytesInFlight: 50331648, targetRequestSize: 10066329
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> Getting 2 non-empty blocks out of 2 blocks
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> Started 0 remote fetches in 1 ms
>> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>> Started 0 remote fetches in 1 ms
>> 14/07/31 09:48:17 ERROR Executor: Exception in task ID 5
>> java.lang.OutOfMemoryError: Java heap space
>> at java.util.Arrays.copyOf(Arrays.java:2271)
>>  at
>> java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>> at
>> org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>  at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 ERROR ExecutorUncaughtExceptionHandler: Uncaught
>> exception in thread Thread[Executor task launch worker-3,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>> at java.util.Arrays.copyOf(Arrays.java:2271)
>>  at
>> java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>> at
>> org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>  at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 WARN TaskSetManager: Lost TID 5 (task 1.0:0)
>> 14/07/31 09:48:17 WARN TaskSetManager: Loss was due to
>> java.lang.OutOfMemoryError
>> java.lang.OutOfMemoryError: Java heap space
>>  at java.util.Arrays.copyOf(Arrays.java:2271)
>> at
>> java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
>>  at
>> org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
>>  at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>  at java.lang.Thread.run(Thread.java:744)
>> 14/07/31 09:48:17 ERROR TaskSetManager: Task 1.0:0 failed 1 times;
>> aborting job
>> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Cancelling stage 1
>> 14/07/31 09:48:17 INFO DAGScheduler: Failed to run collect at
>> ComputeScores.scala:76
>> 14/07/31 09:48:17 INFO Executor: Executor is trying to kill task 6
>> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Stage 1 was cancelled
>>
>
>

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Haiyang Fu <ha...@gmail.com>.
Hi,
here are two tips for you,
1. increase the parallelism level
2. increase the driver memory
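
Concretely, the two tips might be applied as in the sketch below, assuming Spark 1.x and a spark-submit launch; the partition count, heap size, class name, and jar name are placeholders, not values from the original post. Note that in local mode the tasks run inside the driver JVM, so it is the driver heap that needs to grow, and a JVM's heap size has to be set at launch time rather than from inside the already-running program.

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setMaster("local[2]")
    .setAppName("ApproxStrMatch")
    // Tip 1: more partitions per stage so individual tasks stay small.
    .set("spark.default.parallelism", "16")
    .set("spark.storage.memoryFraction", "0.1")
  val sc = new SparkContext(conf)

  // Tip 2: pass the larger driver heap when launching, for example
  //   spark-submit --driver-memory 6g --class ApproxStrMatch approxstrmatch.jar
  // (the class and jar names above are placeholders).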


On Fri, Aug 1, 2014 at 12:58 AM, Sameer Tilak <ss...@live.com> wrote:

> Hi everyone,
> I have the following configuration. I am currently running my app in local
> mode.
>
>   val conf = new
> SparkConf().setMaster("local[2]").setAppName("ApproxStrMatch").set("spark.executor.memory",
> "3g").set("spark.storage.memoryFraction", "0.1")
>
> I am getting the following error. I tried setting up spark.executor.memory
> and memory fraction setting, however my UI does not show the increase and I
> still get these errors. I am loading a TSV file from HDFS (around 5 GB).
> Does this mean, I should update these settings and add more memory or is it
> something else? Spark master has 24 GB physical memory and workers have 16
> GB, but we are running other services (CDH 5.1) on these nodes as well.
>
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> Getting 2 non-empty blocks out of 2 blocks
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> Getting 2 non-empty blocks out of 2 blocks
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> Started 0 remote fetches in 6 ms
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> Started 0 remote fetches in 6 ms
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> maxBytesInFlight: 50331648, targetRequestSize: 10066329
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> maxBytesInFlight: 50331648, targetRequestSize: 10066329
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> Getting 2 non-empty blocks out of 2 blocks
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> Getting 2 non-empty blocks out of 2 blocks
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> Started 0 remote fetches in 1 ms
> 14/07/31 09:48:09 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
> Started 0 remote fetches in 1 ms
> 14/07/31 09:48:17 ERROR Executor: Exception in task ID 5
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2271)
> at
> java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
> at
> org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 14/07/31 09:48:17 ERROR ExecutorUncaughtExceptionHandler: Uncaught
> exception in thread Thread[Executor task launch worker-3,5,main]
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2271)
> at
> java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
> at
> org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 14/07/31 09:48:17 WARN TaskSetManager: Lost TID 5 (task 1.0:0)
> 14/07/31 09:48:17 WARN TaskSetManager: Loss was due to
> java.lang.OutOfMemoryError
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2271)
> at
> java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:178)
> at
> org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 14/07/31 09:48:17 ERROR TaskSetManager: Task 1.0:0 failed 1 times;
> aborting job
> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Cancelling stage 1
> 14/07/31 09:48:17 INFO DAGScheduler: Failed to run collect at
> ComputeScores.scala:76
> 14/07/31 09:48:17 INFO Executor: Executor is trying to kill task 6
> 14/07/31 09:48:17 INFO TaskSchedulerImpl: Stage 1 was cancelled
>