Posted to user@spark.apache.org by Mohit Singh <mo...@gmail.com> on 2014/02/26 23:39:46 UTC

JVM error

Hi,
  I have been experimenting with pyspark lately.
Every now and then I see this error streamed to the pyspark shell. Most of the
time the computation/operation completes anyway, but sometimes it just gets
stuck.
My setup is an 8-node cluster with plenty of RAM (256 GB) and disk space (TBs)
per node.
This environment is shared with general Hadoop workloads; Spark is a recent
addition.

java.lang.OutOfMemoryError: Java heap space
    at
com.ning.compress.BufferRecycler.allocEncodingBuffer(BufferRecycler.java:59)
    at com.ning.compress.lzf.ChunkEncoder.<init>(ChunkEncoder.java:93)
    at
com.ning.compress.lzf.impl.UnsafeChunkEncoder.<init>(UnsafeChunkEncoder.java:40)
    at
com.ning.compress.lzf.impl.UnsafeChunkEncoderLE.<init>(UnsafeChunkEncoderLE.java:13)
    at
com.ning.compress.lzf.impl.UnsafeChunkEncoders.createEncoder(UnsafeChunkEncoders.java:31)
    at
com.ning.compress.lzf.util.ChunkEncoderFactory.optimalInstance(ChunkEncoderFactory.java:44)
    at com.ning.compress.lzf.LZFOutputStream.<init>(LZFOutputStream.java:61)
    at
org.apache.spark.io.LZFCompressionCodec.compressedOutputStream(CompressionCodec.scala:60)
    at
org.apache.spark.storage.BlockManager.wrapForCompression(BlockManager.scala:803)
    at
org.apache.spark.storage.BlockManager$$anonfun$5.apply(BlockManager.scala:471)
    at
org.apache.spark.storage.BlockManager$$anonfun$5.apply(BlockManager.scala:471)
    at
org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:117)
    at
org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:174)
    at
org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:164)
    at
org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:161)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
    at
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at
org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at
org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)



Most of the Spark settings are at their defaults, so I was wondering whether
there is some configuration that needs to happen.
The one config I have added to the spark-env.sh file is:
SPARK_WORKER_MEMORY=20g

Also, I see tons of these log messages as well:
14/02/26 14:33:17 INFO TaskSetManager: Loss was due to
java.lang.OutOfMemoryError: Java heap space [duplicate 1]
14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:278 as TID 1792
on executor 9: node02 (PROCESS_LOCAL)
14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:278 as 4070
bytes in 0 ms
14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1488 (task 996.0:184)
14/02/26 14:33:17 INFO TaskSetManager: Loss was due to
java.lang.OutOfMemoryError: Java heap space [duplicate 2]
14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:247 as TID 1793
on executor 9: node02 (PROCESS_LOCAL)
14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:247 as 4070
bytes in 0 ms
14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1484 (task 996.0:82)
14/02/26 14:33:17 INFO TaskSetManager: Loss was due to
java.lang.OutOfMemoryError: Java heap space [duplicate 3]
14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:116 as TID 1794
on executor 9: node02 (PROCESS_LOCAL)
14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:116 as 4070
bytes in 1 ms
14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1475 (task 996.0:157)
14/02/26 14:33:17 INFO TaskSetManager: Loss was due to
java.lang.OutOfMemoryError: Java heap space [duplicate 4]
14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:98 as TID 1795
on executor 9: node02 (PROCESS_LOCAL)
14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:98 as 4070
bytes in 1 ms
14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1492 (task 996.0:17)


and then...

14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1649 (task 996.0:115)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1666 (task 996.0:32)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1675 (task 996.0:160)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1657 (task 996.0:349)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1660 (task 996.0:141)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1651 (task 996.0:55)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1669 (task 996.0:126)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1678 (task 996.0:173)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1663 (task 996.0:128)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1672 (task 996.0:28)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1654 (task 996.0:96)
14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1699 (task 996.0:294)
14/02/26 14:33:20 INFO DAGScheduler: Executor lost: 12 (epoch 16)
14/02/26 14:33:20 INFO BlockManagerMasterActor: Trying to remove executor
12 from BlockManagerMaster.
14/02/26 14:33:20 INFO BlockManagerMaster: Removed 12 successfully in
removeExecutor
14/02/26 14:33:20 INFO Stage: Stage 996 is now unavailable on executor 12
(0/379, false)


which look like warnings.


The code I tried to run was:
subs_count = complex_key.map(lambda x: (x[0], int(x[1]))).reduceByKey(lambda a, b: a + b)
subs_count.take(20)
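
In case it helps, the intent of those two lines, sketched with made-up toy
data on the shell's sc (the values are hypothetical, just to show the shape
of the computation):

pairs = sc.parallelize([("a", "2"), ("b", "1"), ("a", "3")])
# Parse the count field to an int, then sum the counts per key.
toy_counts = pairs.map(lambda x: (x[0], int(x[1]))).reduceByKey(lambda a, b: a + b)
toy_counts.take(20)   # e.g. [('a', 5), ('b', 1)] (order may vary)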

Thanks

-- 
Mohit

"When you want success as badly as you want the air, then you will get it.
There is no other secret of success."
-Socrates

Re: JVM error

Posted by Bryn Keller <xo...@xoltar.org>.
Hi Mohit,

Yes, in pyspark you only get one chance to initialize a spark context. If
it goes wrong, you have to restart the process.
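
If the same thing comes up in a standalone script rather than the shell, a
minimal sketch of the usual pattern (a hypothetical helper, only to
illustrate creating the context exactly once per process):

import pyspark

_sc = None

def get_context():
    # Create the SparkContext at most once; calling this again reuses the
    # first context instead of tripping "Cannot run multiple SparkContexts
    # at once". Master and app name would be set on the conf as in the
    # SparkConf chain quoted below.
    global _sc
    if _sc is None:
        conf = pyspark.SparkConf().set("spark.executor.memory", "20g")
        _sc = pyspark.SparkContext(conf=conf)
    return _sc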

Thanks,
Bryn


On Fri, Feb 28, 2014 at 4:55 PM, Mohit Singh <mo...@gmail.com> wrote:

> And I tried that but got the error:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/home/hadoop/spark/python/pyspark/context.py", line 83, in __init__
>     SparkContext._ensure_initialized(self)
>   File "/home/hadoop/spark/python/pyspark/context.py", line 165, in
> _ensure_initialized
>     raise ValueError("Cannot run multiple SparkContexts at once")
> ValueError: Cannot run multiple SparkContexts at once
>
>
> On Fri, Feb 28, 2014 at 11:59 AM, Bryn Keller <xo...@xoltar.org> wrote:
>
>> Sorry, typo - that last line should be:
>>
>> sc = pyspark.Spark*Context*(conf = conf)
>>
>>
>> On Fri, Feb 28, 2014 at 9:37 AM, Mohit Singh <mo...@gmail.com> wrote:
>>
>>> Hi Bryn,
>>>   Thanks for the suggestion.
>>> I tried that..
>>> conf = pyspark.SparkConf().set("spark.executor.memory","20G")
>>> But.. got an error here:
>>>
>>> sc = pyspark.SparkConf(conf = conf)
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>> TypeError: __init__() got an unexpected keyword argument 'conf'
>>>
>>> ??
>>> This is in pyspark shell.
>>>
>>>
>>> On Thu, Feb 27, 2014 at 5:00 AM, Evgeniy Shishkin <it...@gmail.com>wrote:
>>>
>>>>
>>>> On 27 Feb 2014, at 07:22, Aaron Davidson <il...@gmail.com> wrote:
>>>>
>>>> > Setting spark.executor.memory is indeed the correct way to do this.
>>>> If you want to configure this in spark-env.sh, you can use
>>>> > export SPARK_JAVA_OPTS=" -Dspark.executor.memory=20g"
>>>> > (make sure to append the variable if you've been using
>>>> SPARK_JAVA_OPTS previously)
>>>> >
>>>> >
>>>> > On Wed, Feb 26, 2014 at 7:50 PM, Bryn Keller <xo...@xoltar.org>
>>>> wrote:
>>>> > Hi Mohit,
>>>> >
>>>> > You can still set SPARK_MEM in spark-env.sh, but that is deprecated.
>>>> This is from SparkContext.scala:
>>>> >
>>>> > if (!conf.contains("spark.executor.memory") &&
>>>> sys.env.contains("SPARK_MEM")) {
>>>> >     logWarning("Using SPARK_MEM to set amount of memory to use per
>>>> executor process is " +
>>>> >       "deprecated, instead use spark.executor.memory")
>>>> >   }
>>>> >
>>>> > Thanks,
>>>> > Bryn
>>>> >
>>>> >
>>>> > On Wed, Feb 26, 2014 at 6:28 PM, Mohit Singh <mo...@gmail.com>
>>>> wrote:
>>>> > Hi Bryn,
>>>> >   Thanks for responding. Is there a way I can permanently configure
>>>> this setting?
>>>> > like SPARK_EXECUTOR_MEMORY or something like that?
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Feb 26, 2014 at 2:56 PM, Bryn Keller <xo...@xoltar.org>
>>>> wrote:
>>>> > Hi Mohit,
>>>> >
>>>> > Try increasing the executor memory instead of the worker memory - the
>>>> most appropriate place to do this is actually when you're creating your
>>>> SparkContext, something like:
>>>> >
>>>> > conf = pyspark.SparkConf()
>>>> >                        .setMaster("spark://master:7077")
>>>> >                        .setAppName("Example")
>>>> >                        .setSparkHome("/your/path/to/spark")
>>>> >                        .set("spark.executor.memory", "20G")
>>>> >                        .set("spark.logConf", "true")
>>>> > sc = pyspark.SparkConf(conf = conf)
>>>> >
>>>> > Hope that helps,
>>>> > Bryn
>>>> >
>>>> >
>>>> >
>>>>
>>>>
>>>
>>>
>>> --
>>> Mohit
>>>
>>> "When you want success as badly as you want the air, then you will get
>>> it. There is no other secret of success."
>>> -Socrates
>>>
>>
>>
>
>
> --
> Mohit
>
> "When you want success as badly as you want the air, then you will get it.
> There is no other secret of success."
> -Socrates
>

Re: JVM error

Posted by Mohit Singh <mo...@gmail.com>.
And I tried that but got the error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hadoop/spark/python/pyspark/context.py", line 83, in __init__
    SparkContext._ensure_initialized(self)
  File "/home/hadoop/spark/python/pyspark/context.py", line 165, in
_ensure_initialized
    raise ValueError("Cannot run multiple SparkContexts at once")
ValueError: Cannot run multiple SparkContexts at once


-- 
Mohit

"When you want success as badly as you want the air, then you will get it.
There is no other secret of success."
-Socrates

Re: JVM error

Posted by Bryn Keller <xo...@xoltar.org>.
Sorry, typo - that last line should be:

sc = pyspark.Spark*Context*(conf = conf)
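
Putting the corrected sequence together (a minimal sketch for a fresh
process; the master URL, app name, and 20G figure are just the values used
earlier in this thread):

import pyspark

conf = (pyspark.SparkConf()
        .setMaster("spark://master:7077")
        .setAppName("Example")
        .set("spark.executor.memory", "20G"))

# SparkContext, not SparkConf -- that was the typo.
sc = pyspark.SparkContext(conf=conf)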



Re: JVM error

Posted by Mohit Singh <mo...@gmail.com>.
Hi Bryn,
  Thanks for the suggestion.
I tried that..
conf = pyspark.SparkConf().set("spark.executor.memory","20G")
But.. got an error here:
sc = pyspark.SparkConf(conf = conf)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() got an unexpected keyword argument 'conf'

??
This is in pyspark shell.



-- 
Mohit

"When you want success as badly as you want the air, then you will get it.
There is no other secret of success."
-Socrates

Re: JVM error

Posted by Evgeniy Shishkin <it...@gmail.com>.
On 27 Feb 2014, at 07:22, Aaron Davidson <il...@gmail.com> wrote:

> Setting spark.executor.memory is indeed the correct way to do this. If you want to configure this in spark-env.sh, you can use
> export SPARK_JAVA_OPTS=" -Dspark.executor.memory=20g"
> (make sure to append the variable if you've been using SPARK_JAVA_OPTS previously)
> 
> 


Re: JVM error

Posted by Aaron Davidson <il...@gmail.com>.
Setting spark.executor.memory is indeed the correct way to do this. If you
want to configure this in spark-env.sh, you can use
export SPARK_JAVA_OPTS=" -Dspark.executor.memory=20g"
(make sure to append the variable if you've been using SPARK_JAVA_OPTS
previously)


On Wed, Feb 26, 2014 at 7:50 PM, Bryn Keller <xo...@xoltar.org> wrote:

> Hi Mohit,
>
> You can still set SPARK_MEM in spark-env.sh, but that is deprecated. This
> is from SparkContext.scala:
>
> if (!conf.contains("spark.executor.memory") &&
> sys.env.contains("SPARK_MEM")) {
>     logWarning("Using SPARK_MEM to set amount of memory to use per
> executor process is " +
>       "deprecated, instead use spark.executor.memory")
>   }
>
> Thanks,
> Bryn
>
>
> On Wed, Feb 26, 2014 at 6:28 PM, Mohit Singh <mo...@gmail.com> wrote:
>
>> Hi Bryn,
>>   Thanks for responding. Is there a way I can permanently configure this
>> setting?
>> like SPARK_EXECUTOR_MEMORY or somethign like that?
>>
>>
>>
>> On Wed, Feb 26, 2014 at 2:56 PM, Bryn Keller <xo...@xoltar.org> wrote:
>>
>>> Hi Mohit,
>>>
>>> Try increasing the *executor* memory instead of the worker memory - the
>>> most appropriate place to do this is actually when you're creating your
>>> SparkContext, something like:
>>>
>>> conf = pyspark.SparkConf()
>>>                        .setMaster("spark://master:7077")
>>>                        .setAppName("Example")
>>>                        .setSparkHome("/your/path/to/spark")
>>>                        .set("spark.executor.memory", "20G")
>>>                        .set("spark.logConf", "true")
>>> sc = pyspark.SparkConf(conf = conf)
>>>
>>> Hope that helps,
>>> Bryn
>>>
>>>
>>>
>>> On Wed, Feb 26, 2014 at 2:39 PM, Mohit Singh <mo...@gmail.com> wrote:
>>>
>>>> [original message with the OutOfMemoryError stack trace and task-loss log trimmed]
>>>
>>
>>
>> --
>> Mohit
>>
>> "When you want success as badly as you want the air, then you will get
>> it. There is no other secret of success."
>> -Socrates
>>
>
>

Re: JVM error

Posted by Bryn Keller <xo...@xoltar.org>.
Hi Mohit,

You can still set SPARK_MEM in spark-env.sh, but that is deprecated. This
is from SparkContext.scala:

if (!conf.contains("spark.executor.memory") && sys.env.contains("SPARK_MEM")) {
  logWarning("Using SPARK_MEM to set amount of memory to use per executor process is " +
    "deprecated, instead use spark.executor.memory")
}
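
If you want this set persistently rather than on each SparkContext, one option
(a sketch, assuming a standalone cluster configured via conf/spark-env.sh) is to
pass spark.executor.memory as a system property in place of the deprecated SPARK_MEM:

# conf/spark-env.sh -- replaces the deprecated SPARK_MEM=20g
export SPARK_JAVA_OPTS="$SPARK_JAVA_OPTS -Dspark.executor.memory=20g"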

Thanks,
Bryn


On Wed, Feb 26, 2014 at 6:28 PM, Mohit Singh <mo...@gmail.com> wrote:

> Hi Bryn,
>   Thanks for responding. Is there a way I can permanently configure this
> setting?
> like SPARK_EXECUTOR_MEMORY or something like that?
>
>
>
> On Wed, Feb 26, 2014 at 2:56 PM, Bryn Keller <xo...@xoltar.org> wrote:
>
>> Hi Mohit,
>>
>> Try increasing the *executor* memory instead of the worker memory - the
>> most appropriate place to do this is actually when you're creating your
>> SparkContext, something like:
>>
>> import pyspark
>>
>> conf = (pyspark.SparkConf()
>>         .setMaster("spark://master:7077")
>>         .setAppName("Example")
>>         .setSparkHome("/your/path/to/spark")
>>         .set("spark.executor.memory", "20G")
>>         .set("spark.logConf", "true"))
>> sc = pyspark.SparkContext(conf=conf)
>>
>> Hope that helps,
>> Bryn
>>
>>
>>
>> On Wed, Feb 26, 2014 at 2:39 PM, Mohit Singh <mo...@gmail.com> wrote:
>>
>>> [original message with the OutOfMemoryError stack trace and task-loss log trimmed]
>>
>>
>
>
> --
> Mohit
>
> "When you want success as badly as you want the air, then you will get it.
> There is no other secret of success."
> -Socrates
>

Re: JVM error

Posted by Mohit Singh <mo...@gmail.com>.
Hi Bryn,
  Thanks for responding. Is there a way I can permanently configure this
setting?
like SPARK_EXECUTOR_MEMORY or something like that?



On Wed, Feb 26, 2014 at 2:56 PM, Bryn Keller <xo...@xoltar.org> wrote:

> Hi Mohit,
>
> Try increasing the *executor* memory instead of the worker memory - the
> most appropriate place to do this is actually when you're creating your
> SparkContext, something like:
>
> import pyspark
>
> conf = (pyspark.SparkConf()
>         .setMaster("spark://master:7077")
>         .setAppName("Example")
>         .setSparkHome("/your/path/to/spark")
>         .set("spark.executor.memory", "20G")
>         .set("spark.logConf", "true"))
> sc = pyspark.SparkContext(conf=conf)
>
> Hope that helps,
> Bryn
>
>
>
> On Wed, Feb 26, 2014 at 2:39 PM, Mohit Singh <mo...@gmail.com> wrote:
>
>> [original message with the OutOfMemoryError stack trace and task-loss log trimmed]
>
>


-- 
Mohit

"When you want success as badly as you want the air, then you will get it.
There is no other secret of success."
-Socrates

Re: JVM error

Posted by Bryn Keller <xo...@xoltar.org>.
Hi Mohit,

Try increasing the *executor* memory instead of the worker memory - the
most appropriate place to do this is actually when you're creating your
SparkContext, something like:

import pyspark

conf = (pyspark.SparkConf()
        .setMaster("spark://master:7077")
        .setAppName("Example")
        .setSparkHome("/your/path/to/spark")
        .set("spark.executor.memory", "20G")
        .set("spark.logConf", "true"))
sc = pyspark.SparkContext(conf=conf)

Hope that helps,
Bryn



On Wed, Feb 26, 2014 at 2:39 PM, Mohit Singh <mo...@gmail.com> wrote:

> Hi,
>   I am experimenting with pyspark lately...
> Every now and then, I see this error being streamed to pyspark shell ..
> and most of the times.. the computation/operation completes.. and
> sometimes, it just gets stuck...
> My setup is 8 node cluster.. with loads of ram(256GB's) and space( TB's)
> per node.
> This environment is shared by general hadoop and hadoopy stuff..with
> recent spark addition...
>
> java.lang.OutOfMemoryError: Java heap space
>     at
> com.ning.compress.BufferRecycler.allocEncodingBuffer(BufferRecycler.java:59)
>     at com.ning.compress.lzf.ChunkEncoder.<init>(ChunkEncoder.java:93)
>     at
> com.ning.compress.lzf.impl.UnsafeChunkEncoder.<init>(UnsafeChunkEncoder.java:40)
>     at
> com.ning.compress.lzf.impl.UnsafeChunkEncoderLE.<init>(UnsafeChunkEncoderLE.java:13)
>     at
> com.ning.compress.lzf.impl.UnsafeChunkEncoders.createEncoder(UnsafeChunkEncoders.java:31)
>     at
> com.ning.compress.lzf.util.ChunkEncoderFactory.optimalInstance(ChunkEncoderFactory.java:44)
>     at
> com.ning.compress.lzf.LZFOutputStream.<init>(LZFOutputStream.java:61)
>     at
> org.apache.spark.io.LZFCompressionCodec.compressedOutputStream(CompressionCodec.scala:60)
>     at
> org.apache.spark.storage.BlockManager.wrapForCompression(BlockManager.scala:803)
>     at
> org.apache.spark.storage.BlockManager$$anonfun$5.apply(BlockManager.scala:471)
>     at
> org.apache.spark.storage.BlockManager$$anonfun$5.apply(BlockManager.scala:471)
>     at
> org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:117)
>     at
> org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:174)
>     at
> org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:164)
>     at
> org.apache.spark.scheduler.ShuffleMapTask$$anonfun$runTask$1.apply(ShuffleMapTask.scala:161)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
>     at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
>     at org.apache.spark.scheduler.Task.run(Task.scala:53)
>     at
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
>     at
> org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
>     at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
>     at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
>
>
> Most of the settings in spark are default.. So I was wondering if maybe,
> there is some configuration that needs to happen?
> There is this one config I have added to the spark-env file
> SPARK_WORKER_MEMORY=20g
>
> Also, I see tons of these errors as well..
> 14/02/26 14:33:17 INFO TaskSetManager: Loss was due to
> java.lang.OutOfMemoryError: Java heap space [duplicate 1]
> 14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:278 as TID 1792
> on executor 9: node02 (PROCESS_LOCAL)
> 14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:278 as 4070
> bytes in 0 ms
> 14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1488 (task 996.0:184)
> 14/02/26 14:33:17 INFO TaskSetManager: Loss was due to
> java.lang.OutOfMemoryError: Java heap space [duplicate 2]
> 14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:247 as TID 1793
> on executor 9: node02 (PROCESS_LOCAL)
> 14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:247 as 4070
> bytes in 0 ms
> 14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1484 (task 996.0:82)
> 14/02/26 14:33:17 INFO TaskSetManager: Loss was due to
> java.lang.OutOfMemoryError: Java heap space [duplicate 3]
> 14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:116 as TID 1794
> on executor 9: node02 (PROCESS_LOCAL)
> 14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:116 as 4070
> bytes in 1 ms
> 14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1475 (task 996.0:157)
> 14/02/26 14:33:17 INFO TaskSetManager: Loss was due to
> java.lang.OutOfMemoryError: Java heap space [duplicate 4]
> 14/02/26 14:33:17 INFO TaskSetManager: Starting task 996.0:98 as TID 1795
> on executor 9: node02 (PROCESS_LOCAL)
> 14/02/26 14:33:17 INFO TaskSetManager: Serialized task 996.0:98 as 4070
> bytes in 1 ms
> 14/02/26 14:33:17 WARN TaskSetManager: Lost TID 1492 (task 996.0:17)
>
>
> and then...
>
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1649 (task 996.0:115)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1666 (task 996.0:32)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1675 (task 996.0:160)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1657 (task 996.0:349)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1660 (task 996.0:141)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1651 (task 996.0:55)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1669 (task 996.0:126)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1678 (task 996.0:173)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1663 (task 996.0:128)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1672 (task 996.0:28)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1654 (task 996.0:96)
> 14/02/26 14:33:20 WARN TaskSetManager: Lost TID 1699 (task 996.0:294)
> 14/02/26 14:33:20 INFO DAGScheduler: Executor lost: 12 (epoch 16)
> 14/02/26 14:33:20 INFO BlockManagerMasterActor: Trying to remove executor
> 12 from BlockManagerMaster.
> 14/02/26 14:33:20 INFO BlockManagerMaster: Removed 12 successfully in
> removeExecutor
> 14/02/26 14:33:20 INFO Stage: Stage 996 is now unavailable on executor 12
> (0/379, false)
>
>
> which looks like warnings..
>
>
> The code I tried to run was:
> subs_count = complex_key.map(lambda x: (x[0], int(x[1]))).reduceByKey(lambda a, b: a + b)
> subs_count.take(20)
>
> Thanks
>
>  --
> Mohit
>
> "When you want success as badly as you want the air, then you will get it.
> There is no other secret of success."
> -Socrates
>