Posted to user@spark.apache.org by Sai Prasanna <an...@gmail.com> on 2014/03/24 08:24:45 UTC

GC overhead limit exceeded in Spark-interactive shell

Hi All !! I am getting the following error in the interactive spark-shell
[0.8.1]


*org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed more than
0 times; aborting job java.lang.OutOfMemoryError: GC overhead limit
exceeded*


But I had set the following in spark-env.sh and hadoop-env.sh:

export SPARK_DEAMON_MEMORY=8g
export SPARK_WORKER_MEMORY=8g
export SPARK_DEAMON_JAVA_OPTS="-Xms8g -Xmx8g"
export SPARK_JAVA_OPTS="-Xms8g -Xmx8g"


export HADOOP_HEAPSIZE=4000

Any suggestions ??

-- 
*Sai Prasanna. AN*
*II M.Tech (CS), SSSIHL*

Re: GC overhead limit exceeded in Spark-interactive shell

Posted by Sai Prasanna <an...@gmail.com>.
Thanks Aaron !!


On Mon, Mar 24, 2014 at 10:58 PM, Aaron Davidson <il...@gmail.com> wrote:

> 1. Note sure on this, I don't believe we change the defaults from Java.
>
> 2. SPARK_JAVA_OPTS can be used to set the various Java properties (other
> than memory heap size itself)
>
> 3. If you want to have 8 GB executors then, yes, only two can run on each
> 16 GB node. (In fact, you should also keep a significant amount of memory
> free for the OS to use for buffer caching and such.)
> An executor may use many cores, though, so this shouldn't be an issue.
>
>
> On Mon, Mar 24, 2014 at 2:44 AM, Sai Prasanna <an...@gmail.com>wrote:
>
>> Thanks Aaron and Sean...
>>
>> Setting SPARK_MEM finally worked. But i have a small doubt.
>> 1)What is the default value that is allocated for JVM and for HEAP_SPACE
>> for Garbage collector.
>>
>> 2)Usually we set 1/3 of total memory for heap. So what should be the
>> practice for Spark processes. Where & how should we set them.
>> And what is the default value does it assume?
>>
>> 3) Moreover, if we set SPARK_MEM to say 8g and i have a 16g RAM, can only
>> two executors run max on a node of a cluster ??
>>
>>
>> Thanks Again !!
>>
>>
>>
>>
>> On Mon, Mar 24, 2014 at 2:13 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> PS you have a typo in "DEAMON" - its DAEMON. Thanks Latin.
>>> On Mar 24, 2014 7:25 AM, "Sai Prasanna" <an...@gmail.com> wrote:
>>>
>>>> Hi All !! I am getting the following error in interactive spark-shell
>>>> [0.8.1]
>>>>
>>>>
>>>>  *org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed more
>>>> than 0 times; aborting job java.lang.OutOfMemoryError: GC overhead limit
>>>> exceeded*
>>>>
>>>>
>>>> But i had set the following in the spark.env.sh and hadoop-env.sh
>>>>
>>>> export SPARK_DEAMON_MEMORY=8g
>>>> export SPARK_WORKER_MEMORY=8g
>>>> export SPARK_DEAMON_JAVA_OPTS="-Xms8g -Xmx8g"
>>>> export SPARK_JAVA_OPTS="-Xms8g -Xmx8g"
>>>>
>>>>
>>>> export HADOOP_HEAPSIZE=4000
>>>>
>>>> Any suggestions ??
>>>>
>>>> --
>>>> *Sai Prasanna. AN*
>>>> *II M.Tech (CS), SSSIHL*
>>>>
>>>>
>>>>
>>
>>
>> --
>> *Sai Prasanna. AN*
>> *II M.Tech (CS), SSSIHL*
>>
>>
>> *Entire water in the ocean can never sink a ship, Unless it gets inside.
>> All the pressures of life can never hurt you, Unless you let them in.*
>>
>
>


-- 
*Sai Prasanna. AN*
*II M.Tech (CS), SSSIHL*


*Entire water in the ocean can never sink a ship, Unless it gets inside.
All the pressures of life can never hurt you, Unless you let them in.*

Re: GC overhead limit exceeded in Spark-interactive shell

Posted by Aaron Davidson <il...@gmail.com>.
1. Not sure on this; I don't believe we change the defaults from Java.

2. SPARK_JAVA_OPTS can be used to set the various Java properties (other
than the memory heap size itself).

3. If you want to have 8 GB executors then, yes, only two can run on each
16 GB node. (In fact, you should also keep a significant amount of memory
free for the OS to use for buffer caching and such.)
An executor may use many cores, though, so this shouldn't be an issue.
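
For instance (the flags below are just an illustration, not taken from this
thread), SPARK_JAVA_OPTS is the natural place for GC diagnostics or other
JVM flags rather than -Xms/-Xmx:

# sketch only: JVM flags other than heap size
export SPARK_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails"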


On Mon, Mar 24, 2014 at 2:44 AM, Sai Prasanna <an...@gmail.com> wrote:

> Thanks Aaron and Sean...
>
> Setting SPARK_MEM finally worked. But i have a small doubt.
> 1)What is the default value that is allocated for JVM and for HEAP_SPACE
> for Garbage collector.
>
> 2)Usually we set 1/3 of total memory for heap. So what should be the
> practice for Spark processes. Where & how should we set them.
> And what is the default value does it assume?
>
> 3) Moreover, if we set SPARK_MEM to say 8g and i have a 16g RAM, can only
> two executors run max on a node of a cluster ??
>
>
> Thanks Again !!
>
>
>
>
> On Mon, Mar 24, 2014 at 2:13 PM, Sean Owen <so...@cloudera.com> wrote:
>
>> PS you have a typo in "DEAMON" - its DAEMON. Thanks Latin.
>> On Mar 24, 2014 7:25 AM, "Sai Prasanna" <an...@gmail.com> wrote:
>>
>>> Hi All !! I am getting the following error in interactive spark-shell
>>> [0.8.1]
>>>
>>>
>>>  *org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed more
>>> than 0 times; aborting job java.lang.OutOfMemoryError: GC overhead limit
>>> exceeded*
>>>
>>>
>>> But i had set the following in the spark.env.sh and hadoop-env.sh
>>>
>>> export SPARK_DEAMON_MEMORY=8g
>>> export SPARK_WORKER_MEMORY=8g
>>> export SPARK_DEAMON_JAVA_OPTS="-Xms8g -Xmx8g"
>>> export SPARK_JAVA_OPTS="-Xms8g -Xmx8g"
>>>
>>>
>>> export HADOOP_HEAPSIZE=4000
>>>
>>> Any suggestions ??
>>>
>>> --
>>> *Sai Prasanna. AN*
>>> *II M.Tech (CS), SSSIHL*
>>>
>>>
>>>
>
>
> --
> *Sai Prasanna. AN*
> *II M.Tech (CS), SSSIHL*
>
>
> *Entire water in the ocean can never sink a ship, Unless it gets inside.
> All the pressures of life can never hurt you, Unless you let them in.*
>

Re: GC overhead limit exceeded in Spark-interactive shell

Posted by Sai Prasanna <an...@gmail.com>.
Thanks Aaron and Sean...

Setting SPARK_MEM finally worked (a rough sketch of what I set is below).
But I have a small doubt.
1) What is the default value allocated for the JVM and for the heap space
used by the garbage collector?

2) Usually we set 1/3 of total memory for the heap. So what should the
practice be for Spark processes? Where and how should we set them, and what
default value does Spark assume?

3) Moreover, if we set SPARK_MEM to, say, 8g and I have 16g of RAM, can at
most two executors run on a node of the cluster?
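
(For reference, this is roughly what worked for me - the 8g value and the
script location are just from my setup, not a recommendation:)

export SPARK_MEM=8g   # driver heap for the 0.8.1 spark-shell
./spark-shell         # launched from the Spark installation directory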


Thanks Again !!




On Mon, Mar 24, 2014 at 2:13 PM, Sean Owen <so...@cloudera.com> wrote:

> PS you have a typo in "DEAMON" - its DAEMON. Thanks Latin.
> On Mar 24, 2014 7:25 AM, "Sai Prasanna" <an...@gmail.com> wrote:
>
>> Hi All !! I am getting the following error in interactive spark-shell
>> [0.8.1]
>>
>>
>>  *org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed more
>> than 0 times; aborting job java.lang.OutOfMemoryError: GC overhead limit
>> exceeded*
>>
>>
>> But i had set the following in the spark.env.sh and hadoop-env.sh
>>
>> export SPARK_DEAMON_MEMORY=8g
>> export SPARK_WORKER_MEMORY=8g
>> export SPARK_DEAMON_JAVA_OPTS="-Xms8g -Xmx8g"
>> export SPARK_JAVA_OPTS="-Xms8g -Xmx8g"
>>
>>
>> export HADOOP_HEAPSIZE=4000
>>
>> Any suggestions ??
>>
>> --
>> *Sai Prasanna. AN*
>> *II M.Tech (CS), SSSIHL*
>>
>>
>>


-- 
*Sai Prasanna. AN*
*II M.Tech (CS), SSSIHL*


*Entire water in the ocean can never sink a ship, Unless it gets inside.
All the pressures of life can never hurt you, Unless you let them in.*

Re: GC overhead limit exceeded in Spark-interactive shell

Posted by Sean Owen <so...@cloudera.com>.
PS: you have a typo in "DEAMON" - it's DAEMON. Thanks, Latin.
On Mar 24, 2014 7:25 AM, "Sai Prasanna" <an...@gmail.com> wrote:

> Hi All !! I am getting the following error in interactive spark-shell
> [0.8.1]
>
>
>  *org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed more
> than 0 times; aborting job java.lang.OutOfMemoryError: GC overhead limit
> exceeded*
>
>
> But i had set the following in the spark.env.sh and hadoop-env.sh
>
> export SPARK_DEAMON_MEMORY=8g
> export SPARK_WORKER_MEMORY=8g
> export SPARK_DEAMON_JAVA_OPTS="-Xms8g -Xmx8g"
> export SPARK_JAVA_OPTS="-Xms8g -Xmx8g"
>
>
> export HADOOP_HEAPSIZE=4000
>
> Any suggestions ??
>
> --
> *Sai Prasanna. AN*
> *II M.Tech (CS), SSSIHL*
>
>
>

Re: GC overhead limit exceeded in Spark-interactive shell

Posted by Aaron Davidson <il...@gmail.com>.
To be clear on what your configuration will do:

- SPARK_DAEMON_MEMORY=8g will make your standalone master and worker
schedulers have a lot of memory. These do not impact the actual amount of
useful memory given to executors or your driver, however, so you probably
don't need to set this.
- SPARK_WORKER_MEMORY=8g allows each worker to provide up to 8g worth of
executors. In itself, this does not actually give executors more memory,
just allows them to get more. This is a necessary setting.

- *_JAVA_OPTS should not be used to set memory parameters, as they may or
may not override their *_MEMORY counterparts.

The two things you are not configuring are the amount of memory for your
driver (for a 0.8.1 spark-shell, you must use SPARK_MEM) and the amount of
memory given to each executor (spark.executor.memory). By default, Spark
executors are only 512MB in size, so you will probably want to increase
this up to the value of SPARK_WORKER_MEMORY. This will provide you with 1
executor per worker that uses all available memory, which is probably what
you want for testing purposes (it is less ideal for sharing a cluster).

In case the distinction between workers/masters (collectively "daemons"),
executors, and drivers is not clear to you, please check out the
corresponding documentation on Spark clusters:
https://spark.incubator.apache.org/docs/0.8.1/cluster-overview.html
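
For concreteness, a minimal sketch of how those pieces might be set on 0.8.1
(the 8g values are only carried over from your mail as examples, and the
exact file locations depend on your install):

# conf/spark-env.sh on each worker: memory the worker may offer to executors
export SPARK_WORKER_MEMORY=8g

# in the shell that launches spark-shell: driver heap (0.8.x reads SPARK_MEM)
export SPARK_MEM=8g

# also in that shell: per-executor heap, requested as a Spark system
# property rather than via -Xmx
export SPARK_JAVA_OPTS="-Dspark.executor.memory=8g"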


On Mon, Mar 24, 2014 at 12:24 AM, Sai Prasanna <an...@gmail.com> wrote:

> Hi All !! I am getting the following error in interactive spark-shell
> [0.8.1]
>
>
>  *org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed more
> than 0 times; aborting job java.lang.OutOfMemoryError: GC overhead limit
> exceeded*
>
>
> But i had set the following in the spark.env.sh and hadoop-env.sh
>
> export SPARK_DEAMON_MEMORY=8g
> export SPARK_WORKER_MEMORY=8g
> export SPARK_DEAMON_JAVA_OPTS="-Xms8g -Xmx8g"
> export SPARK_JAVA_OPTS="-Xms8g -Xmx8g"
>
>
> export HADOOP_HEAPSIZE=4000
>
> Any suggestions ??
>
> --
> *Sai Prasanna. AN*
> *II M.Tech (CS), SSSIHL*
>
>
>