Posted to dev@spark.apache.org by Alessandro Baretta <al...@gmail.com> on 2015/01/19 07:06:20 UTC
Memory config issues
All,
I'm getting out-of-memory exceptions in SparkSQL GROUP BY queries. I have
plenty of RAM, so I should be able to brute-force my way through, but I
can't quite figure out which memory option affects which process.
My current memory configuration is the following:
export SPARK_WORKER_MEMORY=83971m
export SPARK_DAEMON_MEMORY=15744m
What does each of these config options do exactly?
Also, how come the executors page of the web UI shows no memory usage:
0.0 B / 42.4 GB
And where does 42.4 GB come from?
Alex
Re: Memory config issues
Posted by Sean Owen <so...@cloudera.com>.
On Mon, Jan 19, 2015 at 6:29 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
> It's the executor memory (spark.executor.memory) which you can set while
> creating the Spark context. By default it uses 0.6% of the executor memory
(It uses a fraction of 0.6, i.e. 60%, not 0.6%.)
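To make the correction concrete, here is a rough sketch of how the storage figure on the executors page relates to the executor heap. It assumes the Spark 1.x defaults of spark.storage.memoryFraction = 0.6 and an additional spark.storage.safetyFraction = 0.9; the function name and the 64 GiB heap are made up for illustration, and the JVM usually reports a usable heap somewhat below the configured maximum, so real numbers will differ.

```python
GIB = 1024 ** 3

def storage_capacity_bytes(max_heap_bytes, memory_fraction=0.6, safety_fraction=0.9):
    """Approximate memory available for cached blocks in one executor.

    Both default fractions are assumed Spark 1.x defaults, not values
    taken from this thread.
    """
    return int(max_heap_bytes * memory_fraction * safety_fraction)

heap = 64 * GIB  # hypothetical executor heap, for illustration only
capacity = storage_capacity_bytes(heap)
print(f"{capacity / GIB:.2f} GiB of storage out of a {heap / GIB:.0f} GiB heap")
```

Under the same assumptions, working backwards from the 42.4 GB shown in the UI gives 42.4 / 0.54, or roughly 78.5 GB of usable heap. The thread does not state the actual executor memory, so treat that as an estimate only.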
Re: Memory config issues
Posted by Alessandro Baretta <al...@gmail.com>.
Akhil,
Ah, very good point. I guess "SET spark.sql.shuffle.partitions=1024" should
do it.
Alex
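As a back-of-the-envelope illustration of why more shuffle partitions can relieve memory pressure in a GROUP BY: each task handles roughly one partition, so the data per task shrinks as the partition count grows. The sketch below assumes the keys are spread evenly, which real data often is not, and the 500 GiB shuffle size and the function name are hypothetical. The value 200 is the default for spark.sql.shuffle.partitions.

```python
import math

GIB = 1024 ** 3

def approx_bytes_per_partition(total_shuffle_bytes, num_partitions):
    """Average shuffle data per partition, assuming an even key distribution."""
    return math.ceil(total_shuffle_bytes / num_partitions)

total = 500 * GIB  # hypothetical shuffle size, for illustration only
for partitions in (200, 1024):  # default vs. the value proposed above
    per_task = approx_bytes_per_partition(total, partitions)
    print(f"{partitions} partitions: about {per_task / GIB:.2f} GiB each")
```

With skewed keys a few partitions can still be much larger than the average, so this is a lower bound on the worst case rather than a guarantee.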
Re: Memory config issues
Posted by Akhil Das <ak...@sigmoidanalytics.com>.
It's the executor memory (spark.executor.memory) which you can set while
creating the Spark context. By default it uses 0.6% of the executor memory
for Storage. Now, to show some memory usage, you need to cache (persist)
the RDD. Regarding the OOM Exception, you can increase the level of
parallelism (also you can increase the number of partitions depending on
your data size) and it should be fine.
Thanks
Best Regards
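Besides setting the property in code when the context is created, executor memory can also be supplied at launch time. The sketch below only assembles a spark-submit command line; the helper name, the application file my_job.py, and the 60g value are placeholders rather than recommendations from this thread.

```python
def spark_submit_args(app, executor_memory, shuffle_partitions):
    """Build a spark-submit argument list (hypothetical helper)."""
    return [
        "spark-submit",
        "--executor-memory", executor_memory,  # sets spark.executor.memory
        "--conf", f"spark.sql.shuffle.partitions={shuffle_partitions}",
        app,
    ]

# Placeholder values for illustration only.
args = spark_submit_args("my_job.py", "60g", 1024)
print(" ".join(args))
```

On a standalone cluster, the requested executor memory presumably has to fit within what the worker is allowed to hand out (SPARK_WORKER_MEMORY).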