Posted to user@spark.apache.org by Michael Misiewicz <mm...@gmail.com> on 2015/07/22 21:38:42 UTC

spark.executor.memory and spark.driver.memory have no effect in yarn-cluster mode (1.4.x)?

Hi group,

I seem to have encountered a weird problem with 'spark-submit' and manually
setting SparkConf values in my applications.

It seems that setting the configuration values spark.executor.memory
and spark.driver.memory has no effect when they are set from within
my application (i.e. prior to creating the SparkContext).
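
For reference, the pattern I'm using looks roughly like this (simplified
sketch; the app name and job logic are placeholders, not my actual code):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MyApp {
  def main(args: Array[String]): Unit = {
    // Memory settings applied programmatically, before the SparkContext
    // is created. In yarn-cluster mode the first two appear to be ignored.
    val conf = new SparkConf()
      .setAppName("MyApp")
      .set("spark.executor.memory", "10g")
      .set("spark.driver.memory", "10g")
      .set("spark.yarn.executor.memoryOverhead", "1536") // this one IS respected

    val sc = new SparkContext(conf)
    // ... job logic ...
    sc.stop()
  }
}
```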

In yarn-cluster mode, only the driver and executor memory values specified
on the spark-submit command line are respected; if they are omitted, Spark
appears to fall back to its defaults. For example:

Correct behavior noted in Driver's logs on YARN when --executor-memory is
specified:

15/07/22 19:25:59 INFO yarn.YarnAllocator: Will request 200 executor
containers, each with 1 cores and 13824 MB memory including 1536 MB
overhead
15/07/22 19:25:59 INFO yarn.YarnAllocator: Container request (host:
Any, capability: <memory:13824, vCores:1>)


But not when spark.executor.memory is set prior to SparkContext
initialization:

15/07/22 19:22:22 INFO yarn.YarnAllocator: Will request 200 executor
containers, each with 1 cores and 2560 MB memory including 1536 MB
overhead
15/07/22 19:22:22 INFO yarn.YarnAllocator: Container request (host:
Any, capability: <memory:2560, vCores:1>)


In both cases, executor memory should be 10g. Interestingly, I also set
the parameter spark.yarn.executor.memoryOverhead, which appears to be
respected in both yarn-cluster and yarn-client mode.


Has anyone seen this before? Any idea what might be causing this behavior?

Re: spark.executor.memory and spark.driver.memory have no effect in yarn-cluster mode (1.4.x)?

Posted by Michael Misiewicz <mm...@gmail.com>.
That makes a lot of sense, thanks for the concise answer!


Re: spark.executor.memory and spark.driver.memory have no effect in yarn-cluster mode (1.4.x)?

Posted by Andrew Or <an...@databricks.com>.
Hi Michael,

In general, driver-related properties should not be set through the
SparkConf. By the time the SparkConf is created, the driver JVM has
already started, so it is too late to change its memory, class paths,
and other properties.

In cluster mode, executor-related properties should not be set through
the SparkConf either. The driver runs on the cluster just like the
executors, and the executors are launched independently by whatever the
cluster manager (e.g. YARN) is configured to do.

The recommended way of setting these properties is either through the
conf/spark-defaults.conf properties file, or through the spark-submit
command line, e.g.:

bin/spark-shell --master yarn --executor-memory 2g --driver-memory 5g
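
For the file-based route, the equivalent entries in
conf/spark-defaults.conf would look like the following (values
illustrative, matching the command line above):

```
spark.master           yarn
spark.driver.memory    5g
spark.executor.memory  2g
```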

Let me know if that answers your question,
-Andrew

