Posted to issues@spark.apache.org by "Anbang Hu (JIRA)" <ji...@apache.org> on 2018/09/12 00:59:00 UTC
[jira] [Updated] (SPARK-25410) Spark executor on YARN does not include memoryOverhead when starting an ExecutorRunnable
[ https://issues.apache.org/jira/browse/SPARK-25410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anbang Hu updated SPARK-25410:
------------------------------
Description:
When deploying on YARN, only {{executorMemory}} is used to launch executors in [YarnAllocator.scala#L529|https://github.com/apache/spark/blob/18688d370399dcf92f4228db6c7e3cb186804c18/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L529]:
{code}
try {
  new ExecutorRunnable(
    Some(container),
    conf,
    sparkConf,
    driverUrl,
    executorId,
    executorHostname,
    executorMemory,
    executorCores,
    appAttemptId.getApplicationId.toString,
    securityMgr,
    localResources
  ).run()
  updateInternalState()
} catch {
{code}
However, the resource capability requested for each executor is {{executorMemory + memoryOverhead}} in [YarnAllocator.scala#L142|https://github.com/apache/spark/blob/18688d370399dcf92f4228db6c7e3cb186804c18/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L142]:
{code:scala}
// Resource capability requested for each executors
private[yarn] val resource = Resource.newInstance(executorMemory + memoryOverhead, executorCores)
{code}
This means that the {{memoryOverhead}} portion of each container request is never made available to the executor at launch, so that memory is wasted.
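To make the gap concrete, here is a minimal sketch of the arithmetic (assuming the usual YARN overhead default of max(10% of executor memory, 384 MB); the 4 GB figure is just an example value, not from the code above):
{code:scala}
// Sketch only: constants mirror the assumed YARN defaults
// (overhead = max(0.10 * executorMemory, 384 MB)).
val executorMemoryMiB = 4096                                             // spark.executor.memory=4g
val memoryOverheadMiB = math.max((0.10 * executorMemoryMiB).toInt, 384)  // 409

// Container size requested from YARN (YarnAllocator.scala#L142):
val requestedMiB = executorMemoryMiB + memoryOverheadMiB                 // 4505

// Memory passed to ExecutorRunnable at launch (YarnAllocator.scala#L529):
val launchedMiB = executorMemoryMiB                                      // 4096

println(s"requested $requestedMiB MiB, launched with $launchedMiB MiB")
{code}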
Checking the k8s and Mesos backends, it looks like both include the overhead memory.
For k8s, in [ExecutorPodFactory.scala#L179|https://github.com/apache/spark/blob/18688d370399dcf92f4228db6c7e3cb186804c18/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala#L179]:
{code}
val executorContainer = new ContainerBuilder()
  .withName("executor")
  .withImage(executorContainerImage)
  .withImagePullPolicy(imagePullPolicy)
  .withNewResources()
    .addToRequests("memory", executorMemoryQuantity)
    .addToLimits("memory", executorMemoryLimitQuantity)
    .addToRequests("cpu", executorCpuQuantity)
    .endResources()
  .addAllToEnv(executorEnv.asJava)
  .withPorts(requiredPorts.asJava)
  .addToArgs("executor")
  .build()
{code}
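The memory quantities wired into the builder above are where the overhead shows up. Here is a hypothetical reconstruction of how they could be derived (the computation below is an assumption for illustration, not the literal ExecutorPodFactory code; only the two quantity names come from the snippet):
{code:scala}
import io.fabric8.kubernetes.api.model.Quantity

// Assumed example values; overhead rule mirrors the max(10%, 384 MB) default.
val executorMemoryMiB = 4096
val memoryOverheadMiB = math.max((0.10 * executorMemoryMiB).toInt, 384)

// Assumption: the request covers the heap, while the limit also covers the overhead.
val executorMemoryQuantity = new Quantity(s"${executorMemoryMiB}Mi")
val executorMemoryLimitQuantity = new Quantity(s"${executorMemoryMiB + memoryOverheadMiB}Mi")
{code}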
For Mesos, in [MesosSchedulerUtils.scala#L374|https://github.com/apache/spark/blob/18688d370399dcf92f4228db6c7e3cb186804c18/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala#L374]:
{code}
/**
 * Return the amount of memory to allocate to each executor, taking into account
 * container overheads.
 *
 * @param sc SparkContext to use to get `spark.mesos.executor.memoryOverhead` value
 * @return memory requirement as (0.1 * memoryOverhead) or MEMORY_OVERHEAD_MINIMUM
 *         (whichever is larger)
 */
def executorMemory(sc: SparkContext): Int = {
  sc.conf.getInt("spark.mesos.executor.memoryOverhead",
    math.max(MEMORY_OVERHEAD_FRACTION * sc.executorMemory, MEMORY_OVERHEAD_MINIMUM).toInt) +
    sc.executorMemory
}
{code}
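Plugging numbers into {{executorMemory}} above as a worked example (the values here are made up, and MEMORY_OVERHEAD_FRACTION and MEMORY_OVERHEAD_MINIMUM are assumed to be 0.1 and 384 MB):
{code:scala}
// With spark.executor.memory=4g and spark.mesos.executor.memoryOverhead unset:
val MEMORY_OVERHEAD_FRACTION = 0.10  // assumed constant value
val MEMORY_OVERHEAD_MINIMUM = 384    // assumed constant value, in MiB

val executorMemoryMiB = 4096
val overheadMiB =
  math.max(MEMORY_OVERHEAD_FRACTION * executorMemoryMiB, MEMORY_OVERHEAD_MINIMUM).toInt
val totalMiB = overheadMiB + executorMemoryMiB  // 409 + 4096 = 4505 MiB required per executor
{code}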
was:
When deploying on YARN, only `executorMemory` is used to launch executors: https://github.com/apache/spark/blob/18688d370399dcf92f4228db6c7e3cb186804c18/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L529
However, resource capability requested for each executor is `executorMemory + memoryOverhead`: https://github.com/apache/spark/blob/18688d370399dcf92f4228db6c7e3cb186804c18/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L142
This means that the amount of `memoryOverhead` will not be used in running the job and is therefore wasted.
Checking both k8s and Mesos, it looks like they both include overhead memory: https://github.com/apache/spark/blob/18688d370399dcf92f4228db6c7e3cb186804c18/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodFactory.scala#L179
https://github.com/apache/spark/blob/18688d370399dcf92f4228db6c7e3cb186804c18/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala#L374
> Spark executor on YARN does not include memoryOverhead when starting an ExecutorRunnable
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-25410
> URL: https://issues.apache.org/jira/browse/SPARK-25410
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 2.3.1
> Reporter: Anbang Hu
> Priority: Major
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)