You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/13 19:23:05 UTC

[GitHub] [spark] Kimahriman opened a new pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Kimahriman opened a new pull request #35504:
URL: https://github.com/apache/spark/pull/35504


   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   Add a new config to set the memory overhead factor for yarn applications. Currently the memory overhead is hard coded to 10%, and the only way to set it higher is to set it to a specific memory amount.
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   In dynamic environments where different people or use cases need different memory requirements, it would be helpful to set a higher memory overhead factor instead of having to set a higher specific memory overhead value. The kubernetes resource manager already makes this configurable.
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   No change to default behavior, just adds a new config users can change.
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   New UT to check the memory calculation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1038753913


   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1043163373


   > Can I ask what use cases this is targeting?
   
   There's no specific use case I'm trying to use more off heap memory that's not specifically off-heap spark features. I've just noticed more than 10% extra memory being used for our normal jobs (that don't use off heap features). I have no idea why or what's using this memory, but it is. I definitely have one job that has a memory leak on the driver side, and is currently sitting at 100g reserved memory with a 32g heap, and haven't figured out why, but we also tried turning on strict memory enforcement in yarn and constantly had executors killed for using too much memory. I haven't investigated these much either, but currently we're calculating a higher memoryOverhead based on the executor memory to give it a little more breathing room, but this makes it easier to just set a factor across the board versus manually tuning memoryOverhead for each job.
   
   I feel like I've heard G1GC uses more off-heap memory than other garbage collectors, don't know if that's true? But this is just a quality of life improvement if other people run into similar issues.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1045052706


   sounds good.  I haven't had time to go through code in detail. Lets make sure we document precedence with it and the memoryOverhead config and update all the .md docs (yarn and configuration), which on quick skim didn't see.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829225722



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Yep, I think so.
   https://github.com/apache/spark/blob/cd86df881c9570d32f6522ee163b884fda00a530/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientUtils.scala#L72




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829332470



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Please unblock Apache Spark 3.3 K8s module QA period by reverting this. We can land it back after having a healthy commit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829285680



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Hi, @tgravescs and @mridulm . Yes, we need to fix this, but before that I'm wondering if we really need this at Apache Spark 3.3. As we noticed here, it turns out this is not a straight forward contribution. If you don't mind, I'd like to recommend to keep fixing and stabilizing in `master` branch while reverting this from `branch-3.3` at least.
   

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       > I should also state that if people don't want in 3.3, I'm personally fine with it just need input from anyone interested in the feature.
   
   Thank you. Then, it's simpler because this PR was backported manually after feature freeze. :)
   We are currently discussing on the whitelisting about late arrivals like this PR, aren't we, @tgravescs ?
   We can discuss there together to get those input you need.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Please unblock K8s module QA period by reverting this. We can land it back after having a healthy commit.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Please unblock Apache Spark 3.3 K8s module QA period by reverting this. We can land it back after having a healthy commit.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Thank you for your decision, @tgravescs .




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r817734660



##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala
##########
@@ -213,10 +213,62 @@ class BasicDriverFeatureStepSuite extends SparkFunSuite {
       assert(mem === s"${expected}Mi")
 
       val systemProperties = step.getAdditionalPodSystemProperties()
-      assert(systemProperties(MEMORY_OVERHEAD_FACTOR.key) === expectedFactor.toString)
+      assert(systemProperties(DRIVER_MEMORY_OVERHEAD_FACTOR.key) === expectedFactor.toString)
     }
   }
 
+  test(s"SPARK-38194: memory overhead factor precendence") {
+    // Choose a driver memory where the default memory overhead is > MEMORY_OVERHEAD_MIN_MIB
+    val driverMem =
+      ResourceProfile.MEMORY_OVERHEAD_MIN_MIB / DRIVER_MEMORY_OVERHEAD_FACTOR.defaultValue.get * 2
+
+    // main app resource, overhead factor
+    val sparkConf = new SparkConf(false)
+      .set(CONTAINER_IMAGE, "spark-driver:latest")
+      .set(DRIVER_MEMORY.key, s"${driverMem.toInt}m")
+
+    // New config should take precedence
+    sparkConf.set(DRIVER_MEMORY_OVERHEAD_FACTOR, 0.2)
+    sparkConf.set(MEMORY_OVERHEAD_FACTOR, 0.3)
+    val expectedFactor = 0.2

Review comment:
       move this up and use expected Factor above on line 231 to set it also.

##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala
##########
@@ -213,10 +213,62 @@ class BasicDriverFeatureStepSuite extends SparkFunSuite {
       assert(mem === s"${expected}Mi")
 
       val systemProperties = step.getAdditionalPodSystemProperties()
-      assert(systemProperties(MEMORY_OVERHEAD_FACTOR.key) === expectedFactor.toString)
+      assert(systemProperties(DRIVER_MEMORY_OVERHEAD_FACTOR.key) === expectedFactor.toString)
     }
   }
 
+  test(s"SPARK-38194: memory overhead factor precendence") {
+    // Choose a driver memory where the default memory overhead is > MEMORY_OVERHEAD_MIN_MIB
+    val driverMem =
+      ResourceProfile.MEMORY_OVERHEAD_MIN_MIB / DRIVER_MEMORY_OVERHEAD_FACTOR.defaultValue.get * 2
+
+    // main app resource, overhead factor
+    val sparkConf = new SparkConf(false)
+      .set(CONTAINER_IMAGE, "spark-driver:latest")
+      .set(DRIVER_MEMORY.key, s"${driverMem.toInt}m")
+
+    // New config should take precedence
+    sparkConf.set(DRIVER_MEMORY_OVERHEAD_FACTOR, 0.2)
+    sparkConf.set(MEMORY_OVERHEAD_FACTOR, 0.3)
+    val expectedFactor = 0.2
+
+    val conf = KubernetesTestConf.createDriverConf(
+      sparkConf = sparkConf)
+    val step = new BasicDriverFeatureStep(conf)
+    val pod = step.configurePod(SparkPod.initialPod())
+    val mem = amountAndFormat(pod.container.getResources.getRequests.get("memory"))
+    val expected = (driverMem + driverMem * expectedFactor).toInt
+    assert(mem === s"${expected}Mi")
+
+    val systemProperties = step.getAdditionalPodSystemProperties()
+    assert(systemProperties(DRIVER_MEMORY_OVERHEAD_FACTOR.key) === "0.2")

Review comment:
       use expectedFactor

##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,18 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("Fraction of driver memory to be allocated as additional non-heap memory per driver " +
+        "process in cluster mode. This is memory that accounts for things like VM overheads, " +
+        "interned strings, other native overheads, etc. This tends to grow with the container " +
+        "size. This value is ignored if spark.driver.memoryOverhead is set directly.")

Review comment:
       Right so what I meant here is that since we deprecated  spark.kubernetes.memoryOverheadFactor we should remove it from the running-on-kubernetes documentation and then the description of 0.40 for non-JVM jobs needs to move here.  ie On Kubernetes the default is 0.10 for JVM-based jobs and 0.40 for non-JVM jobs.

##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStepSuite.scala
##########
@@ -441,6 +441,59 @@ class BasicExecutorFeatureStepSuite extends SparkFunSuite with BeforeAndAfter {
     ))
   }
 
+  test(s"SPARK-38194: memory overhead factor precendence") {
+    // Choose an executor memory where the default memory overhead is > MEMORY_OVERHEAD_MIN_MIB
+    val defaultFactor = EXECUTOR_MEMORY_OVERHEAD_FACTOR.defaultValue.get
+    val executorMem = ResourceProfile.MEMORY_OVERHEAD_MIN_MIB / defaultFactor * 2
+
+    // main app resource, overhead factor
+    val sparkConf = new SparkConf(false)
+      .set(CONTAINER_IMAGE, "spark-driver:latest")
+      .set(EXECUTOR_MEMORY.key, s"${executorMem.toInt}m")
+
+    // New config should take precedence
+    sparkConf.set(EXECUTOR_MEMORY_OVERHEAD_FACTOR, 0.2)
+    sparkConf.set(MEMORY_OVERHEAD_FACTOR, 0.3)
+    val expectedFactor = 0.2

Review comment:
       similar comment move this up and use it where you have 0.2 hardcoded




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1064155053


   @dongjoon-hyun double checking, are you ok with this?  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r805403660



##########
File path: .gitignore
##########
@@ -9,6 +9,7 @@
 *~
 .java-version
 .DS_Store
+.bloop

Review comment:
       I've had a bunch of local gitignore updates for a while so wanted to try to finally get them in, don't know if there's any process for adding to it, but there wasn't anything yet for the setup I use which is vscode with metals scala extension.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r809001068



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -490,7 +490,7 @@ private[spark] object Config extends Logging {
       .doubleConf
       .checkValue(mem_overhead => mem_overhead >= 0,
         "Ensure that memory overhead is non-negative")
-      .createWithDefault(0.1)
+      .createOptional

Review comment:
       Done

##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("The amount of non-heap memory to be allocated per driver in cluster mode, " +
+      "as a fraction of total driver memory. If memory overhead is specified directly, " +
+      "it takes precedence.")
+      .version("3.3.0")
+      .doubleConf
+      .checkValue(factor => factor >= 0 && factor < 1,
+        "Ensure that memory overhead is a double between 0 --> 1.0")

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r807293853



##########
File path: core/src/main/scala/org/apache/spark/SparkConf.scala
##########
@@ -636,7 +636,9 @@ private[spark] object SparkConf extends Logging {
       DeprecatedConfig("spark.blacklist.killBlacklistedExecutors", "3.1.0",
         "Please use spark.excludeOnFailure.killExcludedExecutors"),
       DeprecatedConfig("spark.yarn.blacklist.executor.launch.blacklisting.enabled", "3.1.0",
-        "Please use spark.yarn.executor.launch.excludeOnFailure.enabled")
+        "Please use spark.yarn.executor.launch.excludeOnFailure.enabled"),
+      DeprecatedConfig("spark.kubernetes.memoryOverheadFactor", "3.3.0",

Review comment:
       This seems wrong and inconsistent with this PR because this PR uses `spark.kubernetes.memoryOverheadFactor` to override the driver or executor specific configurations.
   ```
     private val memoryOverheadFactor = conf.get(MEMORY_OVERHEAD_FACTOR)
        .getOrElse(conf.get(DRIVER_MEMORY_OVERHEAD_FACTOR))
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r817276622



##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala
##########
@@ -188,21 +188,21 @@ class BasicDriverFeatureStepSuite extends SparkFunSuite {
   // Memory overhead tests. Tuples are:
   //   test name, main resource, overhead factor, expected factor
   Seq(
-    ("java", JavaMainAppResource(None), None, MEMORY_OVERHEAD_FACTOR.defaultValue.get),
+    ("java", JavaMainAppResource(None), None, DRIVER_MEMORY_OVERHEAD_FACTOR.defaultValue.get),

Review comment:
       I don't think so, i'll add one




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1069319504


   merged to branch3.3 as well


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829306393



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       in my mind this is a small bug and it is easily fixed. I disagree with not a straight forward contribution.  If you want to revert in branch 3.3 go ahead and we can just remerge when fixed
   
   @Kimahriman logic in ExecutorFeatureSTeps stays the same.  EXECUTOR_MEMORY_OVERHEAD_FACTOR is used if its specified, otherwise it falls back to whatever MEMORY_OVERHEAD_FACTOR is and in this case the only difference is DriverFeatureStep takes care of setting it to the default when its not explicitly specified by user.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829332470



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Please unblock K8s module QA period by reverting this. We can land it back after having a healthy commit.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829236177



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       this pr changed it to:
   ```
   -     MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   +      DRIVER_MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   ```
   
   And the ExecutorFeatureStep doesn't look at DRIVER_MEMORY_OVERHEAD.  Previous it was looking at the common MEMORY_OVERHEAD_FACTOR.key.
   
   We could change this back so that it would propagate the default setting or we could do the same logic looking to see if non jvm then use the 0.4.  Not sure which is easier will look some more.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829250618



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       I think it might be easiest just to propagate it as MEMORY_OVERHEAD_FACTOR. This way BasicExecutorFeatureStep doesn't need to recheck if nonjvm resource. So its either explicitly set in  EXECUTOR_MEMORY_OVERHEAD_FACTOR or it uses MEMORY_OVERHEAD_FACTOR which would either be explicitly set or default to 0.1 for jvm and 0.4 for nonjvm.  
   
   let me know if you see a problem with that.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829344947



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Okay I think I get it, those "system properties" end up as default spark configs on the executor. Clear as mud




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829225722



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Yep, I think so.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829266848



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Still a little confused though, where/how is `EXECUTOR_MEMORY_OVERHEAD_FACTOR` now used for k8s or is it just ignored in favor of whatever is passed from the driver?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829349391



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       https://github.com/apache/spark/pull/35900 revert pr




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r809000854



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,13 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+    .getOrElse(kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR))
 

Review comment:
       Went with `if (conf.contains...` to more easily handle the type conversions and default value of the backup setting




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1069283660


   Since no feedback from @dongjoon-hyun I'm going to go ahead and merge this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1060859922


   +1. Thanks @Kimahriman 
   
   @dongjoon-hyun  you have changed requested, can you take another look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r805403660



##########
File path: .gitignore
##########
@@ -9,6 +9,7 @@
 *~
 .java-version
 .DS_Store
+.bloop

Review comment:
       I've had a bunch of local gitignore updates for a while so wanted to try to finally get them in, don't know if there's any process for adding to it, but there wasn't anything yet for the setup I use which is vscode with metals scala extension.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1046265322


   Added the new configs to the docs page, let me know how they sound or if there's any suggestions on wording


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829209348



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       yeah thanks this is what I was also just looking at but I'm not sure how it was propagated to the executors. I was looking at through the KubernetesDriverconf somehow or possible through the pod system properties:
    MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString.
   
   If you find it let me know, still investigating.  

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       ok, I think I see how this is happening:
   
   ```
    val resolvedDriverSpec = builder.buildFromFeatures(conf, kubernetesClient)
       val configMapName = KubernetesClientUtils.configMapNameDriver
       val confFilesMap = KubernetesClientUtils.buildSparkConfDirFilesMap(configMapName,
         conf.sparkConf, resolvedDriverSpec.systemProperties)
   ```
   
   We build the driver spec, which includes the added system properties:
   
   `spec.systemProperties ++ addedSystemProperties`
   
   Added system properties in driver feature steps add the memory overhead setting there:
   ```
   
   val additionalProps = mutable.Map(
         KUBERNETES_DRIVER_POD_NAME.key -> driverPodName,
         "spark.app.id" -> conf.appId,
         KUBERNETES_DRIVER_SUBMIT_CHECK.key -> "true",
         MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   ```

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       ok, I think I see how this is happening:
   
   ```
    val resolvedDriverSpec = builder.buildFromFeatures(conf, kubernetesClient)
       val configMapName = KubernetesClientUtils.configMapNameDriver
       val confFilesMap = KubernetesClientUtils.buildSparkConfDirFilesMap(configMapName,
         conf.sparkConf, resolvedDriverSpec.systemProperties)
   ```
   
   We build the driver spec, which includes the added system properties:
   
   `spec.systemProperties ++ addedSystemProperties`
   
   Added system properties in driver feature steps add the memory overhead setting there:
   ```
   
   val additionalProps = mutable.Map(
         KUBERNETES_DRIVER_POD_NAME.key -> driverPodName,
         "spark.app.id" -> conf.appId,
         KUBERNETES_DRIVER_SUBMIT_CHECK.key -> "true",
         MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   ```
   
   Then the KubernetesClientUtils.buildSparkConfDirFilesMap is called which propagates it to the executors (I think)

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       this pr changed it to:
   -     MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   +      DRIVER_MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   
   And the ExecutorFeatureStep doesn't look at DRIVER_MEMORY_OVERHEAD.  Previous it was looking at the common MEMORY_OVERHEAD_FACTOR.key.
   
   We could change this back so that it would propagate the default setting or we could do the same logic looking to see if non jvm then use the 0.4.  Not sure which is easier will look some more.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       this pr changed it to:
   ```
   -     MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   +      DRIVER_MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   ```
   
   And the ExecutorFeatureStep doesn't look at DRIVER_MEMORY_OVERHEAD.  Previous it was looking at the common MEMORY_OVERHEAD_FACTOR.key.
   
   We could change this back so that it would propagate the default setting or we could do the same logic looking to see if non jvm then use the 0.4.  Not sure which is easier will look some more.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       I think it might be easiest just to propagate it as MEMORY_OVERHEAD_FACTOR. This way BasicExecutorFeatureStep doesn't need to recheck if nonjvm resource. So its either explicitly set in  EXECUTOR_MEMORY_OVERHEAD_FACTOR or it uses MEMORY_OVERHEAD_FACTOR which would either be explicitly set or default to 0.1 for jvm and 0.4 for nonjvm.  
   
   let me know if you see a problem with that.
   

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       I think it might be easiest just to propagate it as MEMORY_OVERHEAD_FACTOR. This way BasicExecutorFeatureStep doesn't need to recheck if nonjvm resource. So its either explicitly set in  EXECUTOR_MEMORY_OVERHEAD_FACTOR or it uses MEMORY_OVERHEAD_FACTOR which would either be explicitly set or default to 0.1 for jvm and 0.4 for nonjvm.  
   
   let me know if you see a problem with that.
   
   Need to update docs as well.
   

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       @Kimahriman do you have time to put up PR to fix?

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       in my mind this is a small bug and it is easily fixed. I disagree with not a straight forward contribution.  If you want to revert in branch 3.3 go ahead and we can just remerge when fixed
   
   @Kimahriman logic in ExecutorFeatureSTeps stays the same.  EXECUTOR_MEMORY_OVERHEAD_FACTOR is used if its specified, otherwise it falls back to whatever MEMORY_OVERHEAD_FACTOR is and in this case the only difference is DriverFeatureStep takes care of setting it to the default when its not explicitly specified by user.  

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       I should also state that if people don't want in 3.3, I'm personally fine with it just need input from anyone interested in the feature.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       @Kimahriman sorry, I guess there is one correction to what I say, we only want MEMORY_OVERHEAD_FACTOR to reflect what it would be if it wasn't set via DRIVER_MEMORY_OVERHEAD_FACTOR

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       I realized that might get ugly, I'll put up a pr shortly

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       https://github.com/apache/spark/pull/35900 revert pr




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1070969510


   yeah that was my understanding from reading the code as well, I'll look into it some more as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829250618



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       I think it might be easiest just to propagate it as MEMORY_OVERHEAD_FACTOR. This way BasicExecutorFeatureStep doesn't need to recheck if nonjvm resource. So its either explicitly set in  EXECUTOR_MEMORY_OVERHEAD_FACTOR or it uses MEMORY_OVERHEAD_FACTOR which would either be explicitly set or default to 0.1 for jvm and 0.4 for nonjvm.  
   
   let me know if you see a problem with that.
   
   Need to update docs as well.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829359559



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Thank you for your decision, @tgravescs .




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829202638



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       https://github.com/apache/spark/blob/cd86df881c9570d32f6522ee163b884fda00a530/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L62-L63
   
   I found it, it is propagated to executors from driver




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
mridulm commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1038630569


   This is a useful change to have, thanks for working on this @Kimahriman !
   +CC @zhouyejoe, @rmcyang who have worked on something similar IIRC


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1040241946


   Created new global configs and updated yarn, mesos, and kubernetes to use them. Still need to try to add tests for mesos and kubernetes to make sure they're taking effect.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r817270752



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("This sets the Memory Overhead Factor on the driver that will allocate memory to " +
+        "non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various " +
+        "systems processes, and tmpfs-based local directories.")

Review comment:
       Yeah I left that as is with the hard-coded 0.4, if someone wants to address that later they can




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r817276412



##########
File path: docs/configuration.md
##########
@@ -198,6 +198,16 @@ of the most common options to set are:
   </td>
   <td>2.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.driver.memoryOverheadFactor</code></td>
+  <td>0.10</td>
+  <td>
+    Fraction of driver memory to be allocated as additional non-heap memory per driver process

Review comment:
       Combined them and used the snippet from the `memoryOverhead` instead on the reason so they match. Let me know how they sound

##########
File path: docs/configuration.md
##########
@@ -287,6 +297,15 @@ of the most common options to set are:
   </td>
   <td>2.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.executor.memoryOverheadFactor</code></td>
+  <td>0.10</td>
+  <td>
+    Fraction of executor memory to be allocated as additional non-heap memory per executor process.

Review comment:
       Same as driver above




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r817771869



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,18 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("Fraction of driver memory to be allocated as additional non-heap memory per driver " +
+        "process in cluster mode. This is memory that accounts for things like VM overheads, " +
+        "interned strings, other native overheads, etc. This tends to grow with the container " +
+        "size. This value is ignored if spark.driver.memoryOverhead is set directly.")

Review comment:
       Gotcha I'll add it to the doc. Don't use kubernetes at all so not even sure what a Spark non-JVM job is




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829214853



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       ~might related this: https://github.com/apache/spark/commit/3404a73f4cf7be37e574026d08ad5cf82cfac871 https://github.com/apache/spark/commit/6f27027d96ada29d8bb1d626f2cc7c856df3d597 , should be set some localProperties in sc.~




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829228069



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       so is it that the system property added now is `spark.driver.memoryOverheadFactor` so the executor ignores it?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1070969510


   yeah that was my understanding from reading the code as well, I'll look into it some more as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829285680



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Hi, @tgravescs and @mridulm . Yes, we need to fix this, but before that I'm wondering if we really need this at Apache Spark 3.3. As we noticed here, it turns out this is not a straight forward contribution. If you don't mind, I'd like to recommend to keep fixing and stabilizing in `master` branch while reverting this from `branch-3.3` at least.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun edited a comment on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun edited a comment on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1069843470


   
   ![image](https://user-images.githubusercontent.com/1736354/158721631-a970220c-0667-473b-a655-7526aa6e1e8e.png)
   
   FYI, some K8S IT test failed since this commit: https://github.com/Yikun/spark/runs/5573264340?check_suite_focus=true
   
   cc @dongjoon-hyun @tgravescs 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r817276733



##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
##########
@@ -706,4 +706,18 @@ class YarnAllocatorSuite extends SparkFunSuite with Matchers with BeforeAndAfter
       sparkConf.set(MEMORY_OFFHEAP_SIZE, originalOffHeapSize)
     }
   }
+
+  test("SPARK-38194: Configurable memory overhead factor") {
+    val executorMemory = sparkConf.get(EXECUTOR_MEMORY).toInt
+    try {
+      sparkConf.set(EXECUTOR_MEMORY_OVERHEAD_FACTOR, 0.5)

Review comment:
       I don't think so, I'll add one




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r817743101



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("This sets the Memory Overhead Factor on the driver that will allocate memory to " +
+        "non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various " +
+        "systems processes, and tmpfs-based local directories.")

Review comment:
       see my comment below. I don't mean to address it, I mean to document it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1070799705


   It looks like the non-jvm memory overhead default isn't being applied to the executors, but I don't see how or where it would applied before this, as the value is only used in `BasicDriverFeatureStep`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun edited a comment on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun edited a comment on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1069843470


   
   ![image](https://user-images.githubusercontent.com/1736354/158721631-a970220c-0667-473b-a655-7526aa6e1e8e.png)
   
   FYI, some K8S IT test failed since this commit: https://github.com/Yikun/spark/runs/5573264340?check_suite_focus=true
   
   cc @dongjoon-hyun @tgravescs 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829214853



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       might related this: https://github.com/apache/spark/commit/6f27027d96ada29d8bb1d626f2cc7c856df3d597 , should be set some localProperties in sc.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r807317733



##########
File path: core/src/main/scala/org/apache/spark/SparkConf.scala
##########
@@ -636,7 +636,9 @@ private[spark] object SparkConf extends Logging {
       DeprecatedConfig("spark.blacklist.killBlacklistedExecutors", "3.1.0",
         "Please use spark.excludeOnFailure.killExcludedExecutors"),
       DeprecatedConfig("spark.yarn.blacklist.executor.launch.blacklisting.enabled", "3.1.0",
-        "Please use spark.yarn.executor.launch.excludeOnFailure.enabled")
+        "Please use spark.yarn.executor.launch.excludeOnFailure.enabled"),
+      DeprecatedConfig("spark.kubernetes.memoryOverheadFactor", "3.3.0",

Review comment:
       Yeah the idea was if you still want to use that we will, otherwise we'll look for the new thing, I thought that was how it was normally done, but I can change it if I got that wrong. Not sure how to handle it since the new config has a default value. "Use the new config if it's specifically passed, otherwise try the old config, and if that's not there use the default value for the new config". 
   
   Wasn't sure exactly how to handle the fact that k8s already had this as its own setting but the other didn't.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1040241946


   Created new global configs and updated yarn, mesos, and kubernetes to use them. Still need to try to add tests for mesos and kubernetes to make sure they're taking effect.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
mridulm commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r808318065



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("The amount of non-heap memory to be allocated per driver in cluster mode, " +
+      "as a fraction of total driver memory. If memory overhead is specified directly, " +
+      "it takes precedence.")

Review comment:
       We can use the current description for `spark.kubernetes.memoryOverheadFactor` in `docs/running-on-kubernetes.md` as template (instead of non-heap memory).

##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("The amount of non-heap memory to be allocated per driver in cluster mode, " +
+      "as a fraction of total driver memory. If memory overhead is specified directly, " +
+      "it takes precedence.")
+      .version("3.3.0")
+      .doubleConf
+      .checkValue(factor => factor >= 0 && factor < 1,
+        "Ensure that memory overhead is a double between 0 --> 1.0")

Review comment:
       `factor > 0` would should be sufficient, we can have cases where overhead is higher than jvm Xmx.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -490,7 +490,7 @@ private[spark] object Config extends Logging {
       .doubleConf
       .checkValue(mem_overhead => mem_overhead >= 0,
         "Ensure that memory overhead is non-negative")
-      .createWithDefault(0.1)
+      .createOptional

Review comment:
       Keep default (see below).

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,13 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+    .getOrElse(kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR))
 

Review comment:
       Same as in `BasicDriverFeatureStep` - change order of config query.

##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -85,6 +84,8 @@ private[spark] class Client(
   private var appMaster: ApplicationMaster = _
   private var stagingDirPath: Path = _
 
+  private val amMemoryOverheadFactor = sparkConf.get(DRIVER_MEMORY_OVERHEAD_FACTOR)

Review comment:
       Do this only when in cluster mode - in client mode, AM is not the driver.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
##########
@@ -53,18 +53,20 @@ private[spark] class BasicDriverFeatureStep(conf: KubernetesDriverConf)
 
   // Memory settings
   private val driverMemoryMiB = conf.get(DRIVER_MEMORY)
+  private val memoryOverheadFactor = conf.get(MEMORY_OVERHEAD_FACTOR)
+    .getOrElse(conf.get(DRIVER_MEMORY_OVERHEAD_FACTOR))

Review comment:
       Use `DRIVER_MEMORY_OVERHEAD_FACTOR` and fallback to `MEMORY_OVERHEAD_FACTOR` when missing.
   We probably need to use `get(key, defaultValue)` method here though.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
mridulm commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r810269455



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -85,6 +84,8 @@ private[spark] class Client(
   private var appMaster: ApplicationMaster = _
   private var stagingDirPath: Path = _
 
+  private val amMemoryOverheadFactor = sparkConf.get(DRIVER_MEMORY_OVERHEAD_FACTOR)

Review comment:
       We can add it for AM later if required - in client mode, AM is not really doing much.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on a change in pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
mridulm commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r805496558



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/config.scala
##########
@@ -325,6 +325,17 @@ package object config extends Logging {
     .bytesConf(ByteUnit.MiB)
     .createWithDefaultString("512m")
 
+  /* Shared driver and executor configuration. */
+
+  private[spark] val MEMORY_OVERHEAD_FACTOR =

Review comment:
       Memory overhead is not specific to yarn, but applies to all resource managers.
   Let us add this to core's `org.apache.spark.internal.package.scala`
   Also, split it by driver and executor as well, to independently control them.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r810642574



##########
File path: core/src/main/scala/org/apache/spark/SparkConf.scala
##########
@@ -636,7 +636,9 @@ private[spark] object SparkConf extends Logging {
       DeprecatedConfig("spark.blacklist.killBlacklistedExecutors", "3.1.0",
         "Please use spark.excludeOnFailure.killExcludedExecutors"),
       DeprecatedConfig("spark.yarn.blacklist.executor.launch.blacklisting.enabled", "3.1.0",
-        "Please use spark.yarn.executor.launch.excludeOnFailure.enabled")
+        "Please use spark.yarn.executor.launch.excludeOnFailure.enabled"),
+      DeprecatedConfig("spark.kubernetes.memoryOverheadFactor", "3.3.0",

Review comment:
       Checking for new config now before the old config




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r810494736



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("This sets the Memory Overhead Factor on the driver that will allocate memory to " +
+        "non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various " +
+        "systems processes, and tmpfs-based local directories.")
+      .version("3.3.0")
+      .doubleConf
+      .checkValue(factor => factor > 0,

Review comment:
       https://github.com/apache/spark/pull/35504#discussion_r808320307
   
   I think that's a fraction of existing memory, where this is an amount of additional memory to add, so there's no reason to set an upper bound on this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhouyejoe commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
zhouyejoe commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r810426309



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("This sets the Memory Overhead Factor on the driver that will allocate memory to " +
+        "non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various " +
+        "systems processes, and tmpfs-based local directories.")
+      .version("3.3.0")
+      .doubleConf
+      .checkValue(factor => factor > 0,

Review comment:
       Should we also check whether the factor is configured to be larger than 1.0? Same issue applies below for the spark.executor.memoryOverheadFactor? 
   We do check this for spark.memory.fraction and spark.memory.storageFraction.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829228069



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       so is it that the system property added now is `spark.driver.memoryOverheadFactor` so the executor ignores it?

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Should/can the Executor feature step add it's own memory overhead or do we need to also calculate the executor overhead in the driver feature step and added it as a system prop?

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Still a little confused though, where/how is `EXECUTOR_MEMORY_OVERHEAD_FACTOR` now used for k8s or is it just ignored in favor of whatever is passed from the driver?

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Okay I think I get it, those "system properties" end up as default spark configs on the executor. Clear as mud




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1071393344


   This is reverted from `branch-3.3` via https://github.com/apache/spark/pull/35900 .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829209348



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       yeah thanks this is what I was also just looking at but I'm not sure how it was propagated to the executors. I was looking at through the KubernetesDriverconf somehow or possible through the pod system properties:
    MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString.
   
   If you find it let me know, still investigating.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829251468



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       @Kimahriman do you have time to put up PR to fix?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r816843877



##########
File path: docs/configuration.md
##########
@@ -287,6 +297,15 @@ of the most common options to set are:
   </td>
   <td>2.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.executor.memoryOverheadFactor</code></td>
+  <td>0.10</td>
+  <td>
+    Fraction of executor memory to be allocated as additional non-heap memory per executor process.

Review comment:
       same thing here, have descriptions match config and combine

##########
File path: docs/configuration.md
##########
@@ -198,6 +198,16 @@ of the most common options to set are:
   </td>
   <td>2.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.driver.memoryOverheadFactor</code></td>
+  <td>0.10</td>
+  <td>
+    Fraction of driver memory to be allocated as additional non-heap memory per driver process

Review comment:
       this text doesn't match the config text, I would expect them to be the same.  I think combining them would be good, the description of the config is more clear but this one also add the "in cluster mode" and the precendence line. 

##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("This sets the Memory Overhead Factor on the driver that will allocate memory to " +
+        "non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various " +
+        "systems processes, and tmpfs-based local directories.")

Review comment:
       so this doesn't cover the weirdness on k8s of the non-jvm jobs defaulting to 0.4. I also don't see that the Kubernetes spark.kubernetes.memoryOverheadFactor.

##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala
##########
@@ -188,21 +188,21 @@ class BasicDriverFeatureStepSuite extends SparkFunSuite {
   // Memory overhead tests. Tuples are:
   //   test name, main resource, overhead factor, expected factor
   Seq(
-    ("java", JavaMainAppResource(None), None, MEMORY_OVERHEAD_FACTOR.defaultValue.get),
+    ("java", JavaMainAppResource(None), None, DRIVER_MEMORY_OVERHEAD_FACTOR.defaultValue.get),

Review comment:
       is there a test to make sure the fallback to the original config works?

##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
##########
@@ -706,4 +706,18 @@ class YarnAllocatorSuite extends SparkFunSuite with Matchers with BeforeAndAfter
       sparkConf.set(MEMORY_OFFHEAP_SIZE, originalOffHeapSize)
     }
   }
+
+  test("SPARK-38194: Configurable memory overhead factor") {
+    val executorMemory = sparkConf.get(EXECUTOR_MEMORY).toInt
+    try {
+      sparkConf.set(EXECUTOR_MEMORY_OVERHEAD_FACTOR, 0.5)

Review comment:
       do we test precedence of spark.driver/executor.memoryOverhead working over the overhead factor?  If not can we add a test.
   
   
   
   <br class="Apple-interchange-newline" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;">

##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
##########
@@ -280,9 +280,12 @@ private[yarn] class YarnAllocator(
       // track the resource profile if not already there
       getOrUpdateRunningExecutorForRPId(rp.id)
       logInfo(s"Resource profile ${rp.id} doesn't exist, adding it")
+
+      val memoryOverheadFactor = sparkConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)

Review comment:
       we are getting this on every call, just make a class variable and get it once




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829192039



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       The reason should be in here, before kubernetesConf.get(MEMORY_OVERHEAD_FACTOR) was used as default factor, it's `0.4` according my real debug watch on IT `Run PySpark on simple pi.py example`.
   
   But current EXECUTOR_MEMORY_OVERHEAD_FACTOR has more priority than so MEMORY_OVERHEAD_FACTOR is be overrited. (so 0.1 by default). So that the default behavior changed.
   
   But I haven't found why `kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)` is `0.4` yet, I couldn't find a code in IT to set this explictly.
   
   cc @Kimahriman @tgravescs 

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       https://github.com/apache/spark/blob/cd86df881c9570d32f6522ee163b884fda00a530/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L62-L63
   
   I found it, it is propagated to executors from driver, how?

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       https://github.com/apache/spark/blob/cd86df881c9570d32f6522ee163b884fda00a530/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L62-L63
   
   I found it, it is propagated to executors from driver

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       might related this: https://github.com/apache/spark/commit/6f27027d96ada29d8bb1d626f2cc7c856df3d597 , should be set some localProperties in sc.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       might related this: https://github.com/apache/spark/commit/3404a73f4cf7be37e574026d08ad5cf82cfac871 https://github.com/apache/spark/commit/6f27027d96ada29d8bb1d626f2cc7c856df3d597 , should be set some localProperties in sc.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Yep, I think so.

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       ~might related this: https://github.com/apache/spark/commit/3404a73f4cf7be37e574026d08ad5cf82cfac871 https://github.com/apache/spark/commit/6f27027d96ada29d8bb1d626f2cc7c856df3d597 , should be set some localProperties in sc.~

##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Yep, I think so.
   https://github.com/apache/spark/blob/cd86df881c9570d32f6522ee163b884fda00a530/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientUtils.scala#L72




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829236177



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       this pr changed it to:
   -     MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   +      DRIVER_MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   
   And the ExecutorFeatureStep doesn't look at DRIVER_MEMORY_OVERHEAD.  Previous it was looking at the common MEMORY_OVERHEAD_FACTOR.key.
   
   We could change this back so that it would propagate the default setting or we could do the same logic looking to see if non jvm then use the 0.4.  Not sure which is easier will look some more.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829309966



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       @Kimahriman sorry, I guess there is one correction to what I say, we only want MEMORY_OVERHEAD_FACTOR to reflect what it would be if it wasn't set via DRIVER_MEMORY_OVERHEAD_FACTOR




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1039064063


   > Memory overhead is not specific to yarn, but applies to all resource managers. Also, split it by driver and executor as well, to independently control them.
   > 
   > See all usages of `EXECUTOR_MEMORY_OVERHEAD` and `DRIVER_MEMORY_OVERHEAD`
   
   So are you saying to just use this opportunity to unify all the difference resource managers with this config? Currently they all have their slight own ways of dealing with it but I can try to


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r808994036



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("The amount of non-heap memory to be allocated per driver in cluster mode, " +
+      "as a fraction of total driver memory. If memory overhead is specified directly, " +
+      "it takes precedence.")
+      .version("3.3.0")
+      .doubleConf
+      .checkValue(factor => factor >= 0 && factor < 1,
+        "Ensure that memory overhead is a double between 0 --> 1.0")

Review comment:
       yeah wasn't sure why there was an upper limit, just copied from k8s




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1042905083


   Should I add these to the main docs page as well?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
mridulm commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1045024911


   @tgravescs This is actually a usecase for us - in some of our internal analysis, the 10% overhead was found to be insufficient for some classes of applications - instead of tuning this individually, and have users forget to keep this in sync, bumping up the percentage is an option being looked into.
   @Kimahriman happened to create a PR for this independently :-)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhouyejoe commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
zhouyejoe commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r810515573



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("This sets the Memory Overhead Factor on the driver that will allocate memory to " +
+        "non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various " +
+        "systems processes, and tmpfs-based local directories.")
+      .version("3.3.0")
+      .doubleConf
+      .checkValue(factor => factor > 0,

Review comment:
       Oh, right. Yes, there shouldn't be  an upper bound




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r818200277



##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala
##########
@@ -213,10 +213,62 @@ class BasicDriverFeatureStepSuite extends SparkFunSuite {
       assert(mem === s"${expected}Mi")
 
       val systemProperties = step.getAdditionalPodSystemProperties()
-      assert(systemProperties(MEMORY_OVERHEAD_FACTOR.key) === expectedFactor.toString)
+      assert(systemProperties(DRIVER_MEMORY_OVERHEAD_FACTOR.key) === expectedFactor.toString)
     }
   }
 
+  test(s"SPARK-38194: memory overhead factor precendence") {
+    // Choose a driver memory where the default memory overhead is > MEMORY_OVERHEAD_MIN_MIB
+    val driverMem =
+      ResourceProfile.MEMORY_OVERHEAD_MIN_MIB / DRIVER_MEMORY_OVERHEAD_FACTOR.defaultValue.get * 2
+
+    // main app resource, overhead factor
+    val sparkConf = new SparkConf(false)
+      .set(CONTAINER_IMAGE, "spark-driver:latest")
+      .set(DRIVER_MEMORY.key, s"${driverMem.toInt}m")
+
+    // New config should take precedence
+    sparkConf.set(DRIVER_MEMORY_OVERHEAD_FACTOR, 0.2)
+    sparkConf.set(MEMORY_OVERHEAD_FACTOR, 0.3)
+    val expectedFactor = 0.2

Review comment:
       deduped

##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala
##########
@@ -213,10 +213,62 @@ class BasicDriverFeatureStepSuite extends SparkFunSuite {
       assert(mem === s"${expected}Mi")
 
       val systemProperties = step.getAdditionalPodSystemProperties()
-      assert(systemProperties(MEMORY_OVERHEAD_FACTOR.key) === expectedFactor.toString)
+      assert(systemProperties(DRIVER_MEMORY_OVERHEAD_FACTOR.key) === expectedFactor.toString)
     }
   }
 
+  test(s"SPARK-38194: memory overhead factor precendence") {
+    // Choose a driver memory where the default memory overhead is > MEMORY_OVERHEAD_MIN_MIB
+    val driverMem =
+      ResourceProfile.MEMORY_OVERHEAD_MIN_MIB / DRIVER_MEMORY_OVERHEAD_FACTOR.defaultValue.get * 2
+
+    // main app resource, overhead factor
+    val sparkConf = new SparkConf(false)
+      .set(CONTAINER_IMAGE, "spark-driver:latest")
+      .set(DRIVER_MEMORY.key, s"${driverMem.toInt}m")
+
+    // New config should take precedence
+    sparkConf.set(DRIVER_MEMORY_OVERHEAD_FACTOR, 0.2)
+    sparkConf.set(MEMORY_OVERHEAD_FACTOR, 0.3)
+    val expectedFactor = 0.2
+
+    val conf = KubernetesTestConf.createDriverConf(
+      sparkConf = sparkConf)
+    val step = new BasicDriverFeatureStep(conf)
+    val pod = step.configurePod(SparkPod.initialPod())
+    val mem = amountAndFormat(pod.container.getResources.getRequests.get("memory"))
+    val expected = (driverMem + driverMem * expectedFactor).toInt
+    assert(mem === s"${expected}Mi")
+
+    val systemProperties = step.getAdditionalPodSystemProperties()
+    assert(systemProperties(DRIVER_MEMORY_OVERHEAD_FACTOR.key) === "0.2")

Review comment:
       deduped

##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStepSuite.scala
##########
@@ -441,6 +441,59 @@ class BasicExecutorFeatureStepSuite extends SparkFunSuite with BeforeAndAfter {
     ))
   }
 
+  test(s"SPARK-38194: memory overhead factor precendence") {
+    // Choose an executor memory where the default memory overhead is > MEMORY_OVERHEAD_MIN_MIB
+    val defaultFactor = EXECUTOR_MEMORY_OVERHEAD_FACTOR.defaultValue.get
+    val executorMem = ResourceProfile.MEMORY_OVERHEAD_MIN_MIB / defaultFactor * 2
+
+    // main app resource, overhead factor
+    val sparkConf = new SparkConf(false)
+      .set(CONTAINER_IMAGE, "spark-driver:latest")
+      .set(EXECUTOR_MEMORY.key, s"${executorMem.toInt}m")
+
+    // New config should take precedence
+    sparkConf.set(EXECUTOR_MEMORY_OVERHEAD_FACTOR, 0.2)
+    sparkConf.set(MEMORY_OVERHEAD_FACTOR, 0.3)
+    val expectedFactor = 0.2

Review comment:
       deduped




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r818200507



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,18 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("Fraction of driver memory to be allocated as additional non-heap memory per driver " +
+        "process in cluster mode. This is memory that accounts for things like VM overheads, " +
+        "interned strings, other native overheads, etc. This tends to grow with the container " +
+        "size. This value is ignored if spark.driver.memoryOverhead is set directly.")

Review comment:
       Updated the doc to describe the kubernetes non-jvm default. Also removed the kubernetes config from the docs




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829329978



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       > I should also state that if people don't want in 3.3, I'm personally fine with it just need input from anyone interested in the feature.
   
   Thank you. Then, it's simpler because this PR was backported manually after feature freeze. :)
   We are currently discussing on the whitelisting about late arrivals like this PR, aren't we, @tgravescs ?
   We can discuss there together to get those input you need.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1069290518


   Merged to master, thanks @Kimahriman 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829216579



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       ok, I think I see how this is happening:
   
   ```
    val resolvedDriverSpec = builder.buildFromFeatures(conf, kubernetesClient)
       val configMapName = KubernetesClientUtils.configMapNameDriver
       val confFilesMap = KubernetesClientUtils.buildSparkConfDirFilesMap(configMapName,
         conf.sparkConf, resolvedDriverSpec.systemProperties)
   ```
   
   We build the driver spec, which includes the added system properties:
   
   `spec.systemProperties ++ addedSystemProperties`
   
   Added system properties in driver feature steps add the memory overhead setting there:
   ```
   
   val additionalProps = mutable.Map(
         KUBERNETES_DRIVER_POD_NAME.key -> driverPodName,
         "spark.app.id" -> conf.appId,
         KUBERNETES_DRIVER_SUBMIT_CHECK.key -> "true",
         MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829214853



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       might related this: https://github.com/apache/spark/commit/3404a73f4cf7be37e574026d08ad5cf82cfac871 https://github.com/apache/spark/commit/6f27027d96ada29d8bb1d626f2cc7c856df3d597 , should be set some localProperties in sc.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829308083



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       I should also state that if people don't want in 3.3, I'm personally fine with it just need input from anyone interested in the feature.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on a change in pull request #35504: [SPARK-38194][YARN] Make yarn memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
mridulm commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r805497282



##########
File path: .gitignore
##########
@@ -59,6 +62,7 @@ lint-r-report.log
 lint-js-report.log
 log/
 logs/
+metals.sbt
 out/

Review comment:
       Please move unrelated changes to a different PR




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r809001375



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("The amount of non-heap memory to be allocated per driver in cluster mode, " +
+      "as a fraction of total driver memory. If memory overhead is specified directly, " +
+      "it takes precedence.")

Review comment:
       Took part of it, let me know how it sounds




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1043053330


   so overall I'm fine with the concept and think should be consistent across spark.  But going back to how/when this config was added, I believe it was specifically decided at that time to not have a configuration by percent on YARN.  If I recall that was based on the use cases at the time so there very well be more/different use cases or we have just more experience.  I think some of that is many of the off heap type configs would be specific size (ie offheap=5g).  Can I ask what use cases this is targeting?
   
   If this goes in docs need to be clear on precedence of the configs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1041478206


   @mridulm is this what you were thinking?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r808994998



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -85,6 +84,8 @@ private[spark] class Client(
   private var appMaster: ApplicationMaster = _
   private var stagingDirPath: Path = _
 
+  private val amMemoryOverheadFactor = sparkConf.get(DRIVER_MEMORY_OVERHEAD_FACTOR)

Review comment:
       So just keep the hard coded 0.1 for client mode AM overhead factor?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r808995342



##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##########
@@ -85,6 +84,8 @@ private[spark] class Client(
   private var appMaster: ApplicationMaster = _
   private var stagingDirPath: Path = _
 
+  private val amMemoryOverheadFactor = sparkConf.get(DRIVER_MEMORY_OVERHEAD_FACTOR)

Review comment:
       Or just throw in a new config for that too while I'm at it




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1069298853


   oops, I see they cut the branch-3.3, I will see about merging this to that branch


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] asfgit closed pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #35504:
URL: https://github.com/apache/spark/pull/35504


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1069843470


   
   
   ![image](https://user-images.githubusercontent.com/1736354/158721631-a970220c-0667-473b-a655-7526aa6e1e8e.png)
   
   FYI, some K8S IT test failed since this commit: https://github.com/Yikun/spark/runs/5573264340?check_suite_focus=true
   
   cc @dongjoon-hyun @holdenk @tgravescs 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829192039



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       The reason should be in here, before kubernetesConf.get(MEMORY_OVERHEAD_FACTOR) was used as default factor, it's `0.4` according my real debug watch on IT `Run PySpark on simple pi.py example`.
   
   But current EXECUTOR_MEMORY_OVERHEAD_FACTOR has more priority than so MEMORY_OVERHEAD_FACTOR is be overrited. (so 0.1 by default). So that the default behavior changed.
   
   But I haven't found why `kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)` is `0.4` yet, I couldn't find a code in IT to set this explictly.
   
   cc @Kimahriman @tgravescs 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Yikun commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829202638



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       https://github.com/apache/spark/blob/cd86df881c9570d32f6522ee163b884fda00a530/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala#L62-L63
   
   I found it, it is propagated to executors from driver, how?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829216579



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       ok, I think I see how this is happening:
   
   ```
    val resolvedDriverSpec = builder.buildFromFeatures(conf, kubernetesClient)
       val configMapName = KubernetesClientUtils.configMapNameDriver
       val confFilesMap = KubernetesClientUtils.buildSparkConfDirFilesMap(configMapName,
         conf.sparkConf, resolvedDriverSpec.systemProperties)
   ```
   
   We build the driver spec, which includes the added system properties:
   
   `spec.systemProperties ++ addedSystemProperties`
   
   Added system properties in driver feature steps add the memory overhead setting there:
   ```
   
   val additionalProps = mutable.Map(
         KUBERNETES_DRIVER_POD_NAME.key -> driverPodName,
         "spark.app.id" -> conf.appId,
         KUBERNETES_DRIVER_SUBMIT_CHECK.key -> "true",
         MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
   ```
   
   Then the KubernetesClientUtils.buildSparkConfDirFilesMap is called which propagates it to the executors (I think)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Kimahriman commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
Kimahriman commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829230221



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       Should/can the Executor feature step add it's own memory overhead or do we need to also calculate the executor overhead in the driver feature step and added it as a system prop?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tgravescs commented on a change in pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829323361



##########
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
   private val isDefaultProfile = resourceProfile.id == ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
   private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) == Some(APP_RESOURCE_TYPE_PYTHON)
   private val disableConfigMap = kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+  private val memoryOverheadFactor = if (kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+    kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+  } else {
+    kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+  }

Review comment:
       I realized that might get ugly, I'll put up a pr shortly




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #35504: [SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #35504:
URL: https://github.com/apache/spark/pull/35504#issuecomment-1071393344


   This is reverted from `branch-3.3` via https://github.com/apache/spark/pull/35900 .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org