Posted to commits@spark.apache.org by do...@apache.org on 2022/06/02 03:10:37 UTC

[spark] branch branch-3.3 updated: [SPARK-39360][K8S] Remove deprecation of `spark.kubernetes.memoryOverheadFactor` and recover doc

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 22686656636 [SPARK-39360][K8S] Remove deprecation of `spark.kubernetes.memoryOverheadFactor` and recover doc
22686656636 is described below

commit 2268665663684dd381adf266feb74ac97a53900d
Author: Dongjoon Hyun <do...@apache.org>
AuthorDate: Wed Jun 1 20:10:01 2022 -0700

    [SPARK-39360][K8S] Remove deprecation of `spark.kubernetes.memoryOverheadFactor` and recover doc
    
    ### What changes were proposed in this pull request?
    
    This PR aims to remove the deprecation of `spark.kubernetes.memoryOverheadFactor` in Apache Spark 3.3. In addition, it recovers the documentation that was mistakenly removed as part of the deprecation. Deprecation is not removal.
    
    ### Why are the changes needed?
    
    - Apache Spark 3.3.0 RC always complains about `spark.kubernetes.memoryOverheadFactor` because the configuration has a default value (one not given by the users). There is no way to suppress the warnings, so the directional message is not helpful and only confuses the users. In other words, we still get warnings even when we use only the new configurations, or no configuration at all (a sketch of the lookup that fires this warning follows the log output below).
    ```
    22/06/01 23:53:49 WARN SparkConf: The configuration key 'spark.kubernetes.memoryOverheadFactor' has been deprecated as of Spark 3.3.0 and may be removed in the future. Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor
    22/06/01 23:53:49 WARN SparkConf: The configuration key 'spark.kubernetes.memoryOverheadFactor' has been deprecated as of Spark 3.3.0 and may be removed in the future. Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor
    22/06/01 23:53:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    22/06/01 23:53:50 WARN SparkConf: The configuration key 'spark.kubernetes.memoryOverheadFactor' has been deprecated as of Spark 3.3.0 and may be removed in the future. Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor
    ```
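
    A minimal sketch of the lookup that emits this warning (simplified from `SparkConf`'s `DeprecatedConfig` handling; the logging plumbing here is illustrative, not the exact internals):
    ```
    // Sketch: a registry of deprecated keys, consulted on every access,
    // prints the warning above whenever the legacy key is present.
    case class DeprecatedConfig(key: String, version: String, deprecationMessage: String)

    val deprecatedConfigs: Map[String, DeprecatedConfig] = Map(
      "spark.kubernetes.memoryOverheadFactor" ->
        DeprecatedConfig("spark.kubernetes.memoryOverheadFactor", "3.3.0",
          "Please use spark.driver.memoryOverheadFactor and " +
            "spark.executor.memoryOverheadFactor"))

    def logDeprecationWarning(key: String): Unit = {
      deprecatedConfigs.get(key).foreach { cfg =>
        println(s"WARN SparkConf: The configuration key '${cfg.key}' has been " +
          s"deprecated as of Spark ${cfg.version} and may be removed in the " +
          s"future. ${cfg.deprecationMessage}")
      }
    }
    ```
    Because the key has a default value, the lookup fires even when the user never sets it, so removing the `DeprecatedConfig` entry (as the diff below does) is the only way to silence the warning.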
    
    - The minimum constraint is slightly different: `spark.kubernetes.memoryOverheadFactor` has allowed `0` since Apache Spark 2.4, while the new configurations disallow `0`.
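
    A sketch of the two constraints in Spark's internal `ConfigBuilder` DSL (illustrative only; the API is `private[spark]` and the check messages here are paraphrased, not quoted):
    ```
    import org.apache.spark.internal.config.ConfigBuilder

    // Legacy K8s key: zero has been accepted (>= 0) since Spark 2.4.
    val MEMORY_OVERHEAD_FACTOR = ConfigBuilder("spark.kubernetes.memoryOverheadFactor")
      .doubleConf
      .checkValue(factor => factor >= 0, "Memory overhead factor must be non-negative")
      .createWithDefault(0.1)

    // New general keys: zero is rejected (> 0).
    val DRIVER_MEMORY_OVERHEAD_FACTOR = ConfigBuilder("spark.driver.memoryOverheadFactor")
      .doubleConf
      .checkValue(factor => factor > 0, "Memory overhead factor must be positive")
      .createWithDefault(0.1)
    ```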
    
    - This documentation removal might be too early because deprecation is not removal of the configuration. This PR recovers the removed doc and adds the following (the resulting override order is sketched after this snippet).
    ```
    This will be overridden by the value set by
    <code>spark.driver.memoryOverheadFactor</code> and
    <code>spark.executor.memoryOverheadFactor</code> explicitly.
    ```
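
    The override order could be resolved roughly as follows (a sketch only; `effectiveDriverOverheadFactor` is a hypothetical helper, not the actual K8s feature-step code):
    ```
    import org.apache.spark.SparkConf

    // Hypothetical helper: an explicitly set new key wins; otherwise fall
    // back to the legacy K8s key, whose default stays 0.1.
    def effectiveDriverOverheadFactor(conf: SparkConf): Double = {
      if (conf.contains("spark.driver.memoryOverheadFactor")) {
        conf.get("spark.driver.memoryOverheadFactor").toDouble
      } else {
        conf.getDouble("spark.kubernetes.memoryOverheadFactor", 0.1)
      }
    }
    ```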
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. This is consistent with the existing behavior.
    
    ### How was this patch tested?
    
    Pass the CIs.
    
    Closes #36744 from dongjoon-hyun/SPARK-39360.
    
    Authored-by: Dongjoon Hyun <do...@apache.org>
    Signed-off-by: Dongjoon Hyun <do...@apache.org>
    (cherry picked from commit 6d43556089a21b26d1a7590fbe1e25bd1ca7cedd)
    Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
 core/src/main/scala/org/apache/spark/SparkConf.scala |  4 +---
 docs/running-on-kubernetes.md                        | 10 ++++++++++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/SparkConf.scala b/core/src/main/scala/org/apache/spark/SparkConf.scala
index cf121749b73..5f37a1abb19 100644
--- a/core/src/main/scala/org/apache/spark/SparkConf.scala
+++ b/core/src/main/scala/org/apache/spark/SparkConf.scala
@@ -636,9 +636,7 @@ private[spark] object SparkConf extends Logging {
       DeprecatedConfig("spark.blacklist.killBlacklistedExecutors", "3.1.0",
         "Please use spark.excludeOnFailure.killExcludedExecutors"),
       DeprecatedConfig("spark.yarn.blacklist.executor.launch.blacklisting.enabled", "3.1.0",
-        "Please use spark.yarn.executor.launch.excludeOnFailure.enabled"),
-      DeprecatedConfig("spark.kubernetes.memoryOverheadFactor", "3.3.0",
-        "Please use spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor")
+        "Please use spark.yarn.executor.launch.excludeOnFailure.enabled")
     )
 
     Map(configs.map { cfg => (cfg.key -> cfg) } : _*)
diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index ee77e37beb3..9659a6ebe2f 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -1137,6 +1137,16 @@ See the [configuration page](configuration.html) for information on Spark config
   </td>
   <td>3.0.0</td>
 </tr>
+<tr>
+  <td><code>spark.kubernetes.memoryOverheadFactor</code></td>
+  <td><code>0.1</code></td>
+  <td>
+    This sets the Memory Overhead Factor that will allocate memory to non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various system processes, and <code>tmpfs</code>-based local directories when <code>spark.kubernetes.local.dirs.tmpfs</code> is <code>true</code>. For JVM-based jobs this value will default to 0.10 and 0.40 for non-JVM jobs.
+    This is done as non-JVM tasks need more non-JVM heap space and such tasks commonly fail with "Memory Overhead Exceeded" errors. This preempts this error with a higher default.
+    This will be overridden by the value set by <code>spark.driver.memoryOverheadFactor</code> and <code>spark.executor.memoryOverheadFactor</code> explicitly.
+  </td>
+  <td>2.4.0</td>
+</tr>
 <tr>
   <td><code>spark.kubernetes.pyspark.pythonVersion</code></td>
   <td><code>"3"</code></td>

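For reference, after this change a job that sets only the new keys (or no overhead keys at all) runs without the deprecation warning. A minimal illustration (the `0.2` values are arbitrary examples, not recommendations):
```
import org.apache.spark.SparkConf

// Setting only the new, general-purpose keys; the legacy K8s key keeps its
// default of 0.1 and, with this patch, no longer triggers a warning for it.
val conf = new SparkConf()
  .set("spark.driver.memoryOverheadFactor", "0.2")
  .set("spark.executor.memoryOverheadFactor", "0.2")
```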

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org