Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/03/01 05:03:56 UTC

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31598: [SPARK-34478][SQL] When build SparkSession, we should check config keys

HyukjinKwon commented on a change in pull request #31598:
URL: https://github.com/apache/spark/pull/31598#discussion_r584447955



##########
File path: docs/configuration.md
##########
@@ -114,12 +114,67 @@ in the `spark-defaults.conf` file. A few configuration keys have been renamed si
 versions of Spark; in such cases, the older key names are still accepted, but take lower
 precedence than any instance of the newer key.
 
-Spark properties mainly can be divided into two kinds: one is related to deploy, like
-"spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when
-setting programmatically through `SparkConf` in runtime, or the behavior is depending on which
-cluster manager and deploy mode you choose, so it would be suggested to set through configuration
-file or `spark-submit` command line options; another is mainly related to Spark runtime control,
-like "spark.task.maxFailures", this kind of properties can be set in either way.
+Note that Spark properties have different effective timing and they can be divided into three kinds:
+<table class="table">
+<tr><th>Configuration Type</th><th>Meaning</th><th>Examples</th></tr>
+<tr>
+  <td><code>Configurations needed at driver launch</code></td>
+  <td>
+    Configuration used to submit an application, such as <code>spark.driver.memory</code>, <code>spark.driver.extraClassPath</code>, these kind of properties only effect before driver's JVM is started, so it would be suggested to set through configuration file or <code>spark-submit</code> command line options.  
+  </td>
+  <td>
+    The following is a list of such configurations:
+    <ul>
+      <li><code>spark.driver.memory</code></li>
+      <li><code>spark.driver.memoryOverhead</code></li>
+      <li><code>spark.driver.cores</code></li>
+      <li><code>spark.driver.userClassPathFirst</code></li>
+      <li><code>spark.driver.extraClassPath</code></li>
+      <li><code>spark.driver.defaultJavaOptions</code></li>
+      <li><code>spark.driver.extraJavaOptions</code></li>
+      <li><code>spark.driver.extraLibraryPath</code></li>
+      <li><code>spark.driver.resource.*</code></li>
+      <li><code>spark.pyspark.driver.python</code></li>
+      <li><code>spark.pyspark.python</code></li>
+      <li><code>spark.r.shell.command</code></li>
+      <li><code>spark.launcher.childProcLoggerName</code></li>
+      <li><code>spark.launcher.childConnectionTimeout</code></li>
+      <li><code>spark.yarn.driver.*</code></li>
+     </ul>
+  </td>
+</tr>
+<tr>
+  <td><code>Application Deploy Related Configuration</code></td>
+  <td>
+    Like <code>spark.master</code>, <code>spark.executor.instances</code>, this kind of properties may not be affected when setting programmatically through <code>SparkConf</code> in runtime after SparkContext has been started, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file, <code>spark-submit</code> command line options, or setting programmatically through <code>SparkConf</code> in runtime before start SparkContext.  
+  </td>
+  <td>
+     The following is a example such configurations:
+     <ul>
+       <li><code>spark.master</code></li>
+       <li><code>spark.app.name</code></li>
+       <li><code>spark.executor.memory</code></li>
+       <li><code>spark.submit.deployMode</code></li>
+       <li><code>spark.eventLog.enabled</code></li>
+       <li><code>etc...</code></li>

Review comment:
       I think you can remove the `etc...` entry.
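
For illustration only, the split proposed in the hunk above can be sketched as a small key classifier. This is a rough sketch, not part of the PR: `classify` and the category labels are hypothetical names, and the key lists are copied from the diff as quoted here. The hunk is cut off before the third kind, so any key not listed is simply reported as `unlisted`.

```python
from fnmatch import fnmatchcase

# Keys listed under "Configurations needed at driver launch" in the quoted diff.
# Entries ending in ".*" are treated as wildcard patterns (an assumption).
DRIVER_LAUNCH_PATTERNS = [
    "spark.driver.memory",
    "spark.driver.memoryOverhead",
    "spark.driver.cores",
    "spark.driver.userClassPathFirst",
    "spark.driver.extraClassPath",
    "spark.driver.defaultJavaOptions",
    "spark.driver.extraJavaOptions",
    "spark.driver.extraLibraryPath",
    "spark.driver.resource.*",
    "spark.pyspark.driver.python",
    "spark.pyspark.python",
    "spark.r.shell.command",
    "spark.launcher.childProcLoggerName",
    "spark.launcher.childConnectionTimeout",
    "spark.yarn.driver.*",
]

# Keys listed under "Application Deploy Related Configuration" in the quoted diff.
DEPLOY_PATTERNS = [
    "spark.master",
    "spark.app.name",
    "spark.executor.memory",
    "spark.submit.deployMode",
    "spark.eventLog.enabled",
]


def classify(key: str) -> str:
    """Return a hypothetical category label for a Spark config key."""
    if any(fnmatchcase(key, p) for p in DRIVER_LAUNCH_PATTERNS):
        return "driver-launch"
    if any(fnmatchcase(key, p) for p in DEPLOY_PATTERNS):
        return "deploy"
    # The third kind is not visible in the truncated hunk, so do not guess it.
    return "unlisted"


if __name__ == "__main__":
    for k in ("spark.driver.memory", "spark.master", "spark.task.maxFailures"):
        print(k, "->", classify(k))
```

With an explicit list like this, every key either matches an entry or does not, which is one reason an open-ended `etc...` item adds little.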



