You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/02/14 09:17:59 UTC

[GitHub] [spark] HyukjinKwon commented on a change in pull request #27577: [DOC] add config naming policy

HyukjinKwon commented on a change in pull request #27577: [DOC] add config naming policy
URL: https://github.com/apache/spark/pull/27577#discussion_r379322859
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala
 ##########
 @@ -17,6 +17,34 @@
 
 package org.apache.spark.internal.config
 
+// ======================================================================================
+//                       The naming policy of configurations
+// ======================================================================================
+// In general, the config name should be a noun that describes its basic purpose. It's
+// recommended to add prefix to the config name to make the scope clearer. For example,
+// `spark.scheduler.mode` clearly indicates that this config is for the scheduler.
+//
+// A config name can have multiple prefixes that are structured, which is similar to a
+// qualified Java class name. Each prefix behaves like a namespace. We should only create
+// a namespace if it's meaningful and can be shared by multiple configs. For example,
+// `buffer.inMemoryThreshold` is preferred over `buffer.in.memory.threshold`.
+//
+// The followings are some best practices of naming configs for some common cases:
+// 1. When adding configs for a big feature, it's better to create an umbrella config that
+//    can turn the feature on/off, with a name like `featureName.enabled`. The other configs
+//    of this feature should be put under the `featureName` namespace. For example:
+//      - spark.sql.cbo.enabled
+//      - spark.sql.cbo.starSchemaDetection
+//      - spark.sql.cbo.joinReorder.enabled
+//      - spark.sql.cbo.joinReorder.dp.threshold
+// 2. When adding a boolean config, the name should be a verb that describes what
+//    happens if this config is set to true, e.g. `spark.shuffle.sort.useRadixSort`.
+// 3. When adding a config to specify a time duration, it's better to put the time unit
+//    in the config name. For example, `featureName.timeoutMs`, which clearly indicates
+//    that the time unit is millisecond. The config entry should be created with
+//    `ConfigBuilder#timeConf`, to support time strings like `2 minutes`.
+
 
 Review comment:
   No big deal but we better to use multiline comment style: `/* ... */`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org