Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/10/01 00:10:18 UTC

[GitHub] [spark] mccheah commented on a change in pull request #25823: [SPARK-28211][Core][Shuffle] Propose Shuffle Driver Components API

URL: https://github.com/apache/spark/pull/25823#discussion_r329836394
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/SparkContext.scala
 ##########
 @@ -524,6 +528,19 @@ class SparkContext(config: SparkConf) extends Logging {
     executorEnvs ++= _conf.getExecutorEnv
     executorEnvs("SPARK_USER") = sparkUser
 
+    val configuredPluginClasses = conf.get(SHUFFLE_IO_PLUGIN_CLASS)
+    val maybeIO = Utils.loadExtensions(
+      classOf[ShuffleDataIO], Seq(configuredPluginClasses), conf)
+    require(maybeIO.nonEmpty, s"At least one valid shuffle plugin must be specified by config " +
+      s"${SHUFFLE_IO_PLUGIN_CLASS.key}, but $configuredPluginClasses resulted in zero valid " +
+      s"plugins.")
+    require(maybeIO.size == 1,
+        s"Specified shuffle plugin(s) $configuredPluginClasses resulted in more than one valid " +
+        s"plugin, but only one valid plugin should be specified")
+    _shuffleDriverComponents = maybeIO.head.driver()
+    _shuffleDriverComponents.initializeApplication().asScala.foreach {
+      case (k, v) => _conf.set(ShuffleDataIO.SHUFFLE_SPARK_CONF_PREFIX + k, v) }
 
 Review comment:
   In general I'm wary of logging configurations, particularly when a plugin implementation may return values that are sensitive. These entries don't go through Spark's existing configuration logging paths, which redact sensitive values before printing them, so I'd err on the side of caution and not log them here.
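
   To illustrate the concern, here is a minimal sketch of the kind of key-based redaction that would be needed before plugin-supplied entries could be logged safely. The object name `RedactionSketch`, the key pattern, the replacement text, and the sample keys are all assumptions made up for this example; this is not Spark's own redaction code.

```scala
import scala.util.matching.Regex

object RedactionSketch {
  // Hypothetical pattern: any key containing one of these words is treated as sensitive.
  private val SensitiveKey: Regex = "(?i)secret|password|token".r

  // Hypothetical placeholder substituted for sensitive values.
  val Replacement: String = "*********(redacted)"

  // Returns the same key/value pairs, with the values of sensitive-looking keys masked.
  def redact(kvs: Seq[(String, String)]): Seq[(String, String)] =
    kvs.map { case (k, v) =>
      if (SensitiveKey.findFirstIn(k).isDefined) (k, Replacement) else (k, v)
    }

  def main(args: Array[String]): Unit = {
    // Stand-in for what a plugin's initializeApplication() might return.
    val fromPlugin = Seq(
      "plugin.authToken" -> "abc123",
      "plugin.bufferSize" -> "64k")
    redact(fromPlugin).foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

   Even with something like this, a key-pattern approach only catches keys that happen to match, and a plugin is free to choose any key names, which is another reason to skip the logging altogether.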
