Posted to commits@spark.apache.org by rx...@apache.org on 2014/11/26 08:16:08 UTC
spark git commit: [SPARK-4612] Reduce task latency and increase scheduling throughput by making configuration initialization lazy
Repository: spark
Updated Branches:
refs/heads/master 346bc17a2 -> e7f4d2534
[SPARK-4612] Reduce task latency and increase scheduling throughput by making configuration initialization lazy
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L337 creates a new Hadoop configuration object for every task that is launched, even when there are no new dependent files or JARs to update. This is a heavyweight object creation that should be avoided when no update is needed, so this PR makes that creation lazy. A quick local run of the spark-perf scheduling-throughput test, in local standalone scheduler mode, gives the following numbers.
1 job with 10000 tasks: 7.8395 seconds before, 2.6415 seconds after, roughly a 3x increase in task scheduling throughput
pwendell JoshRosen
Author: Tathagata Das <ta...@gmail.com>
Closes #3463 from tdas/lazy-config and squashes the following commits:
c791c1e [Tathagata Das] Reduce task latency by making configuration initialization lazy
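The effect of the one-line change can be illustrated with a small, self-contained sketch. Names such as `newConfiguration`, `initCount`, and `updateDependencies` here are illustrative stand-ins, not Spark's actual API: a `lazy val` defers the heavyweight construction until its first access, so when a task launches with no new files or JARs the object is never built at all.

```scala
object LazyInitDemo {
  var initCount = 0

  // Stand-in for the heavyweight SparkHadoopUtil.get.newConfiguration(conf)
  // call; we count constructions to observe the laziness.
  def newConfiguration(): Map[String, String] = {
    initCount += 1
    Map("fs.defaultFS" -> "file:///")
  }

  def updateDependencies(newFiles: Map[String, Long]): Unit = {
    // `lazy val`: the right-hand side runs only on first access.
    lazy val hadoopConf = newConfiguration()
    for ((name, _) <- newFiles) {
      // A real fetch would pass hadoopConf to the download utility;
      // merely touching it here forces the lazy initialization.
      val conf = hadoopConf
      require(conf.nonEmpty, s"no configuration available to fetch $name")
    }
  }

  def main(args: Array[String]): Unit = {
    updateDependencies(Map.empty)          // common case: nothing new, no config built
    assert(initCount == 0)
    updateDependencies(Map("a.jar" -> 1L)) // first new dependency triggers one construction
    assert(initCount == 1)
    println(initCount)
  }
}
```

Since the common case on a warm executor is "no new dependencies", the construction cost drops out of the per-task critical path entirely, which is where the throughput gain above comes from.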
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e7f4d253
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e7f4d253
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e7f4d253
Branch: refs/heads/master
Commit: e7f4d2534bb3361ec4b7af0d42bc798a7a425226
Parents: 346bc17
Author: Tathagata Das <ta...@gmail.com>
Authored: Tue Nov 25 23:15:58 2014 -0800
Committer: Reynold Xin <rx...@databricks.com>
Committed: Tue Nov 25 23:15:58 2014 -0800
----------------------------------------------------------------------
core/src/main/scala/org/apache/spark/executor/Executor.scala | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/spark/blob/e7f4d253/core/src/main/scala/org/apache/spark/executor/Executor.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/executor/Executor.scala b/core/src/main/scala/org/apache/spark/executor/Executor.scala
index 5fa5845..835157f 100644
--- a/core/src/main/scala/org/apache/spark/executor/Executor.scala
+++ b/core/src/main/scala/org/apache/spark/executor/Executor.scala
@@ -334,7 +334,7 @@ private[spark] class Executor(
* SparkContext. Also adds any new JARs we fetched to the class loader.
*/
private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]) {
- val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
+ lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
synchronized {
// Fetch missing dependencies
for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {