Posted to commits@spark.apache.org by sr...@apache.org on 2015/02/23 12:29:29 UTC
spark git commit: [SPARK-5724] fix the misconfiguration in AkkaUtils
Repository: spark
Updated Branches:
refs/heads/master 757b14b86 -> 242d49584
[SPARK-5724] fix the misconfiguration in AkkaUtils
https://issues.apache.org/jira/browse/SPARK-5724
In AkkaUtils, we set several failure-detector-related parameters as follows:
```
val akkaConf = ConfigFactory.parseMap(conf.getAkkaConf.toMap[String, String])
.withFallback(akkaSslConfig).withFallback(ConfigFactory.parseString(
s"""
|akka.daemonic = on
|akka.loggers = [""akka.event.slf4j.Slf4jLogger""]
|akka.stdout-loglevel = "ERROR"
|akka.jvm-exit-on-fatal-error = off
|akka.remote.require-cookie = "$requireCookie"
|akka.remote.secure-cookie = "$secureCookie"
|akka.remote.transport-failure-detector.heartbeat-interval = $akkaHeartBeatInterval s
|akka.remote.transport-failure-detector.acceptable-heartbeat-pause = $akkaHeartBeatPauses s
|akka.remote.transport-failure-detector.threshold = $akkaFailureDetector
|akka.actor.provider = "akka.remote.RemoteActorRefProvider"
|akka.remote.netty.tcp.transport-class = "akka.remote.transport.netty.NettyTransport"
|akka.remote.netty.tcp.hostname = "$host"
|akka.remote.netty.tcp.port = $port
|akka.remote.netty.tcp.tcp-nodelay = on
|akka.remote.netty.tcp.connection-timeout = $akkaTimeout s
|akka.remote.netty.tcp.maximum-frame-size = ${akkaFrameSize}B
|akka.remote.netty.tcp.execution-pool-size = $akkaThreads
|akka.actor.default-dispatcher.throughput = $akkaBatchSize
|akka.log-config-on-start = $logAkkaConfig
|akka.remote.log-remote-lifecycle-events = $lifecycleEvents
|akka.log-dead-letters = $lifecycleEvents
|akka.log-dead-letters-during-shutdown = $lifecycleEvents
""".stripMargin))
```
However, Akka has no parameter named "akka.remote.transport-failure-detector.threshold"
(see http://doc.akka.io/docs/akka/2.3.4/general/configuration.html).
The threshold setting that does exist is "akka.remote.watch-failure-detector.threshold".
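A mistyped key like the one described above can be caught by checking configured keys against a list of recognized ones. The following is only a minimal sketch of that idea, not part of this commit: `AkkaKeyCheck`, `knownKeys`, and `unknownKeys` are hypothetical names, and the allow-list is an assumption built solely from key names mentioned in this message rather than from the full Akka reference configuration.

```scala
object AkkaKeyCheck {
  // Assumption: a hand-maintained allow-list containing only key names that
  // appear in this message; it is NOT the complete Akka reference configuration.
  val knownKeys: Set[String] = Set(
    "akka.remote.transport-failure-detector.heartbeat-interval",
    "akka.remote.transport-failure-detector.acceptable-heartbeat-pause",
    "akka.remote.watch-failure-detector.threshold"
  )

  // Returns the keys that are not in the allow-list, preserving input order.
  def unknownKeys(keys: Seq[String]): Seq[String] =
    keys.filterNot(knownKeys.contains)

  def main(args: Array[String]): Unit = {
    // The failure-detector keys that AkkaUtils set before this change.
    val configured = Seq(
      "akka.remote.transport-failure-detector.heartbeat-interval",
      "akka.remote.transport-failure-detector.acceptable-heartbeat-pause",
      "akka.remote.transport-failure-detector.threshold"
    )
    // Report every configured key that the allow-list does not recognize.
    unknownKeys(configured).foreach(k => println(s"unrecognized Akka key: $k"))
  }
}
```

Run against the three keys above, the sketch reports only the `transport-failure-detector.threshold` entry, which is the key this commit removes.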
Author: CodingCat <zh...@gmail.com>
Closes #4512 from CodingCat/SPARK-5724 and squashes the following commits:
bafe56e [CodingCat] fix the grammar in configuration doc
338296e [CodingCat] remove failure-detector related info
8bfcfd4 [CodingCat] fix the misconfiguration in AkkaUtils
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/242d4958
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/242d4958
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/242d4958
Branch: refs/heads/master
Commit: 242d49584c6aa21d928db2552033661950f760a5
Parents: 757b14b
Author: CodingCat <zh...@gmail.com>
Authored: Mon Feb 23 11:29:25 2015 +0000
Committer: Sean Owen <so...@cloudera.com>
Committed: Mon Feb 23 11:29:25 2015 +0000
----------------------------------------------------------------------
.../scala/org/apache/spark/util/AkkaUtils.scala | 3 --
docs/configuration.md | 36 +++++++-------------
2 files changed, 12 insertions(+), 27 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/spark/blob/242d4958/core/src/main/scala/org/apache/spark/util/AkkaUtils.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/util/AkkaUtils.scala b/core/src/main/scala/org/apache/spark/util/AkkaUtils.scala
index 3d9c619..48a6ede 100644
--- a/core/src/main/scala/org/apache/spark/util/AkkaUtils.scala
+++ b/core/src/main/scala/org/apache/spark/util/AkkaUtils.scala
@@ -79,8 +79,6 @@ private[spark] object AkkaUtils extends Logging {
val logAkkaConfig = if (conf.getBoolean("spark.akka.logAkkaConfig", false)) "on" else "off"
val akkaHeartBeatPauses = conf.getInt("spark.akka.heartbeat.pauses", 6000)
- val akkaFailureDetector =
- conf.getDouble("spark.akka.failure-detector.threshold", 300.0)
val akkaHeartBeatInterval = conf.getInt("spark.akka.heartbeat.interval", 1000)
val secretKey = securityManager.getSecretKey()
@@ -106,7 +104,6 @@ private[spark] object AkkaUtils extends Logging {
|akka.remote.secure-cookie = "$secureCookie"
|akka.remote.transport-failure-detector.heartbeat-interval = $akkaHeartBeatInterval s
|akka.remote.transport-failure-detector.acceptable-heartbeat-pause = $akkaHeartBeatPauses s
- |akka.remote.transport-failure-detector.threshold = $akkaFailureDetector
|akka.actor.provider = "akka.remote.RemoteActorRefProvider"
|akka.remote.netty.tcp.transport-class = "akka.remote.transport.netty.NettyTransport"
|akka.remote.netty.tcp.hostname = "$host"
http://git-wip-us.apache.org/repos/asf/spark/blob/242d4958/docs/configuration.md
----------------------------------------------------------------------
diff --git a/docs/configuration.md b/docs/configuration.md
index 541695c..c8db338 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -903,36 +903,24 @@ Apart from these, the following properties are also available, and may be useful
<td><code>spark.akka.heartbeat.pauses</code></td>
<td>6000</td>
<td>
- This is set to a larger value to disable failure detector that comes inbuilt akka. It can be
- enabled again, if you plan to use this feature (Not recommended). Acceptable heart beat pause
- in seconds for akka. This can be used to control sensitivity to gc pauses. Tune this in
- combination of `spark.akka.heartbeat.interval` and `spark.akka.failure-detector.threshold`
- if you need to.
- </td>
-</tr>
-<tr>
- <td><code>spark.akka.failure-detector.threshold</code></td>
- <td>300.0</td>
- <td>
- This is set to a larger value to disable failure detector that comes inbuilt akka. It can be
- enabled again, if you plan to use this feature (Not recommended). This maps to akka's
- `akka.remote.transport-failure-detector.threshold`. Tune this in combination of
- `spark.akka.heartbeat.pauses` and `spark.akka.heartbeat.interval` if you need to.
+ This is set to a larger value to disable the transport failure detector that comes built in to Akka.
+ It can be enabled again, if you plan to use this feature (Not recommended). Acceptable heart
+ beat pause in seconds for Akka. This can be used to control sensitivity to GC pauses. Tune
+ this along with `spark.akka.heartbeat.interval` if you need to.
</td>
</tr>
<tr>
<td><code>spark.akka.heartbeat.interval</code></td>
<td>1000</td>
<td>
- This is set to a larger value to disable failure detector that comes inbuilt akka. It can be
- enabled again, if you plan to use this feature (Not recommended). A larger interval value in
- seconds reduces network overhead and a smaller value ( ~ 1 s) might be more informative for
- akka's failure detector. Tune this in combination of `spark.akka.heartbeat.pauses` and
- `spark.akka.failure-detector.threshold` if you need to. Only positive use case for using
- failure detector can be, a sensistive failure detector can help evict rogue executors really
- quick. However this is usually not the case as gc pauses and network lags are expected in a
- real Spark cluster. Apart from that enabling this leads to a lot of exchanges of heart beats
- between nodes leading to flooding the network with those.
+ This is set to a larger value to disable the transport failure detector that comes built in to Akka.
+ It can be enabled again, if you plan to use this feature (Not recommended). A larger interval
+ value in seconds reduces network overhead and a smaller value ( ~ 1 s) might be more informative
+ for Akka's failure detector. Tune this in combination of `spark.akka.heartbeat.pauses` if you need
+ to. A likely positive use case for using failure detector would be: a sensistive failure detector
+ can help evict rogue executors quickly. However this is usually not the case as GC pauses
+ and network lags are expected in a real Spark cluster. Apart from that enabling this leads to
+ a lot of exchanges of heart beats between nodes leading to flooding the network with those.
</td>
</tr>
<tr>