Posted to issues@celeborn.apache.org by GitBox <gi...@apache.org> on 2022/12/05 07:33:30 UTC

[GitHub] [incubator-celeborn] AngersZhuuuu commented on a diff in pull request #1047: [CELEBORN-102][REFACTOR] TIMEOUT default value should be changed with network timeout

AngersZhuuuu commented on code in PR #1047:
URL: https://github.com/apache/incubator-celeborn/pull/1047#discussion_r1039230604


##########
common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala:
##########
@@ -691,6 +679,39 @@ class CelebornConf(loadDefaults: Boolean) extends Cloneable with Logging with Se
   def rpcCacheConcurrencyLevel: Int = get(RPC_CACHE_CONCURRENCY_LEVEL)
   def rpcCacheExpireTime: Long = get(RPC_CACHE_EXPIRE_TIME)
   def pushDataRpcTimeoutMs = get(PUSH_DATA_RPC_TIMEOUT)
+  def registerShuffleRpcAskTimeout: RpcTimeout = {
+    get(REGISTER_SHUFFLE_RPC_ASK_TIMEOUT).map { timeout =>
+      new RpcTimeout(
+        timeout.milli,
+        REGISTER_SHUFFLE_RPC_ASK_TIMEOUT.key)
+    }.getOrElse {
+      new RpcTimeout(
+        rpcAskTimeout.duration * (reserveSlotsMaxRetries + 2),

Review Comment:
   > why use such default value?
   
   A single reserve-slots attempt can take up to one `rpcAskTimeout`, and it can be retried up to `reserveSlotsMaxRetries` times. Before each retry, LifecycleManager also needs to ask the master to `releaseSlot`.
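
A rough sketch of the arithmetic behind this default, using the formula from the diff above. The object name, the helper name, and the sample values (a 30-second ask timeout, 3 retries) are illustrative assumptions, not Celeborn's actual code or defaults:

```scala
import scala.concurrent.duration._

object RegisterShuffleTimeoutSketch {
  // Hypothetical helper mirroring the fallback expression in the diff:
  //   rpcAskTimeout.duration * (reserveSlotsMaxRetries + 2)
  // Each reserve-slots attempt may take up to one ask timeout, so the
  // overall budget grows linearly with the retry count.
  def defaultRegisterShuffleTimeout(
      rpcAskTimeout: FiniteDuration,
      reserveSlotsMaxRetries: Int): FiniteDuration =
    rpcAskTimeout * (reserveSlotsMaxRetries + 2).toLong

  def main(args: Array[String]): Unit = {
    // Assumed sample values, chosen only to make the arithmetic visible.
    val timeout = defaultRegisterShuffleTimeout(30.seconds, 3)
    println(s"register shuffle ask timeout: ${timeout.toMillis} ms")
  }
}
```

With these sample values the budget is 30 s * (3 + 2), so raising either the ask timeout or the retry count widens the register-shuffle timeout automatically unless it is set explicitly.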


