You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@celeborn.apache.org by GitBox <gi...@apache.org> on 2023/01/17 03:06:54 UTC

[GitHub] [incubator-celeborn] boneanxs commented on a diff in pull request #1066: [CELEBORN-207] Support network bakpressure and control

boneanxs commented on code in PR #1066:
URL: https://github.com/apache/incubator-celeborn/pull/1066#discussion_r1071697818


##########
common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala:
##########
@@ -2656,6 +2667,59 @@ object CelebornConf extends Logging {
       .timeConf(TimeUnit.SECONDS)
       .createWithDefaultString("10s")
 
+  val WORKER_RATE_LIMIT_ENABLED: ConfigEntry[Boolean] =
+    buildConf("celeborn.worker.ratelimit.enabled")
+      .categories("worker")
+      .doc("Whether to enable rate limit control or not.")
+      .version("0.3.0")
+      .booleanConf
+      .createWithDefault(false)
+
+  val WORKER_RATE_LIMIT_SAMPLE_TIME_WINDOW: ConfigEntry[Long] =
+    buildConf("celeborn.worker.ratelimit.sample.time.window")
+      .categories("worker")
+      .doc("The worker holds a time sliding list to calculate users' produce/consume rate")
+      .version("0.3.0")
+      .timeConf(TimeUnit.SECONDS)
+      .createWithDefaultString("10s")
+
+  val WORKER_RATE_LIMIT_LOW_WATERMARK: OptionalConfigEntry[Long] =
+    buildConf("celeborn.worker.ratelimit.low.watermark")
+      .categories("worker")
+      .doc("Will stop congest users if the total pending bytes of disk buffer is lower than " +
+        "this configuration")
+      .version("0.3.0")
+      .bytesConf(ByteUnit.BYTE)
+      .createOptional
+
+  val WORKER_RATE_LIMIT_HIGH_WATERMARK: OptionalConfigEntry[Long] =
+    buildConf("celeborn.worker.ratelimit.high.watermark")
+      .categories("worker")
+      .doc("If the total bytes in disk buffer exceeds this configure, will start to congest" +
+        "users whose produce rate is higher than the potential average consume rate. " +
+        "The congestion will stop if the produce rate is lower or equal to the " +
+        "average consume rate, or the total pending bytes lower than " +
+        s"${WORKER_RATE_LIMIT_LOW_WATERMARK.key}")
+      .version("0.3.0")
+      .bytesConf(ByteUnit.BYTE)
+      .createOptional
+
+  val WORKER_RATE_LIMIT_USER_INACTIVE_INTERVAL: ConfigEntry[Long] =
+    buildConf("celeborn.worker.ratelimit.user.inactive.interval")
+      .categories("worker")
+      .doc("How long will consider this user is inactive if it doesn't send data")
+      .version("0.3.0")
+      .timeConf(TimeUnit.MILLISECONDS)
+      .createWithDefaultString("10min")
+
+  val WORKER_RATE_LIMIT_CHECK_USER_STATUS_INTERVAL: ConfigEntry[Long] =

Review Comment:
   It's possible that some inactive users will be removed after 2 * `WORKER_RATE_LIMIT_USER_INACTIVE_INTERVAL`, usually `WORKER_RATE_LIMIT_USER_INACTIVE_INTERVAL` is much higher than the `WORKER_RATE_LIMIT_CHECK_USER_STATUS_INTERVAL`, so the intention here is try to remove inactive users as quickly as possible.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org