Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/26 11:08:08 UTC

[GitHub] [pulsar] galrose opened a new issue #10382: PreciseTopicPublishRateLimiterEnable doesn't always work.

galrose opened a new issue #10382:
URL: https://github.com/apache/pulsar/issues/10382


   **Describe the bug**
   When using pulsar-perf to write messages either in large batches (>100) or with a large number of outstanding messages (>100), the precise publish rate limit is not enforced consistently.
   When either -bm or -o is limited to a small number (around 5, preferably 1), the rate limiting works perfectly.
   The easiest way to reproduce the problem is to leave -bm and -o unset in pulsar-perf.
   
   Slack thread: https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1619419736156600?thread_ts=1619419736.156600&cid=C5Z4T36F7
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Set preciseTopicPublishRateLimiterEnable=true in broker.conf.
   2. Create a new tenant, namespace, and topic.
   3. Run 'pulsar-admin namespaces set-publish-rate mytenant/default -b 102400'
   4. Run 'pulsar-perf produce -s 1024 -threads 1 -r 1000'
   5. Observe, both in the metrics and in the pulsar-perf output, that the rate is not limited precisely.
   
   **Expected behavior**
   I expect the limit to be precise regardless of the batch size or the number of outstanding messages.
   
   **Desktop (please complete the following information):**
    - OS: CentOS 7
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] galrose closed issue #10382: PreciseTopicPublishRateLimiterEnable doesn't always work.

Posted by GitBox <gi...@apache.org>.
galrose closed issue #10382:
URL: https://github.com/apache/pulsar/issues/10382


   





[GitHub] [pulsar] galrose commented on issue #10382: PreciseTopicPublishRateLimiterEnable doesn't always work.

Posted by GitBox <gi...@apache.org>.
galrose commented on issue #10382:
URL: https://github.com/apache/pulsar/issues/10382#issuecomment-826913446


   Yes, you are correct, my bad. You can also set just -m (the message rate limit) and reproduce it that way as well.
   I'm not sure if your PR covers this issue as well, but I'll check whether it fixes it as soon as it is merged.





[GitHub] [pulsar] lhotari commented on issue #10382: PreciseTopicPublishRateLimiterEnable doesn't always work.

Posted by GitBox <gi...@apache.org>.
lhotari commented on issue #10382:
URL: https://github.com/apache/pulsar/issues/10382#issuecomment-826873989


   @galrose thanks for reporting.
   
   When reproducing, I also needed to specify a message limit in the publish rate limit.
   For example,
   `pulsar-admin namespaces set-publish-rate mytenant/default -b 102400 -m 1000`
   That seems to be another bug, which is perhaps the "open issue" in PR #10384.
   
   I was able to reproduce the issue. When the number of outstanding messages is high, the rate limiting is very inconsistent.
   
   For example:
   ```
   17:11:02.944 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    138.3  msg/s ---      1.1 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 4596.594 ms - med: 4423.903 - 95pct: 7853.471 - 99pct: 7902.623 - 99.9pct: 7914.623 - 99.99pct: 7915.647 - Max: 7915.647
   17:11:12.996 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    494.7  msg/s ---      3.9 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 2469.594 ms - med: 1781.367 - 95pct: 5994.559 - 99pct: 6994.687 - 99.9pct: 6997.087 - 99.99pct: 7776.479 - Max: 7776.479
   17:11:23.023 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    754.8  msg/s ---      5.9 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 1304.984 ms - med: 1214.711 - 95pct: 2000.375 - 99pct: 2000.927 - 99.9pct: 2001.071 - 99.99pct: 2001.095 - Max: 2001.215
   17:11:33.044 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    626.0  msg/s ---      4.9 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 1658.938 ms - med: 1773.103 - 95pct: 2983.343 - 99pct: 3003.791 - 99.9pct: 3003.887 - 99.99pct: 3003.903 - Max: 3004.079
   17:11:43.069 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    501.4  msg/s ---      3.9 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 1930.409 ms - med: 1995.135 - 95pct: 3782.767 - 99pct: 3999.711 - 99.9pct: 4776.831 - 99.99pct: 4997.055 - Max: 4998.015
   ```
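
To put the log lines above in perspective, here is a rough calculation of what the configured limits would allow. It assumes the logged run used 1024-byte messages, as in the reporter's `-s 1024` command (the logged Mbit/s figures are consistent with that), that the byte and message limits combine as "whichever is stricter", and that only payload bytes count toward the byte limit. Those are assumptions for illustration, not a statement about how the broker accounts for bytes.

```python
# Configured limits taken from the commands in this thread.
BYTE_LIMIT_PER_SEC = 102_400   # set-publish-rate -b 102400
MSG_LIMIT_PER_SEC = 1_000      # set-publish-rate -m 1000
MSG_SIZE_BYTES = 1_024         # pulsar-perf produce -s 1024 (assumed for the logged run)

# Message rates reported in the five log lines above.
observed_msg_per_sec = [138.3, 494.7, 754.8, 626.0, 501.4]


def effective_msg_limit(byte_limit, msg_limit, msg_size):
    """Messages per second allowed when both limits apply (payload bytes only)."""
    return min(msg_limit, byte_limit / msg_size)


limit = effective_msg_limit(BYTE_LIMIT_PER_SEC, MSG_LIMIT_PER_SEC, MSG_SIZE_BYTES)
overshoot = [round(rate / limit, 2) for rate in observed_msg_per_sec]

print(f"effective limit: {limit:.0f} msg/s")
for rate, factor in zip(observed_msg_per_sec, overshoot):
    print(f"{rate:7.1f} msg/s is {factor:.2f}x the limit")
```

Under these assumptions the byte limit is the binding one, at about 100 msg/s, and every logged interval exceeds it.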
   
   The current algorithm in the rate limiter seems to have its limitations. It appears to set "auto read" to false on the Netty channel to cause backpressure when the rate limit is reached. However, the rate limit seems to reset once per second, with the consequence that processing of the buffered content resumes.
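
The behaviour described above can be illustrated with a toy model. This is only a sketch and not Pulsar's actual implementation: the function `simulate` and its parameters are hypothetical, the producer is assumed to always have data ready, and the limit is assumed to be checked only after a whole buffered chunk (a batch or a set of outstanding messages) has been accepted. The `reading` flag stands in for the channel's "auto read" setting.

```python
def simulate(limit_per_sec, chunk_size, seconds):
    """Toy fixed-window limiter (hypothetical model, not Pulsar code).

    The counter resets at the start of every one-second window, and the
    limit is only checked after a whole chunk of messages was accepted.
    Returns the number of messages accepted in each window.
    """
    accepted_per_window = []
    for _ in range(seconds):
        count = 0          # the limiter resets once per second
        reading = True     # stands in for "auto read" being enabled
        while reading:
            count += chunk_size            # an already-buffered chunk is processed
            if count >= limit_per_sec:
                reading = False            # backpressure until the next reset
        accepted_per_window.append(count)
    return accepted_per_window


print(simulate(100, 1, 3))     # one message at a time
print(simulate(100, 64, 3))    # medium chunks
print(simulate(100, 150, 3))   # chunks larger than the limit
```

In this model the limit is met exactly when messages arrive one at a time, while larger chunks overshoot it in every window. That is in line with the report that small -bm or -o values behave precisely and large ones do not, although the real broker may differ in the details.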

