Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/11/25 23:52:47 UTC

[GitHub] [incubator-pinot] sajjad-moradi commented on pull request #6291: Add consumption rate limiter for LLConsumer

sajjad-moradi commented on pull request #6291:
URL: https://github.com/apache/incubator-pinot/pull/6291#issuecomment-733995748


   > Can you note in your check-in comments that what we are throttling is not really the consumption, but the _processing_ of messages? We will still consume as much as we can from the stream.
   > 
   > If we were to limit consumption, then the behavior will be somewhat like:
   > 
   > ```
   > while (true) {
   >   consumeMsgsAsPerSomeAllowedRate()
   >   processAllMsgsConsumed()
   >   sleepAsIndicatedByRateLimiter()
   > }
   > ```
   > 
   > Whereas by rate limiting the processing, we are doing the following:
   > 
   > ```
   > while (true) {
   >   consumeAllMsgsThatWeCan()
   >   foreach(msg) {
   >     processMsg()
   >     sleepAsDictatedByRateLimiter()
   >   }
   > }
   > ```
   > 
   > So, we may be taking up more heap in the second case?
   
   I actually looked into that, and Kafka doesn't provide an API to retrieve a limited number of messages. IMO, a rate limit on processing will have a similar effect to a rate limit on consumption, because we process the messages synchronously after they are polled from Kafka. For example, let's assume that during a bursty period the incoming rate of Kafka messages is 100 msgs/sec and we have set the rate limit to 20 msgs/sec. Over a period of 10 seconds we then process only 200 messages, and while we're processing messages we don't consume new ones. This effectively caps the consumption rate at 20 msgs/sec, whereas without a rate limit we would have consumed 1000 messages at 100 msgs/sec.
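   
   A minimal sketch of the arithmetic above, written as a simulation with a virtual clock (no real sleeping). All names here (`ProcessingRateLimitSketch`, `simulate`, `maxBatch`) are hypothetical and are not Pinot or Kafka APIs; the batch cap of 50 is an assumed stand-in for whatever a single poll returns.
   
   ```java
   // Hypothetical simulation, not Pinot code: a virtual clock stands in for real sleeping.
   class ProcessingRateLimitSketch {
   
     /** Messages pulled from the stream and messages processed in one simulated run. */
     static final class Result {
       final long fetched;
       final long processed;
   
       Result(long fetched, long processed) {
         this.fetched = fetched;
         this.processed = processed;
       }
     }
   
     /**
      * Simulates "fetch a batch, then process each message with a rate-limiter pause".
      * limitPerSec <= 0 means no limit. Assumes limitPerSec divides 1000 evenly.
      * maxBatch is an assumed cap on how many messages a single fetch returns.
      */
     static Result simulate(int incomingPerSec, int limitPerSec, int seconds, int maxBatch) {
       long durationMs = seconds * 1000L;
       long pauseMs = limitPerSec > 0 ? 1000L / limitPerSec : 0;
       long nowMs = 0;
       long fetched = 0;
       long processed = 0;
       while (nowMs < durationMs) {
         // Messages produced so far; the first one arrives at t = 0.
         long arrived = nowMs * incomingPerSec / 1000 + 1;
         long batch = Math.min(maxBatch, arrived - fetched);
         if (batch == 0) {
           nowMs++; // nothing new yet, wait 1 ms
           continue;
         }
         fetched += batch;
         // Synchronous processing: no new fetch happens until this batch is done.
         for (long i = 0; i < batch && nowMs < durationMs; i++) {
           processed++;
           nowMs += pauseMs;
         }
       }
       return new Result(fetched, processed);
     }
   
     public static void main(String[] args) {
       Result limited = simulate(100, 20, 10, 50);
       Result unlimited = simulate(100, 0, 10, 50);
       System.out.println("limited:   fetched=" + limited.fetched + " processed=" + limited.processed);
       System.out.println("unlimited: fetched=" + unlimited.fetched + " processed=" + unlimited.processed);
     }
   }
   ```
   
   Under these assumptions the limited run processes 200 messages in 10 simulated seconds and the unlimited run processes 1000. The limited run has fetched 231 by then, so the gap between fetched and processed messages (the heap concern raised above) stays within one batch.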
   
   Side note:
   The Kafka consumer has a configuration, `max.partition.fetch.bytes`, that limits the number of bytes fetched per partition. It is a bit hard to use here, since the intent is to limit the number of messages rather than the number of bytes.
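   
   To make the bytes-versus-messages mismatch concrete, here is a rough sketch. `max.partition.fetch.bytes` is a real consumer setting (1 MiB by default); the class and helper names are hypothetical, and the estimate ignores record-batch overhead.
   
   ```java
   import java.util.Properties;
   
   // Hypothetical helper, not Pinot code: shows why a byte cap does not fix a message count.
   class FetchBytesSketch {
   
     /** Builds consumer properties carrying only the per-partition byte cap. */
     static Properties consumerProps(int maxPartitionFetchBytes) {
       Properties props = new Properties();
       props.setProperty("max.partition.fetch.bytes", Integer.toString(maxPartitionFetchBytes));
       return props;
     }
   
     /**
      * Rough number of messages one partition fetch can carry under the byte cap.
      * At least 1, because the consumer still returns a first batch that exceeds
      * the cap so that it can make progress.
      */
     static int approxMessagesPerFetch(int maxPartitionFetchBytes, int avgMessageBytes) {
       return Math.max(1, maxPartitionFetchBytes / avgMessageBytes);
     }
   
     public static void main(String[] args) {
       int cap = 1024 * 1024; // 1 MiB, the default for this setting
       System.out.println(consumerProps(cap));
       System.out.println("1 KiB messages:   ~" + approxMessagesPerFetch(cap, 1024));
       System.out.println("100 KiB messages: ~" + approxMessagesPerFetch(cap, 100 * 1024));
     }
   }
   ```
   
   With the same 1 MiB cap, 1 KiB messages give about 1024 messages per partition fetch while 100 KiB messages give about 10, so a byte cap cannot pin down a message rate unless message sizes are uniform.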
   

